Dot products
Traditionally, dot products are something that's introduced really early on in a linear algebra course, typically right at the start, so it might seem strange that I've pushed them back this far in the series. I did this because there's a standard way to introduce the topic, which requires nothing more than a basic understanding of vectors, but a fuller understanding of the role that dot products play in math can only really be found under the light of linear transformations. Before that though, let me just briefly cover the standard way that dot products are introduced, which I'm assuming is at least partially review for a number of viewers.
Numerically, if you have two vectors of the same dimension, two lists of numbers with the same lengths, taking their dot product means pairing up all of the coordinates, multiplying those pairs together, and adding the result. So, for example, the dot product of the vectors (1, 2) and (3, 4) is 1·3 + 2·4, which is 11. Luckily, this computation has a really nice geometric interpretation: to get the dot product of two vectors v and w, imagine projecting w onto the line that passes through the origin and the tip of v, then multiplying the length of that projection by the length of v.
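That coordinate-pairing recipe can be sketched in a few lines of plain Python (the example vectors here are just for illustration):

```python
def dot(v, w):
    # Pair up the coordinates, multiply each pair, and add the results.
    assert len(v) == len(w), "vectors must have the same dimension"
    return sum(a * b for a, b in zip(v, w))

print(dot([1, 2], [3, 4]))  # 1*3 + 2*4 = 11
```

Note that the recipe only makes sense when the two lists have the same length, which is why the function checks that first.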
So when two vectors are generally pointing in the same direction, their dot product is positive. When they're perpendicular, meaning the projection of one onto the other is the zero vector, their dot product is zero. And if they point in generally the opposite direction, their dot product is negative.
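The three sign cases can be checked numerically with any vectors you like (the particular vectors below are made up for illustration):

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

print(dot([2, 1], [3, 1]))    # 7: roughly the same direction, positive
print(dot([1, 2], [-2, 1]))   # 0: perpendicular vectors
print(dot([2, 1], [-3, -1]))  # -7: roughly opposite directions, negative
```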
Now this interpretation is weirdly asymmetric. It treats the two vectors very differently. So when I first learned this, I was surprised that order doesn't matter.
You could instead project v onto w, multiply the length of the projected v by the length of w, and get the same result. I mean, doesn't that feel like a really different process? Here's the intuition for why order doesn't matter. If v and w happened to have the same length, we could leverage some symmetry.
Projecting w onto v, then multiplying the length of that projection by the length of v, is a complete mirror image of projecting v onto w, then multiplying the length of that projection by the length of w. Now, if you scale one of them, say v, by some constant like 2, that symmetry gets broken. But think through both readings: projecting the new vector 2v onto w gives a projection exactly twice as long as before, so that product doubles; and reading it the other way, w's projection onto the line through v doesn't change at all, but the length you multiply it by, the length of 2v, has doubled.
So the overall effect is still to just double the dot product. So even though symmetry is broken in this case, the effect that this scaling has on the value of the dot product is the same under both interpretations. There's also one other big question that confused me when I first learned this stuff.
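Both facts, that order doesn't matter and that scaling one vector scales the dot product, are easy to confirm numerically (illustrative vectors again):

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

v, w = [1, 2], [3, 4]
print(dot(v, w) == dot(w, v))             # True: order doesn't matter

v_scaled = [2 * a for a in v]             # scale v by 2
print(dot(v_scaled, w) == 2 * dot(v, w))  # True: the dot product doubles
w_scaled = [2 * b for b in w]
print(dot(v, w_scaled) == 2 * dot(v, w))  # True: same if w is the one scaled
```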
Why on earth does this numerical process of matching coordinates, multiplying pairs, and adding them together have anything to do with projection? Well, to give a satisfactory answer, and also to do full justice to the significance of the dot product, we need to unearth something a little bit deeper going on here, which often goes by the name duality. But before getting into that, I need to spend some time talking about linear transformations from multiple dimensions to one dimension, which is just the number line. These are functions that take in a 2D vector and spit out some number.
But linear transformations are of course much more restricted than your run-of-the-mill function with a 2D input and a 1D output. As with transformations in higher dimensions, like the ones I talked about in earlier chapters, there are formal properties that make these functions linear, but a useful visual test is that if you have a line of evenly spaced dots, a linear transformation will keep those dots evenly spaced once they land on the number line. Otherwise, if there's some line of dots that gets unevenly spaced, then your transformation is not linear. As with the cases we've seen before, one of these linear transformations is completely determined by where it takes i-hat and j-hat. But this time, each one of those basis vectors just lands on a number, so when we record where they land as the columns of a matrix, each of those columns just has a single number.
This is a 1x2 matrix. A consequence of linearity is that after the transformation, any vector with coordinates (x, y) lands on x times the number where i-hat lands, plus y times the number where j-hat lands. Notice that computing this is identical to multiplying a 1x2 matrix by the vector, and it's also identical to taking a dot product between two 2D vectors; that 1x2 matrix is just like a 2D vector tipped on its side. Since we're just looking at numerical expressions right now, going back and forth between vectors and 1x2 matrices might feel like a trivial move. But it suggests something that's really quite meaningful from the geometric view: some kind of connection between 2D vectors and linear transformations that take vectors to numbers.
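Numerically, the two operations really are the same computation, as this small Python sketch shows (the particular matrix and vector are made up for illustration):

```python
def apply_1x2_matrix(row, v):
    # A 1x2 matrix [a b] applied to a 2D vector (x, y) gives a*x + b*y.
    a, b = row
    x, y = v
    return a * x + b * y

def dot(v, w):
    return sum(p * q for p, q in zip(v, w))

row = (3, -2)   # a hypothetical 1x2 matrix: where i-hat and j-hat land
v = (4, 5)
print(apply_1x2_matrix(row, v), dot(row, v))  # the same number either way
```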
Let me show an example that clarifies the significance, and which just so happens to also answer the dot product puzzle from earlier. Unlearn what you have learned, and imagine that you don't already know that the dot product relates to projection. What I'm going to do here is take a copy of the number line and place it diagonally in space somehow, with the number 0 sitting at the origin.
Now think of the two-dimensional unit vector whose tip sits where the number 1 on that line is. I'll give that vector a name: u-hat.
If we project 2D vectors straight onto this diagonal number line, in effect, we've just defined a function that takes 2D vectors to numbers. What's more, this function is actually linear, since it passes our visual test that any line of evenly spaced dots remains evenly spaced once it lands on the number line. Just to be clear, even though I've embedded the number line in 2D space like this, the outputs of the function are numbers, not 2D vectors.
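The evenly-spaced-dots test can be run numerically. Here's a sketch that projects a line of dots onto a diagonal line using the standard coordinate formula for projection length (the direction, starting point, and spacing are all made-up values):

```python
import math

def proj_length(v, u):
    # Signed length of v's projection onto the line through u,
    # via the standard coordinate formula (used here only to run the check).
    return (v[0] * u[0] + v[1] * u[1]) / math.hypot(u[0], u[1])

u = (3.0, 4.0)                     # direction of the diagonal number line
p, step = (1.0, 1.0), (0.5, -0.2)  # a line of evenly spaced dots
outputs = [proj_length((p[0] + k * step[0], p[1] + k * step[1]), u)
           for k in range(5)]
gaps = [outputs[k + 1] - outputs[k] for k in range(4)]
print(gaps)  # all four gaps are equal: the dots stay evenly spaced
```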
You should think of a function that takes in two coordinates and outputs a single coordinate. But that vector u-hat is a two-dimensional vector, living in the input space. It's just situated in such a way that it overlaps with the embedding of the number line.
With this projection, we just defined a linear transformation from 2D vectors to numbers, so we're going to be able to find some kind of 1x2 matrix that describes that transformation. To find that 1x2 matrix, let's zoom in on this diagonal number line setup and think about where i-hat and j-hat each land, since those landing spots are going to be the columns of the matrix. This part's super cool.
We can reason through it with a really elegant piece of symmetry. Since i-hat and u-hat are both unit vectors, projecting i-hat onto the line passing through u-hat looks totally symmetric to projecting u-hat onto the x-axis. So when we ask, what number does i-hat land on when it gets projected, the answer's going to be the same as whatever u-hat lands on when it's projected onto the x-axis.
But projecting u-hat onto the x-axis just means taking the x-coordinate of u-hat. So by symmetry, the number where i-hat lands when it's projected onto that diagonal number line is going to be the x-coordinate of u-hat. Isn't that cool? The reasoning is almost identical for the j-hat case.
Think about it for a moment. For all the same reasons, the y-coordinate of u-hat gives us the number where j-hat lands when it's projected onto the number line copy. Pause and ponder that for a moment, I just think that's really cool.
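The symmetry argument can be double-checked with nothing but trigonometry, no dot products involved (the angle below is an arbitrary made-up choice):

```python
import math

theta = 0.7                                 # angle of the diagonal line
u_hat = (math.cos(theta), math.sin(theta))  # the unit vector along it

# i-hat makes angle theta with the line, so its projection has signed
# length cos(theta) -- which is exactly u-hat's x-coordinate.
i_lands_on = math.cos(theta)
# j-hat makes angle pi/2 - theta with the line, so it lands on
# cos(pi/2 - theta) = sin(theta) -- u-hat's y-coordinate.
j_lands_on = math.cos(math.pi / 2 - theta)

print(abs(i_lands_on - u_hat[0]) < 1e-12,
      abs(j_lands_on - u_hat[1]) < 1e-12)  # True True
```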
So the entries of the 1x2 matrix describing the projection transformation are going to be the coordinates of u-hat. And computing this projection transformation for arbitrary vectors in space, which requires multiplying that matrix by those vectors, is computationally identical to taking a dot product with u-hat. This is why taking the dot product with a unit vector can be interpreted as projecting a vector onto the span of that unit vector and taking the length.
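Here's a numeric sanity check of that claim: the coordinate recipe and a purely geometric projection computation (via angles, with no dot products) give the same number. The angle and test vector are arbitrary made-up values:

```python
import math

theta = 0.7
u_hat = (math.cos(theta), math.sin(theta))  # a unit vector
v = (3.0, 1.5)

# The numerical recipe: pair coordinates, multiply, add.
dot_value = v[0] * u_hat[0] + v[1] * u_hat[1]

# The geometric picture: length of v times cos of the angle between
# v and the line, i.e. the signed length of v's projection onto it.
angle_of_v = math.atan2(v[1], v[0])
projection_length = math.hypot(v[0], v[1]) * math.cos(angle_of_v - theta)

print(abs(dot_value - projection_length) < 1e-12)  # True
```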
So what about non-unit vectors? For example, let's say we take that unit vector u-hat, but we scale it up by a factor of 3. Numerically, each of its components gets multiplied by 3, so the matrix associated with this new vector is 3 times the matrix for u-hat. Since everything here is linear, this implies that taking a dot product with the scaled vector is the same as projecting onto that diagonal number line, then multiplying the length of the projection by 3. More generally, the dot product between any two vectors is the length of the projection of one onto the other, multiplied by the length of the vector being projected onto.
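That scaling behavior can also be verified numerically; in this sketch the angle, the scale factor 3, and the test vector are all made-up illustrative values:

```python
import math

theta = 0.7
u_hat = (math.cos(theta), math.sin(theta))
w = (3 * u_hat[0], 3 * u_hat[1])     # u-hat scaled up by a factor of 3
v = (3.0, 1.5)

dot_with_w = v[0] * w[0] + v[1] * w[1]

# Length of v's projection onto the line, found geometrically via angles:
angle_of_v = math.atan2(v[1], v[0])
projection_length = math.hypot(v[0], v[1]) * math.cos(angle_of_v - theta)

# Dotting with the scaled vector = projecting, then scaling by 3:
print(abs(dot_with_w - 3 * projection_length) < 1e-12)  # True
```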
We had a linear transformation from 2D space to the number line, which was not defined in terms of numerical vectors or numerical dot products, it was just defined by projecting space onto a diagonal copy of the number line. But because the transformation is linear, it was necessarily described by some 1x2 matrix. And since multiplying a 1x2 matrix by a 2D vector is the same as turning that matrix on its side and taking a dot product, this transformation was inescapably related to some 2D vector.
The lesson here is that any time you have one of these linear transformations whose output space is the number line, no matter how it was defined, there's going to be some unique vector v corresponding to that transformation, in the sense that applying the transformation is the same thing as taking a dot product with that vector. To me, this is utterly beautiful. It's an example of something in math called duality.
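This correspondence is easy to see in code: given any linear map to the number line, you recover its dual vector just by recording where i-hat and j-hat go. The particular transformation below, L(x, y) = 4x - 7y, is a made-up example:

```python
# A hypothetical linear transformation from 2D vectors to numbers,
# written without any mention of dot products:
def L(v):
    x, y = v
    return 4 * x - 7 * y

# Its dual vector: where the transformation sends i-hat and j-hat.
dual = (L((1, 0)), L((0, 1)))   # (4, -7)

# Applying the transformation is the same as dotting with that vector:
v = (2.5, 3.0)
print(L(v) == dual[0] * v[0] + dual[1] * v[1])  # True
```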
Duality shows up in many different ways and forms throughout math, and it's super tricky to actually define. Loosely speaking, it refers to situations where you have a natural but surprising correspondence between two types of mathematical thing. For the linear algebra case that you just learned about, you'd say that the dual of a vector is the linear transformation that it encodes.
And the dual of a linear transformation from some space to one dimension is a certain vector in that space. So to sum up, on the surface, the dot product is a very useful geometric tool for understanding projections and for testing whether or not vectors tend to point in the same direction. And that's probably the most important thing for you to remember about the dot product.
But at a deeper level, dotting two vectors together is a way to translate one of them into the world of transformations. Again, numerically this might feel like a silly point to emphasize. It's just two computations that happen to look similar.
But the reason I find this so important is that throughout math, when you're dealing with a vector, once you really get to know its personality, sometimes you realize that it's easier to understand it not as an arrow in space, but as the physical embodiment of a linear transformation. It's as if the vector is really just a conceptual shorthand for a certain transformation, since it's easier for us to think about arrows in space rather than moving all of that space to the number line. In the next video, you'll see another really cool example of this duality in action as I talk about the cross product.