“Send me your rotation

let's focus on good definitions...

Today, I'll explore a seemingly simple question: what is a rotation? I'm asking for the definition of a rotation, which is important because (it should go without saying) before we can say anything useful about rotations, we need to know what rotations are. Definitions are an integral part of math and whenever I get stuck while reading a paper, I can usually trace the issue to a misunderstanding of some definition. You've also probably heard someone say that math deals with abstractions. Well, definitions are where these abstractions take place. Definitions are mathematicians' attempts to boil ideas/questions/systems down to their essence; the fewer moving parts there are to think about, the easier it is to understand what is going on. There is still another advantage: the fewer things involved in a definition, the more applicable the definition becomes. So when I ask for the definition of a rotation, try to think of the most general description that can be applied to other settings, e.g., in three/four/five dimensions.

Fig 1. 2D and 3D Rotations

So let's start with a naive definition. In the plane, 2D, rotation involves fixing a point, called the fixed point or centre of rotation, and then rotating the rest of the points by some specified angle. We already have a problem! This is a circular definition as I'm using “rotating” to define “rotation” and don't you hate it when dictionaries do that? So how do we fix this? Notice that every point (except the fixed point) lies on a unique circle centred at the fixed point. “Rotating a point” means moving counterclockwise along this circle a distance proportional to the radius of the circle; the proportionality constant is also known as the angle of rotation. This definition works but it's kinda wordy and we'll soon find that it's not the simplest way to define a rotation.
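To make the circle picture concrete, here is a minimal sketch in Python. The function name `rotate_about` and the sample numbers are my own choices for illustration, and it borrows the standard cosine/sine formula for turning a point about a centre. It checks that the image stays on the same circle and that the angle swept equals the arc length divided by the radius.

```python
import math

def rotate_about(point, centre, angle):
    """Rotate `point` counterclockwise about `centre` by `angle` radians."""
    px, py = point
    cx, cy = centre
    dx, dy = px - cx, py - cy              # position relative to the centre
    c, s = math.cos(angle), math.sin(angle)
    return (cx + c * dx - s * dy, cy + s * dx + c * dy)

centre = (1.0, 2.0)
point = (4.0, 6.0)                         # lies on the circle of radius 5
angle = math.pi / 3

image = rotate_about(point, centre, angle)

# The image stays on the same circle around the centre.
radius_before = math.dist(point, centre)
radius_after = math.dist(image, centre)

# Distance travelled along the circle: proportional to the radius,
# with the angle as the proportionality constant.
arc_length = angle * radius_before

# Angle swept as seen from the centre (no wrap-around for these values).
swept = (math.atan2(image[1] - centre[1], image[0] - centre[0])
         - math.atan2(point[1] - centre[1], point[0] - centre[0]))
```

Dividing `arc_length` by `radius_before` recovers `angle`, whichever circle the point started on.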

What happens in 3-space? Naively, rotation involves fixing a straight line, the axis of rotation, and then rotating the rest of the points. Okay, blah blah circular definition blah blah. But every point not on the axis of rotation lies on a unique plane perpendicular to the axis of rotation. We know how to rotate planes (previous paragraph) and so we rotate these parallel planes the same angle about (the intersections with) the axis. We need to make sure we have a consistent way of deciding what it means to move counterclockwise on these planes. This amounts to choosing which side represents the “top” of the planes. The definition is no longer circular and is instead recursive since rotating planes had to be defined first. Just as before, this definition works but can be greatly simplified.

Fig 2. rotation + rotation = rotation?
If these convoluted definitions were not enough to convince you to search for a better definition, here's an interesting property of rotations: if you have two intersecting axes of rotation and you rotate about one then the other, the overall effect on your space (the composition of rotations) is also a rotation. This is not obvious from the lengthy definition of rotations -- why are all the conditions in the definition still satisfied? That is: why is there a fixed straight line, with the planes perpendicular to it all rotated by the same angle about that line? Part of the difficulty lies in the fact that the definition is sort of hiding the “real” information needed to prove the property.

In both cases, 2D and 3D, you can directly verify that rotating preserves distances between points: if two points are 3 units apart, they remain 3 units apart after rotating. Actions that preserve distances are known as isometries. Thus rotations are isometries and it is easy to verify that the composition of isometries is itself an isometry. A second property of rotations that is hard to verify with the given definition but is still very intuitive is orientation-preservation: a rotated left hand will always look like a left hand. In contrast, reflections are isometries but are not orientation-preserving: a reflected left hand looks like a right hand. Lastly, when you compose two rotations with intersecting axes, the intersection point, which is fixed by each rotation, is a fixed point of the composition. So we have that the composition of two rotations with intersecting axes is an orientation-preserving isometry with a fixed point. In 1775, Euler used spherical geometry to prove his rotation theorem:
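Here is a small sketch of both properties in the plane, under my own assumptions: the triangle plays the role of the "hand", and the sign of its signed area is used as a stand-in for orientation. The helper names are made up for this illustration.

```python
import math

def rotate(p, angle):
    """Rotate p counterclockwise about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def reflect_x(p):
    """Reflect p across the x-axis."""
    return (p[0], -p[1])

def side_lengths(tri):
    a, b, c = tri
    return [math.dist(a, b), math.dist(b, c), math.dist(c, a)]

def signed_area(a, b, c):
    """Positive when a, b, c are listed counterclockwise, negative otherwise."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))

triangle = [(0.0, 0.0), (3.0, 0.0), (0.0, 1.0)]
rotated = [rotate(p, 0.7) for p in triangle]
reflected = [reflect_x(p) for p in triangle]

# Both maps preserve the side lengths (they are isometries) ...
print(side_lengths(triangle), side_lengths(rotated), side_lengths(reflected))
# ... but only the rotation keeps the sign of the area (the orientation).
print(signed_area(*triangle), signed_area(*rotated), signed_area(*reflected))
```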

Euler. Any orientation-preserving isometry of 3-space with a fixed point has a fixed axis.

You can directly verify that: orientation-preserving isometries of the plane with fixed points are rotations. This combined with Euler's rotation theorem allows us to extend to 3D: orientation-preserving isometries of 3-space with fixed points are rotations. In both 2D and 3D, being a rotation is the same as (equivalent to) being an orientation-preserving isometry with a fixed point. This characterization is a lot simpler than talking about concentric circles and parallel planes and will be our new definition.

Definition 1. A rotation is an orientation-preserving isometry with a fixed point.

This definition says nothing about the plane or 3-space and can thus be used to define rotation in higher dimensions. Just imagine how difficult it would be to generalize the previous definitions to higher dimensions. In fact, you would have been in a little trouble if you had defined 4D-rotation as having a fixed plane (plane of rotation?) about which “perpendicular” planes are rotated (analogous to the original definition for 3D-rotations): in this case, we lose the property that the composition of rotations with intersecting planes of rotation is itself a 4D-rotation. Under our new definition, a rotation will not necessarily have a fixed plane of rotation. If it has one, it will be called a simple rotation. The definition applies to higher dimensions with one caveat: we need a proper definition of what it means to be orientation-preserving in these spaces; I will return to this in a bit when I discuss the connection to matrices.

Fig 3. Left: a hyperbolic rotation according to new definition. Right: a hyperbolic rotation according to original definition

Fig. 3 is my last example of why the new definition is better. This whole time I've been talking about rotating the plane/3-space without actually specifying what geometry I am using! You most probably assumed I was talking about the Euclidean plane/3-space and you were right. But what if I want to deal with non-Euclidean spaces like the hyperbolic plane/3-space? In this case, the original definition breaks one of the nice properties. If you “rotate” the hyperbolic plane by moving points along concentric circles distances proportional to the radii, then you no longer have an isometry: some points are stretched apart while others are squeezed closer together. This is because the angle in the hyperbolic plane is the ratio of a circular arc's length to the hyperbolic sine of the radius. On the other hand, an angle in the Euclidean plane is the ratio of a circular arc's length to just the radius. Fortunately, the new definition of rotation doesn't actually care if we're talking about hyperbolic or Euclidean space -- it's space-agnostic, if you will.

The downside of this definition is it doesn't help when describing a particular rotation. Based on the definition, one would think that a fixed point is the only thing you need to differentiate rotations; but what if two rotations share a fixed point? And what can we say about their composition? Since I want to focus on these questions, all rotations mentioned from here on out will share a fixed point, call it the origin. For 2D-rotations, we can differentiate rotations (that fix the origin) by their angles of rotation and the sum of these angles describes their composition. For 3D-rotations, we can differentiate rotations by the axes and angles of rotation but it is difficult to determine axis/angle of rotation for their composition. So what other description can we use?

Fortunately for us, any isometry of Euclidean space (thought of as a vector space) that fixes the origin is a linear map (can you show why?) and linear maps can be described by matrices. Fix a cartesian coordinate system for your space, then the images of the standard unit vectors under the isometry give you the unique columns of the matrix. Since isometries preserve distances (and angles), the columns of the matrix are perpendicular/orthogonal unit vectors and you have an orthogonal matrix. This also allows us to give a general definition of orientation-preserving:

Definition 2. An isometry of Euclidean space (that fixes the origin) is orientation-preserving (hence a rotation) if the determinant of the corresponding orthogonal matrix is \( +1. \) Such a matrix is called a special orthogonal matrix.

This correspondence goes the other way as well. Any (special) orthogonal matrix defines a unique linear map that is an (orientation-preserving) isometry of Euclidean space that fixes the origin (a rotation). Thus, we have a bijection between the group of rotations about the origin and the special orthogonal group, \( SO(n) \). Even better, composing rotations corresponds to matrix multiplication. So we in fact have a group isomorphism! Special orthogonal matrices are a nice description of rotations since: 1) we know how to multiply matrices to find the description of the composition of rotations; 2) it generalizes to any number of dimensions. In fact, this description is good enough to be an alternative definition for rotations (of Euclidean space that fix the origin) but I won't do that (due to the previous parenthetical limitation).
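As a rough numerical illustration (a sketch, not a proof), the snippet below builds two 3D rotations about the coordinate axes, multiplies their matrices, and checks that the product is again special orthogonal and acts like "first one rotation, then the other". The helper names (`rot_z`, `rot_x`, `is_special_orthogonal`) and the sample angles are my own assumptions.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def is_special_orthogonal(A, tol=1e-12):
    """Columns are orthonormal (A^T A = I) and the determinant is +1."""
    AtA = matmul(transpose(A), A)
    orthogonal = all(abs(AtA[i][j] - (1.0 if i == j else 0.0)) < tol
                     for i in range(3) for j in range(3))
    return orthogonal and abs(det3(A) - 1.0) < tol

A, B = rot_z(0.4), rot_x(1.3)
AB = matmul(A, B)                       # the matrix of "first B, then A"

v = [1.0, 2.0, 3.0]
step_by_step = apply(A, apply(B, v))    # rotate by B, then by A
in_one_go = apply(AB, v)                # apply the product matrix once

reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
print(is_special_orthogonal(AB), is_special_orthogonal(reflection))
```

The reflection passes the orthogonality check but has determinant \( -1 \), which is why it is rejected.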

How does the matrix description compare to the angle of rotation description? In 2D, a rotation with angle of rotation \( \theta \) sends the point \( (1,0) \mapsto (\cos \theta, \sin \theta) \) and \( (0,1) \mapsto (-\sin \theta, \cos \theta) \). So the rotation corresponds to the matrix: \( \begin{pmatrix}\cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix} \). Previously, I said the composition of two 2D-rotations can be described by just adding the angles of rotation. This manifests itself in matrices through the following identity: \[ \begin{pmatrix}\cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \end{pmatrix}\begin{pmatrix}\cos \beta & -\sin \beta \\ \sin \beta & \cos \beta \end{pmatrix} = \begin{pmatrix}\cos(\alpha + \beta) & -\sin(\alpha + \beta) \\ \sin(\alpha + \beta) & \cos(\alpha + \beta) \end{pmatrix} \] and these are just the angle-sum formulas in disguise.
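The identity can be spot-checked numerically. This is only a sketch for one arbitrary pair of angles; `rot2` and `matmul2` are names I made up for it.

```python
import math

def rot2(t):
    """2x2 rotation matrix for angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

alpha, beta = 0.6, 1.9
product = matmul2(rot2(alpha), rot2(beta))   # compose the two rotations
direct = rot2(alpha + beta)                  # rotate by the summed angle

# Largest entrywise difference; should be at rounding-error level.
max_diff = max(abs(product[i][j] - direct[i][j])
               for i in range(2) for j in range(2))
print(max_diff)
```

The top-left entry of `product` is \( \cos\alpha\cos\beta - \sin\alpha\sin\beta \), which is exactly the angle-sum formula for \( \cos(\alpha + \beta) \).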

In 3D, it's not as straightforward to switch from axis/angle of rotation description to a matrix or vice-versa. First, I'll do this for the rotation depicted in Fig. 2 and then I'll give an outline of what to do in general. The rotation in Fig. 2 sends the \( x \)-axis to the \( y \)-axis, the \( y \)-axis to the \( z \)-axis, and the \( z \)-axis to the \( x \)-axis. This means it corresponds to the matrix \( M = \begin{pmatrix}0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \). For the axis/angle-description:

  • the picture shows that the axis of rotation is the line \( (x = y = z) \);
  • applying the rotation three times gives the identity, so the angle of rotation must be \( 120^\circ \) .

But can I find the axis/angle of rotation using linear algebra? Since axes of rotation are fixed lines, they correspond to eigenvectors with eigenvalue \( =1 \). The matrix \( M \) has eigenvector \( \vec n = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \) and, indeed, this vector spans the line \( (x=y=z) \). Furthermore, the angle of rotation can be found by studying what the matrix does to vectors orthogonal to \( \vec n \). For example, the vector \( \vec u = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} \) is mapped to \( \vec v = M \vec u = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} \). Using the dot-product, it follows that the angle between \( \vec u \) and \( \vec v \) is \( 120^\circ \). Alternatively, I can find the angle of rotation by applying a change of basis, the most fundamental tool of linear algebra. Basically, I'm introducing a new cartesian coordinate system \( x'y'z' \), where \( z' \) is parallel to \( \vec n \), \( x' \) to \( \vec u \), and \( y' \) to \( \vec w = \vec n \times \vec u \). The unit vectors of \( \vec u, \vec w, \vec n \) give the columns of the change of basis matrix: \[ C = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ 0 & -\frac{2}{\sqrt{6}} & \frac{1}{\sqrt{3}} \end{pmatrix} \] To change basis, conjugate \( M \) with \( C \): \[ M' = C^{-1} M C = \begin{pmatrix}-\frac{1}{2} & -\frac{\sqrt{3}}{2} & 0 \\ \frac{\sqrt{3}}{2} & -\frac{1}{2} & 0 \\ 0 & 0 & 1\end{pmatrix} =\begin{pmatrix}\cos 120^\circ & -\sin 120^\circ & 0 \\ \sin 120^\circ & \cos 120^\circ & 0 \\ 0 & 0 & 1\end{pmatrix} \] The matrix \( M' \) tells us that the \( z' \)-axis is fixed and the \( x'y' \)-plane is rotated \( 120^\circ \).
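The computation above can be replayed in a few lines. This is a sketch using plain Python lists; the helper names are mine, and it relies on the fact that \( C \) has orthonormal columns, so \( C^{-1} \) is just the transpose of \( C \).

```python
import math

M = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

n = [1, 1, 1]
u = [1, -1, 0]

n_image = apply(M, n)      # n is fixed, so it spans the axis of rotation
v = apply(M, u)            # image of a vector orthogonal to the axis

# Angle between u and its image, from the dot product.
cos_theta = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
angle_deg = math.degrees(math.acos(cos_theta))

# Change of basis: columns are the unit vectors of u, w = n x u, and n.
r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
C = [[ 1 / r2,  1 / r6, 1 / r3],
     [-1 / r2,  1 / r6, 1 / r3],
     [ 0.0,    -2 / r6, 1 / r3]]
M_prime = matmul(transpose(C), matmul(M, C))   # C^{-1} M C with C^{-1} = C^T
```

One caveat: the dot product only gives the size of the angle. The direction of the turn relative to \( \vec n \) is read off from the sign of the \( \sin \) entries in `M_prime`.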

In general, here's how to switch from axis/angle of rotation description to a matrix. Use the angle of rotation, \( \theta \), to get the matrix \( M' = \begin{pmatrix}\cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1\end{pmatrix} \). Then let the axis of rotation be the \( z' \)-axis of a new \( x'y'z' \)-cartesian system, which gives you a change of basis matrix \( C \). Finally, the matrix corresponding to the rotation is \( M = C M' C^{-1} \). \( M' \) is called the canonical form of the matrix \( M \) (after change of basis).
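Here is one way the recipe could be coded, as a sketch under stated assumptions. The function `rotation_matrix` and its way of picking the \( x' \)-direction (project a coordinate axis onto the plane orthogonal to the rotation axis) are my own choices; any right-handed orthonormal system with \( z' \) along the axis would do. Since \( C \) is orthogonal, \( C^{-1} \) is computed as a transpose.

```python
import math

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def rotation_matrix(axis, theta):
    """Matrix of the rotation by theta (radians) about the line spanned by axis."""
    n = normalize(axis)                            # z'-direction
    # Pick the coordinate axis least aligned with n, then remove its
    # component along n to get a unit vector u orthogonal to n (x'-direction).
    k = min(range(3), key=lambda i: abs(n[i]))
    helper = [1.0 if i == k else 0.0 for i in range(3)]
    along = sum(h * x for h, x in zip(helper, n))
    u = normalize([h - along * x for h, x in zip(helper, n)])
    w = cross(n, u)                                # y'-direction
    C = transpose([u, w, n])                       # columns u, w, n
    c, s = math.cos(theta), math.sin(theta)
    M_prime = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    return matmul(matmul(C, M_prime), transpose(C))   # C M' C^{-1}

# Axis x = y = z with angle 120 degrees should recover the matrix of Fig. 2.
M = rotation_matrix([1.0, 1.0, 1.0], 2 * math.pi / 3)
```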

In 4D, it can be shown using linear algebra that any special orthogonal matrix \( M \) will have a canonical form \[ M' = C^{-1} M C =\begin{pmatrix}\cos \theta_1 & -\sin \theta_1 & 0 & 0 \\ \sin \theta_1 & \cos \theta_1 & 0 & 0 \\ 0 & 0 & \cos \theta_2 & -\sin \theta_2 \\ 0 & 0 & \sin\theta_2 & \cos\theta_2 \end{pmatrix} \] What this means is that a 4D-rotation can be described by introducing a new cartesian system \( x'y'z'w' \) in which the rotation rotates the \( x'y' \)-plane and \( z'w' \)-plane by \( \theta_1 \) and \( \theta_2 \), respectively. So unlike 2D/3D-rotations that only require one angle of rotation, 4D-rotations are described by two. Also, 4D-rotations don't generally have axes of rotation. In fact, if neither angle of rotation is \( 0 \) then the rotation has a unique fixed point; if only one of the angles is \( 0 \) then the rotation has a fixed plane and is a simple rotation; therefore, the set of fixed points for a (nontrivial) 4D-rotation is either a point (0D) or a plane (2D). Compare this with (nontrivial) 2D and 3D rotations, where the set of fixed points is a point (0D) and a line (1D), respectively.
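To see the two cases side by side, here is a small sketch that builds the 4D canonical form and lists which coordinate directions are left alone. The names `canonical_4d` and `fixed_basis_vectors` and the sample angles are assumptions of mine. For a matrix already in this block form, the fixed basis vectors span the whole set of fixed points, so counting them gives its dimension.

```python
import math

def canonical_4d(theta1, theta2):
    """Rotate the x'y'-plane by theta1 and the z'w'-plane by theta2."""
    c1, s1 = math.cos(theta1), math.sin(theta1)
    c2, s2 = math.cos(theta2), math.sin(theta2)
    return [[c1, -s1, 0.0, 0.0],
            [s1,  c1, 0.0, 0.0],
            [0.0, 0.0, c2, -s2],
            [0.0, 0.0, s2,  c2]]

def apply(A, v):
    n = len(A)
    return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

def fixed_basis_vectors(A, tol=1e-12):
    """Indices k such that the k-th standard basis vector is unchanged by A."""
    n = len(A)
    fixed = []
    for k in range(n):
        e = [1.0 if i == k else 0.0 for i in range(n)]
        image = apply(A, e)
        if all(abs(image[i] - e[i]) < tol for i in range(n)):
            fixed.append(k)
    return fixed

double_rotation = canonical_4d(0.5, 1.2)   # both angles nonzero
simple_rotation = canonical_4d(0.5, 0.0)   # only one angle nonzero

print(fixed_basis_vectors(double_rotation))   # no direction is fixed
print(fixed_basis_vectors(simple_rotation))   # the z'- and w'-directions are fixed
```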

In 5D, the canonical form of special orthogonal matrices is: \[ \begin{pmatrix}\cos \theta_1 & -\sin \theta_1 & 0 & 0 & 0 \\ \sin \theta_1 & \cos \theta_1 & 0 & 0 & 0\\ 0 & 0 & \cos \theta_2 & -\sin \theta_2 & 0 \\ 0 & 0 & \sin\theta_2 & \cos\theta_2 & 0 \\ 0 & 0 & 0 & 0 & 1\end{pmatrix} \] So general 5D-rotations have two angles of rotation and (at least) an axis of rotation. The set of fixed points of a (nontrivial) 5D-rotation is either a line (1D) or 3-space (3D). This can be generalized to higher dimensions to show that a rotation in \( 2n \) or \( 2n+1 \) dimensions will have \( n \) angles of rotation and if you're in even/odd dimensions then the set of fixed points will have even/odd dimension, respectively. This also allows us to generalize the definition of simple rotations to rotations where all but one of the angles of rotation are \( 0 \). Therefore, by the canonical form, \( (2n)D \)- or \( (2n+1)D \)-rotations are compositions of \( n \) simple rotations.
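The last claim can be illustrated with a short sketch. The helper `canonical_form` is a hypothetical name of mine; it builds the block-diagonal matrix for any list of angles, and the snippet then checks in 5D that the full canonical form is the product of the two simple rotations obtained by zeroing all but one angle.

```python
import math

def canonical_form(angles, dim):
    """Block-diagonal matrix with one 2x2 rotation block per angle.

    dim must be 2n or 2n+1 for n angles; in the odd case the last
    diagonal entry stays 1 (the fixed axis).
    """
    n = len(angles)
    if dim not in (2 * n, 2 * n + 1):
        raise ValueError("dim must be 2n or 2n+1 for n angles")
    M = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for k, t in enumerate(angles):
        i = 2 * k
        c, s = math.cos(t), math.sin(t)
        M[i][i], M[i][i + 1] = c, -s
        M[i + 1][i], M[i + 1][i + 1] = s, c
    return M

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

full = canonical_form([0.3, 1.1], 5)
simple_1 = canonical_form([0.3, 0.0], 5)    # only the first plane turns
simple_2 = canonical_form([0.0, 1.1], 5)    # only the second plane turns
product = matmul(simple_1, simple_2)

max_diff = max(abs(full[i][j] - product[i][j])
               for i in range(5) for j in range(5))
print(max_diff)
```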

Before ending the post, I would like to briefly mention another elegant description. Any 2D-rotation about the origin can be represented by complex multiplication with a number on the unit circle, treating the plane as the complex plane. Composing two rotations then corresponds to multiplying two complex numbers on the unit circle, which in turn corresponds to adding their arguments/angles as described earlier. Other than a change in perspective, there's not much gained from this compared to the angle or matrix descriptions. However, one might wonder if there's a way to represent higher-dimensional rotations as multiplications in more complicated number systems.
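In Python this takes only a few lines, since complex numbers are built in. A minimal sketch, with `rotate` being my own helper name:

```python
import cmath
import math

def rotate(point, angle):
    """Rotate a point of the plane about the origin via complex multiplication."""
    z = complex(point[0], point[1])
    unit = cmath.exp(1j * angle)       # the number on the unit circle with this angle
    w = unit * z
    return (w.real, w.imag)

quarter_turn = rotate((1.0, 0.0), math.pi / 2)   # approximately (0, 1)

# Composing rotations = multiplying unit complex numbers = adding angles.
alpha, beta = 0.6, 1.9
composed = cmath.exp(1j * alpha) * cmath.exp(1j * beta)
summed = cmath.exp(1j * (alpha + beta))
print(abs(composed - summed))          # rounding-error sized
```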

In 3D, we can extend the complex number system to the quaternions by adding two more imaginary numbers. A “quaternion number” consists of 4 real numbers (like how complex numbers consist of 2 real numbers) which can be used to represent the axis and angle of rotation. Compare this to the 9 real numbers needed to represent a rotation as a matrix. Composing rotations then corresponds to multiplying the quaternion numbers, which is more efficient than matrix multiplication. There are more efficiency considerations that make quaternions the preferred description of 3D-rotations in real-world applications. Quaternions can also model 4D-rotations but we can't extend the number system to generalize this idea to higher dimensions.
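For the curious, here is a bare-bones sketch of how this works. The post doesn't spell out the formulas, so the following uses the standard conventions: a rotation by \( \theta \) about a unit axis \( \vec n \) is stored as the unit quaternion \( (\cos\tfrac{\theta}{2}, \sin\tfrac{\theta}{2}\,\vec n) \), and a vector \( \vec v \) is rotated by computing \( q v q^* \). The function names are mine.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conjugate(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def from_axis_angle(axis, theta):
    """Unit quaternion for the rotation by theta (radians) about `axis`."""
    length = math.sqrt(sum(x * x for x in axis))
    s = math.sin(theta / 2) / length
    return (math.cos(theta / 2), s * axis[0], s * axis[1], s * axis[2])

def rotate(q, v):
    """Rotate the 3D vector v by the unit quaternion q via q v q*."""
    as_quaternion = (0.0, v[0], v[1], v[2])
    _, x, y, z = qmul(qmul(q, as_quaternion), conjugate(q))
    return (x, y, z)

# 120 degrees about the line x = y = z: sends the x-axis to the y-axis.
q = from_axis_angle((1.0, 1.0, 1.0), 2 * math.pi / 3)
image = rotate(q, (1.0, 0.0, 0.0))

# Composition: a quarter turn about z, then a quarter turn about x.
qz = from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
qx = from_axis_angle((1.0, 0.0, 0.0), math.pi / 2)
composed = qmul(qx, qz)                 # "first qz, then qx"
composed_image = rotate(composed, (1.0, 0.0, 0.0))
```

Note the order in `qmul(qx, qz)`: just like with matrices, the rotation applied first sits on the right.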

If you found this post interesting, here's an exercise for you to repeat most of it but with translations instead.

Exercise 1. What is a (Euclidean) translation? What properties do translations have? Does your definition easily generalize to more dimensions? How can you best describe translations? Are there any other types of orientation-preserving isometries of Euclidean space?

Or if you're familiar with hyperbolic space,

Exercise 2. Can you come up with descriptions of rotations of hyperbolic space (with a shared fixed point) that apply to any number of dimensions?

I'd like to thank Kristen for her feedback. The first draft was a mess, all over the place.