The Jacobian & Determinants

By Jean Pierre Mutanguha

Also verify that the determinant computes volume of parallelotopes.

Years ago I used the Jacobian determinant to find the area of an astroid and made a note about finding out where it came from. There was a class on differential topology last fall in which it may have been proven but, alas, I was too busy to take it. However, I am teaching Calc 3 this summer and I got curious about it once again. The textbook has an exercise that roughly outlines why we use the Jacobian determinant. This investigation led to another interesting fact: determinants of matrices calculate area/volume/etc. Incidentally, this is the last week of Calc 3 and I'm about to teach the Divergence Theorem. It just occurred to me that one can potentially find a connection between it and the Jacobian, so I will look into it for next week; for now, here is what I've managed to figure out so far.

Suppose you have a region \( R \) in the \( xy \)-plane and a map \( \varphi \) from the \( xy \)-plane to the \( uv \)-plane which is a homeomorphism when restricted to \( R \); let \( R' =\varphi(R) \). (See image.)

Fig 1. Transformation \( \varphi \)

The map \( \varphi \) can be represented as a pair of real-valued functions:

\[ \varphi : \left\{ \begin{aligned} u &= u(x, y) \\ v &= v(x,y) \end{aligned}\right. \]

Partition \( R \) into small rectangles and let's focus on one of them, with lower-left corner \( (a,b) \) and length (height resp.) of \( \Delta x \) (\( \Delta y \) resp.). If the rectangle is small enough, then its image under \( \varphi \) is approximately a parallelogram with vertices at \[ \begin{aligned} \varphi(a,b) &=\left( u(a,b), v(a,b) \right) \\ \varphi(a+\Delta x, b) &= \left( u(a+\Delta x, b), v(a+\Delta x, b) \right) \\ \varphi(a +\Delta x, b + \Delta y) &= \left( u(a+\Delta x, b+\Delta y), v(a+\Delta x, b+\Delta y) \right) \\ \varphi(a, b + \Delta y) &= \left( u(a, b+\Delta y), v(a, b+\Delta y)\right) \end{aligned} \] What is the area of this parallelogram? Using the linear approximations of \( u \) and \( v \), we have: \[ \begin{aligned} u(a+\Delta x, b) - u(a,b) &\approx \left.\frac{\partial u}{\partial x}\right|_{(a,b)}\Delta x \\ v(a+\Delta x, b) - v(a,b) &\approx \left.\frac{\partial v}{\partial x}\right|_{(a,b)}\Delta x \\ u(a, b+\Delta y) - u(a,b) &\approx \left.\frac{\partial u}{\partial y}\right|_{(a,b)}\Delta y \\ v(a, b+\Delta y) - v(a,b) &\approx \left.\frac{\partial v}{\partial y}\right|_{(a,b)}\Delta y \end{aligned} \]

So the “parallelogram” sides can be described by the vectors \[ \left< \frac{\partial u}{\partial x}, \frac{\partial v}{\partial x}\right> \Delta x, \qquad \left< \frac{\partial u}{\partial y}, \frac{\partial v}{\partial y}\right> \Delta y \] with all partial derivatives evaluated at \( (a,b) \), and its area is given by the determinant defined by the two vectors: \[ \Delta A_{uv} \approx \pm \begin{vmatrix} \frac{\partial u}{\partial x} & \frac{\partial v}{\partial x} \\ \frac{\partial u}{\partial y} & \frac{\partial v}{\partial y} \end{vmatrix}\Delta x \Delta y \] where the sign is chosen to make the area nonnegative. Writing \( J \) for this determinant (the Jacobian determinant of \( \varphi \) at \( (a,b) \)), we get \( dA_{uv} \approx |J|\, dA_{xy}. \quad \Box \)
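The approximation above can be spot-checked numerically. The sketch below is only an illustration under my own assumptions: the map \( \varphi(x,y) = (x^2 - y^2,\, 2xy) \), the sample point, and all function names are hypothetical choices, not taken from the textbook exercise. It joins the images of the four corners by straight segments, so it measures the approximating quadrilateral rather than the true curved image.

```python
def phi(x, y):
    """Hypothetical example map (x, y) -> (u, v) = (x^2 - y^2, 2xy)."""
    return (x * x - y * y, 2.0 * x * y)


def jacobian_det(x, y):
    """Jacobian determinant u_x * v_y - u_y * v_x of the example map."""
    u_x, u_y = 2.0 * x, -2.0 * y
    v_x, v_y = 2.0 * y, 2.0 * x
    return u_x * v_y - u_y * v_x


def shoelace_area(points):
    """Unsigned area of a polygon whose vertices are listed in order."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def image_area_ratio(a, b, dx, dy):
    """Area of the quadrilateral through the four image corners, divided by dx * dy."""
    corners = [(a, b), (a + dx, b), (a + dx, b + dy), (a, b + dy)]
    image = [phi(x, y) for x, y in corners]
    return shoelace_area(image) / (dx * dy)


if __name__ == "__main__":
    for step in (1e-1, 1e-2, 1e-3):
        print(step, image_area_ratio(1.0, 0.5, step, step), abs(jacobian_det(1.0, 0.5)))
```

As the rectangle shrinks, the printed ratio should approach \( |J| \) at the chosen point, which is what \( dA_{uv} \approx |J|\, dA_{xy} \) predicts.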

Fig 2. What could go wrong with the approximation?

The first crucial idea is that the parallelogram is a nice approximation of the image of a small enough rectangle. While we can always approximate the image by a parallelogram, the hope is that the approximation has a negligible error when we're taking the Riemann sum to compute \( \displaystyle \iint_{R'} f(u,v)\, dA_{uv}. \) This is where some hypotheses on \( \varphi \) aside from being a homeomorphism on \( R \) may be needed. This is an analysis (or differential topology?) question that I'm just not ready to tackle yet; hopefully, the Divergence Theorem will enlighten me.

The second key idea is that the determinant computes the area of my parallelogram. To be able to change variables in \( 3D \), I'd need the determinant to compute the volume of a parallelepiped, and similarly for higher dimensions. So to this effect, I need to convince myself that the \( (n \times n) \)-determinant computes the \( n \)-volume of an \( n \)-parallelotope.

Let \( V = {\mathbb R}^n \) be a vector space over \( {\mathbb R} \). I'm going to define a function \( d: \mathrm{End}_{\mathbb R}(V) \to {\mathbb R} \) which takes a linear transformation \( \phi : V \to V \) and assigns it a value \( d(\phi) \in {\mathbb R} \). The value \( d(\phi) \) will measure how much \( \phi \) scales volumes of parallelotopes, and it turns out that \( d(\phi) = |\det (\phi)|. \) I will use the rational canonical form (RCF) to establish the connection. We proved RCF in my Abstract Algebra class last year; it is a consequence of the structure theorem for finitely generated modules over a PID.

For the rest of this post, I'm going to assume without proof that shearing and reflecting (swapping basis vectors) preserve volume. One may take this as part of the definition of volume in a vector space. See the images below for the motivation behind this assumption:

Fig 3. Shearing in 2D.

Let \( \{ \hat e_1, \ldots, \hat e_n \} \subset V \) be a basis and, for nonzero elements \( s_1, \ldots, s_n \in {\mathbb R} \), define \[ vol(s_1 \hat e_1, \ldots, s_n \hat e_n) := |s_1 \cdots s_n| \blacksquare \] I am defining relative volume with respect to (w.r.t.) the basis, and \( (s\,\blacksquare) \) means: \( s \) times the volume of the parallelotope whose sides are the basis vectors \( \{ \hat e_1, \ldots, \hat e_n \}. \) Evidently, scaling the sides of an \( n \)-cube should scale its \( n \)-volume by the absolute value of the product of the scalars.

Given \( n \) linearly independent vectors \( \vec v_i \in V \), let \( S: V \to V \) be a composition of shear maps such that \( S \vec v_i \) is parallel to \( \hat e_i \) for all \( i \), and define \[ vol( \vec v_1, \ldots, \vec v_n ) := vol( S\vec v_1, \ldots, S \vec v_n) \] Since shearing does not affect volume, \( vol \) is well-defined. Extend \( vol \) to all \( n \)-tuples of vectors by setting \( vol( \vec v_1, \ldots, \vec v_n) = 0 \) if the vectors are linearly dependent.
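To see the definition at work, here is a small worked example in the plane. It is only a sketch under my own assumptions: \( \hat e_1, \hat e_2 \) is the standard basis, the entries \( a, b, c, d \) are hypothetical names, and \( a \ne 0 \), \( ad - bc \ne 0 \). Take \( \vec v_1 = \left< a, b \right> \) and \( \vec v_2 = \left< c, d \right> \) and apply two shear maps in turn: \[ \begin{aligned} S_1 &: (x, y) \mapsto \left( x,\; y - \tfrac{b}{a}\, x \right), & S_1 \vec v_1 &= \left< a, 0 \right>, & S_1 \vec v_2 &= \left< c,\; d - \tfrac{bc}{a} \right> \\ S_2 &: (x, y) \mapsto \left( x - t\, y,\; y \right), \; t = \tfrac{ac}{ad - bc}, & S_2 S_1 \vec v_1 &= \left< a, 0 \right>, & S_2 S_1 \vec v_2 &= \left< 0,\; d - \tfrac{bc}{a} \right> \end{aligned} \] With \( S = S_2 \, S_1 \), the images are parallel to \( \hat e_1 \) and \( \hat e_2 \) respectively, so \[ vol(\vec v_1, \vec v_2) = vol(S \vec v_1, S \vec v_2) = \left| a \left( d - \tfrac{bc}{a} \right) \right| \blacksquare = |ad - bc| \, \blacksquare, \] which is the familiar \( 2 \times 2 \) determinant up to sign.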

For any linear transformation \( \phi: V \to V, \) define \( d(\phi) = vol(\phi \hat e_1, \ldots, \phi \hat e_n)/\blacksquare \). Note that as \( vol \) is a scalar multiple of \( \blacksquare \), \( d(\phi) \) is a real number.

Theorem 1. \( vol(\phi \vec v_1, \ldots, \phi \vec v_n) = d(\phi) \cdot vol(\vec v_1, \ldots, \vec v_n) \) for all \( n \)-tuples of vectors in \( V \)

Corollary 2. \( d(\phi_1 \, \phi_2) = d(\phi_1) \, d(\phi_2) \), i.e., \( d \) is multiplicative

Corollary 3. \( d(\phi) \) is basis-independent. Equivalently, \( d \) is preserved under conjugation.
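Corollary 2 can be spot-checked numerically. The sketch below is not part of the argument: it assumes the identity \( d(\phi) = |\det(\phi)| \) that this post argues for further on, and the matrices `phi_1`, `phi_2`, and `shear` are arbitrary examples of mine. Exact fractions are used so the comparisons are not blurred by rounding.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def d(matrix):
    """Stand-in for d(phi), ASSUMING d(phi) = |det(phi)|."""
    return abs(det(matrix))


phi_1 = [[2, 1, 0], [0, 3, 1], [1, 0, 1]]
phi_2 = [[1, 4, 0], [0, 1, 0], [2, 0, 5]]
shear = [[1, 7, 0], [0, 1, 0], [0, 0, 1]]  # adds 7 * (second coordinate) to the first

if __name__ == "__main__":
    print(d(matmul(phi_1, phi_2)), d(phi_1) * d(phi_2))
    print(d(shear))
```

The two numbers on the first printed line should agree, and the shear should report a scale factor of 1, in line with the assumption that shearing preserves volume.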

Proof of theorem: Case 1: Since shear maps preserve volume, the statement holds for all compositions of shear maps.

Case 2: Suppose \( \phi \) simply scales each basis vector, i.e., \( \phi(\hat e_i) = s_i \hat e_i \), so that \( d(\phi) = |s_1 \cdots s_n| \) by definition. Then there exists a pair of compositions of shears \( S_1, S_2 \) such that \( S_1 \, \phi = \phi \, S_2 \), and both \( (S_1 \, \phi) \vec v_i \) and \( S_2 \vec v_i \) are parallel to \( \hat e_i \), for all \( i. \) That such a pair exists is an argument that I'll omit. Then by definition \[ \begin{aligned}vol(\phi \vec v_1, \ldots, \phi \vec v_n) &= vol((S_1 \, \phi)\vec v_1, \ldots, (S_1 \, \phi)\vec v_n) \\ &= vol((\phi \, S_2) \vec v_1, \ldots, (\phi \, S_2)\vec v_n ) \\ &= vol(s_1 \, S_2 \vec v_1, \ldots, s_n \, S_2 \vec v_n) && \text{since } S_2\vec v_i \text{ is parallel to } \hat e_i \\ &= |s_1 \cdots s_n| \, vol(S_2 \vec v_1, \ldots, S_2 \vec v_n ) \\ &= d(\phi) \cdot vol(\vec v_1, \ldots , \vec v_n) \end{aligned} \]

Case 3: In general, let \( S_1, S_2 \) be shears such that \( S_1 \vec v_i \) and \( (S_2 \, \phi) \vec v_i \) are both parallel to \( \hat e_i \) for all \( i. \) Then \( \phi' = S_2 \, \phi \, S_1^{-1} \) scales each basis vector as in case 2 above. Therefore, \( \phi = S_2^{-1} \, \phi' \, S_1 \) and

\[ \begin{aligned}vol(\phi \vec v_1, \ldots, \phi \vec v_n) &= vol( (S_2^{-1} \, \phi' \, S_1 ) \vec v_1, \ldots, ( S_2^{-1} \, \phi' \, S_1)\vec v_n) \\ &= d(S_2^{-1}) d(\phi') d(S_1) \cdot vol( \vec v_1, \ldots, \vec v_n ) \\ &=d(\phi) \cdot vol(\vec v_1, \ldots, \vec v_n) \end{aligned} \] with each step justified since the theorem already holds for \( \phi', S_1, \) and \( S_2^{-1}.\) \(~\Box\)

The theorem implies that even though \( vol \) is volume relative to the chosen basis for \( V \), \( d(\phi) \in {\mathbb R} \), a ratio of relative volumes, will not depend on the basis. Thus \( d(\phi) \) is invariant under a change of basis (Cor. 3). In general, given vectors \( \vec v_1, \ldots, \vec v_n \), let \( \phi \) be the linear transformation that maps \( \hat e_i \mapsto \vec v_i \). As \( d(\phi) \) measures how much any parallelotope grows under the transformation, \( vol(\vec v_1, \ldots , \vec v_n) = d(\phi) \blacksquare. \) By the rational canonical form (RCF), there is a basis \( \mathcal B = \{ \hat b_1 , \ldots, \hat b_n \} \) for \( V \) such that \( \phi \) has the matrix form: \[ \phi = \begin{pmatrix}C_1 &0 &\cdots &0 \\ 0 &C_2 &\cdots &0\\ \vdots &\ddots &\ddots &\vdots\\ 0&\cdots&0&C_m \end{pmatrix} \] and for each \( 1 \le i \le m \), \( C_i \) is a companion matrix with the form:

\[ \begin{aligned} C_i &= \begin{pmatrix} 0 &0 &\cdots &-a_{0,i}\\ 1 &0 &\cdots &-a_{1,i}\\ \vdots &\ddots &\ddots &\vdots\\ 0 &\cdots &1 &-a_{k_i, i} \end{pmatrix} \\ &= S_i \begin{pmatrix} 0 &0 &\cdots &1\\ 1 &0 &\cdots &0 \\ \vdots &\ddots &\ddots &\vdots \\ 0 &\cdots &1 &0 \end{pmatrix} \begin{pmatrix} 1 &0 &\cdots &0 \\ 0 &1 &\cdots &0 \\ \vdots &\ddots &\ddots &\vdots\\ 0 &0 &\cdots &-a_{0,i} \end{pmatrix} \\ &= S_i \, T_i \, C'_i \end{aligned} \] for some (volume-preserving) shear matrix \( S_i \) and cycle of basis vectors \( T_i; \) the latter is a composition of basis swaps. Therefore, the image of the basis \( \{ \hat b_1, \ldots, \hat b_n \} \) under \( \phi \) has the same volume as the parallelotope \( C'_1 \times \ldots \times C'_m \), which has sides \[ \{ \hat b_1, \, \hat b_2, \ldots , \, (-a_{0,1}) \hat b_{k_1+1},\, \hat b_{k_1+2}, \, \ldots, \, (-a_{0,m-1}) \hat b_l, \, \hat b_{l+1},\, \ldots, \, (-a_{0,m}) \hat b_n \} \] The map \( \rho_b \) that simply scales some elements of the basis \( \{ \hat b_1, \ldots, \hat b_n \} \) is conjugate to a map \( \rho_e \) that simply scales some elements of the basis \( \{ \hat e_1, \ldots, \hat e_n \} \). The latter has \( d(\rho_e) = \prod |a_{0,i}|. \) By the corollaries, we get \[ d(\phi) = d(\rho_b) = d(\rho_e) = \prod_{i=1}^m |a_{0,i}| = | \det(\phi) | \] Note that I'm defining \( \det(\phi) \) as the alternating sum of (scaled) minors w.r.t. the basis \( \mathcal B. \)
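The last equality relies on each companion block having determinant \( \pm a_{0,i} \) and on the determinant of a block-diagonal matrix being the product of the block determinants. Here is a minimal sketch that checks both on a small example. The helper names, the coefficient lists, and the indexing convention (coefficients \( a_0, \ldots, a_{k-1} \) of a monic polynomial of degree \( k \)) are my own assumptions for illustration.

```python
def companion(coeffs):
    """Companion matrix of x^k + a_{k-1} x^{k-1} + ... + a_0, given [a_0, ..., a_{k-1}]."""
    k = len(coeffs)
    m = [[0] * k for _ in range(k)]
    for row in range(1, k):
        m[row][row - 1] = 1          # ones on the subdiagonal
    for row in range(k):
        m[row][k - 1] = -coeffs[row]  # last column holds the negated coefficients
    return m


def block_diag(blocks):
    """Block-diagonal matrix built from a list of square blocks."""
    n = sum(len(b) for b in blocks)
    m = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, entry in enumerate(row):
                m[offset + i][offset + j] = entry
        offset += len(b)
    return m


def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


if __name__ == "__main__":
    c_1 = companion([6, -5, 2])   # a_0 = 6
    c_2 = companion([3, 1])       # a_0 = 3
    print(det(c_1), det(c_2), det(block_diag([c_1, c_2])))
```

For these two blocks the absolute value of the full determinant should equal \( |6| \cdot |3| \), the product of the \( |a_0| \) entries, matching the formula for \( d(\phi) \).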

In summary, RCF implies that any linear transformation \( \phi \) is a composition of maps \( S \, \psi \, \rho \) where

  • \( \psi \) acts on some basis \( \mathcal B \subset V \) with \( m \) orbits.
  • for each orbit, \( \rho \) scales exactly one element by \( -a_{0,i} \)
  • \( S \) is a shearing relative to the fixed set of \( (\psi \, \rho \, \psi^{-1}). \)

\( \psi \) and \( S \) preserve volume while \( \rho \) scales volumes by \( d(\phi) = |\det(\phi)| \). Alternatively, there is a nice way of defining \( \det(\phi) \) independent of a choice of basis using the exterior product; this is almost identical to my definition of \( d(\phi) \) but more rigorous. The connection between the determinant and volumes is self-evident in this definition.

Finally, in the same sense that the determinant of a linear transformation measures how much volumes are scaled globally, the Jacobian determinant at a point measures how much volumes are scaled locally.