This is the multivariable version of the one-variable chain rule from Calculus of the Curves, where a curve is differentiated by following how the parameter moves the point.
If $\nabla f = (f_x, f_y)$ denotes the gradient of $f$ and $r(t) = (x(t), y(t))$ denotes the path, then the chain rule can be written compactly as a dot product:
$$\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t).$$
This notation is common in courses that use linear algebra, but it says exactly the same thing as the formula above.
2. Interactive 3D graph
Consider
$$z = 3x^2 + 4xy + 5y^2, \qquad x = \cos t, \qquad y = \sin t.$$
Substituting the path into the surface gives the composed function
$$f(x(t), y(t)) = 3\cos^2 t + 4\cos t \sin t + 5\sin^2 t.$$
The graph below shows the surface together with the red curve traced by the path $t \mapsto (\cos t, \sin t)$.
Surface z = 3x^2 + 4xy + 5y^2 with the red curve x = cos t, y = sin t
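The dot-product form of the chain rule can be checked numerically on this surface and path. The sketch below is only an illustration: the function names and the sample value `t0 = 0.7` are arbitrary choices, not part of the lesson. It compares the chain-rule value with a central finite difference of the composed function.

```python
import math

def f(x, y):
    # Surface from the example: z = 3x^2 + 4xy + 5y^2
    return 3 * x**2 + 4 * x * y + 5 * y**2

def grad_f(x, y):
    # Partial derivatives (f_x, f_y), computed by hand
    return (6 * x + 4 * y, 4 * x + 10 * y)

def path(t):
    # r(t) = (cos t, sin t)
    return (math.cos(t), math.sin(t))

def path_velocity(t):
    # r'(t) = (-sin t, cos t)
    return (-math.sin(t), math.cos(t))

def chain_rule_derivative(t):
    # d/dt f(r(t)) = grad f(r(t)) . r'(t)
    x, y = path(t)
    fx, fy = grad_f(x, y)
    dx, dy = path_velocity(t)
    return fx * dx + fy * dy

def finite_difference_derivative(t, h=1e-6):
    # Central difference of the composed function t -> f(r(t))
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)

t0 = 0.7  # arbitrary sample parameter value (an assumption for illustration)
print(chain_rule_derivative(t0))
print(finite_difference_derivative(t0))
```

The two printed values should agree to several decimal places; the small gap that remains comes from the finite-difference approximation, not from the chain rule.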
The important pattern is unchanged: each output derivative is a sum over all the paths that feed into it.
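The two formulas $F_s = f_x x_s + f_y y_s$ and $F_t = f_x x_t + f_y y_t$ can be written as a single matrix equation. If we arrange the partial derivatives of $f$ into a row vector on the left and the partial derivatives of $x$ and $y$ into a matrix on the right, we get the Jacobian form:
$$\begin{pmatrix} F_s & F_t \end{pmatrix} = \begin{pmatrix} f_x & f_y \end{pmatrix} \begin{pmatrix} x_s & x_t \\ y_s & y_t \end{pmatrix}.$$
Each entry of the product on the right reproduces one of the scalar chain rule formulas: the row vector times the first column gives $F_s$, and the row vector times the second column gives $F_t$.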
4. Practical example: change from Cartesian to polar coordinates
Set
$$x = r\cos\theta, \qquad y = r\sin\theta.$$
If f(x,y) is a function of two variables, then the chain rule gives
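$$\frac{\partial f}{\partial r} = f_x \cos\theta + f_y \sin\theta, \qquad \frac{\partial f}{\partial \theta} = -r f_x \sin\theta + r f_y \cos\theta.$$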
This is the most common worked example of a two-variable chain rule because the coordinate change appears everywhere later in multivariable calculus.
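The two formulas above are the two entries of a single matrix product, in which the gradient of $f$ multiplies the Jacobian of the polar change of variables:
$$\begin{pmatrix} \dfrac{\partial f}{\partial r} & \dfrac{\partial f}{\partial \theta} \end{pmatrix} = \begin{pmatrix} f_x & f_y \end{pmatrix} \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}.$$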
The 2×2 matrix on the right is the Jacobian matrix of the transformation (r,θ)↦(x,y).
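To see the row-vector-times-Jacobian product in action, here is a minimal sketch. The test function `f(x, y) = x**2 * y` and the sample point are assumptions chosen only for illustration; any differentiable function would do. For this choice the composed function is r^3 cos^2(theta) sin(theta), so its partial derivatives can be written down by hand and compared.

```python
import math

def grad_f(x, y):
    # Gradient (f_x, f_y) of the illustrative function f(x, y) = x^2 * y
    return (2 * x * y, x**2)

def polar_jacobian(r, theta):
    # Jacobian of (r, theta) -> (x, y) = (r cos(theta), r sin(theta))
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]

def row_times_matrix(row, matrix):
    # Multiply a row vector by a matrix, returning a row vector
    columns = range(len(matrix[0]))
    return tuple(
        sum(row[k] * matrix[k][j] for k in range(len(row))) for j in columns
    )

def polar_partials(r, theta):
    # (f_r, f_theta) = (f_x, f_y) times the polar Jacobian
    x, y = r * math.cos(theta), r * math.sin(theta)
    return row_times_matrix(grad_f(x, y), polar_jacobian(r, theta))

r0, theta0 = 2.0, 0.6  # arbitrary sample point (an assumption)
f_r, f_theta = polar_partials(r0, theta0)

# Hand-computed derivatives of r^3 cos^2(theta) sin(theta) for comparison
c, s = math.cos(theta0), math.sin(theta0)
print(f_r, 3 * r0**2 * c**2 * s)
print(f_theta, r0**3 * (c**3 - 2 * c * s**2))
```

Each printed pair should match up to floating-point rounding, which is the matrix form of the two scalar formulas for this particular function.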
5. Example: z = e^x sin y
Now substitute the polar coordinate change into
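$$z = e^{x}\sin y.$$
Then
$$z(r, \theta) = e^{r\cos\theta}\sin(r\sin\theta).$$
Differentiate with respect to $r$:
$$z_r = e^{r\cos\theta}\sin(r\sin\theta)\cos\theta + e^{r\cos\theta}\cos(r\sin\theta)\sin\theta.$$
By the angle-addition formula for sine, you can also write that as
$$z_r = e^{r\cos\theta}\sin(r\sin\theta + \theta).$$
Differentiate with respect to $\theta$:
$$z_\theta = -r e^{r\cos\theta}\sin\theta\,\sin(r\sin\theta) + r e^{r\cos\theta}\cos\theta\,\cos(r\sin\theta).$$
Equivalently, by the angle-addition formula for cosine,
$$z_\theta = r e^{r\cos\theta}\cos(r\sin\theta + \theta).$$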
This is a clean example of how the chain rule turns a simple expression in x and y into a more complicated but still manageable expression in r and θ.
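The compact forms of the two derivatives are easy to get wrong by a sign, so a quick numerical check is worthwhile. The sketch below compares them with central finite differences of z(r, θ); the helper names and the sample point are assumptions made for illustration.

```python
import math

def z(r, theta):
    # z(r, theta) = e^(r cos(theta)) * sin(r sin(theta))
    return math.exp(r * math.cos(theta)) * math.sin(r * math.sin(theta))

def z_r(r, theta):
    # Compact form: e^(r cos(theta)) * sin(r sin(theta) + theta)
    return math.exp(r * math.cos(theta)) * math.sin(r * math.sin(theta) + theta)

def z_theta(r, theta):
    # Compact form: r * e^(r cos(theta)) * cos(r sin(theta) + theta)
    return r * math.exp(r * math.cos(theta)) * math.cos(r * math.sin(theta) + theta)

def central_difference(func, value, h=1e-6):
    # Symmetric finite-difference approximation of func'(value)
    return (func(value + h) - func(value - h)) / (2 * h)

r0, theta0 = 1.3, 0.4  # arbitrary sample point (an assumption)

approx_z_r = central_difference(lambda r: z(r, theta0), r0)
approx_z_theta = central_difference(lambda th: z(r0, th), theta0)

print(z_r(r0, theta0), approx_z_r)
print(z_theta(r0, theta0), approx_z_theta)
```

Agreement at one sample point is not a proof, but a mismatch would immediately flag an algebra slip in the derivation.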
6. Most general version of the chain rule
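Suppose $u$ is a differentiable function of $n$ variables $x_1, x_2, \dots, x_n$, and each $x_j$ is a differentiable function of $m$ variables $t_1, t_2, \dots, t_m$. Then for each $i = 1, \dots, m$,
$$\frac{\partial u}{\partial t_i} = \sum_{j=1}^{n} \frac{\partial u}{\partial x_j}\,\frac{\partial x_j}{\partial t_i}.$$
In other words: to find the partial derivative of $u$ with respect to any one parameter $t_i$, multiply the partial derivative of $u$ with respect to each intermediate variable $x_j$ by the partial derivative of that $x_j$ with respect to $t_i$, then add all those products together.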
All earlier versions of the chain rule are special cases. For n=2 and m=1 (a single parameter t) this becomes the formula in Section 1. For n=2 and m=2 it becomes the formulas in Section 3.
The same formula can be expressed as a single matrix multiplication. Let $g: \mathbb{R}^m \to \mathbb{R}^n$ and $f: \mathbb{R}^n \to \mathbb{R}^k$ be differentiable, and let $D$ denote the matrix of all partial derivatives (the Jacobian). Then
$$D(f \circ g)(u) = Df(g(u))\,Dg(u).$$
For scalar-valued $f$ this is a row vector times a matrix; for vector-valued $f$ it is matrix multiplication. Written out component by component:
$$\frac{\partial (f_i \circ g)}{\partial u_j}(u) = \sum_{\ell=1}^{n} \frac{\partial f_i}{\partial x_\ell}(g(u))\,\frac{\partial g_\ell}{\partial u_j}(u).$$
This is identical to the scalar sum formula above with the indices renamed.
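The matrix identity can be tested numerically without any symbolic algebra. In the sketch below, the maps `g` (from R^2 to R^3) and `f` (from R^3 to R^2) are hypothetical examples chosen only to give non-square Jacobians, and all Jacobians are approximated by central differences rather than computed exactly.

```python
import math

def g(u):
    # Illustrative inner map g: R^2 -> R^3 (an assumption, not from the text)
    u1, u2 = u
    return [u1 * u2, math.sin(u1), u2**2]

def f(x):
    # Illustrative outer map f: R^3 -> R^2 (an assumption, not from the text)
    x1, x2, x3 = x
    return [x1 + x2 * x3, x1 * math.exp(x3)]

def numerical_jacobian(func, point, h=1e-6):
    # Approximate the Jacobian of func at point, one input variable at a time
    n_out = len(func(point))
    jac = [[0.0] * len(point) for _ in range(n_out)]
    for j in range(len(point)):
        forward, backward = list(point), list(point)
        forward[j] += h
        backward[j] -= h
        f_plus, f_minus = func(forward), func(backward)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return jac

def matmul(a, b):
    # Plain matrix product of two nested-list matrices
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

u0 = [0.5, 1.2]  # arbitrary sample point (an assumption)

lhs = numerical_jacobian(lambda u: f(g(u)), u0)  # D(f o g)(u0), a 2x2 matrix
rhs = matmul(numerical_jacobian(f, g(u0)), numerical_jacobian(g, u0))  # (2x3)(3x2)

max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The printed gap should be tiny compared with the matrix entries. The shapes also illustrate why the order of the factors matters: a 2×3 matrix times a 3×2 matrix gives the 2×2 Jacobian of the composition.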
7. Implicit differentiation of F(x,y)
Suppose an equation
F(x,y)=0
defines $y$ as a function of $x$ near a point where $F_y \neq 0$.
Write that local solution as y=y(x).
Now differentiate the identity
F(x,y(x))=0
with respect to x.
By the chain rule,
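$$\frac{d}{dx} F(x, y(x)) = F_x(x, y(x)) + F_y(x, y(x))\,\frac{dy}{dx}.$$
Since the left side is the derivative of the constant function 0, it must equal 0. Therefore
$$F_x + F_y\,\frac{dy}{dx} = 0,$$
and so, dividing by $F_y$ (which is nonzero by assumption),
$$\frac{dy}{dx} = -\frac{F_x}{F_y}.$$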
This formula is not separate from the chain rule. It is the chain rule applied to an implicit relation.
8. Example: x^2 + y^2 = 1
Let
$$F(x, y) = x^2 + y^2 - 1.$$
Then
$$F_x = 2x, \qquad F_y = 2y,$$
so the implicit derivative is
$$\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{x}{y}.$$
On the upper-right part of the circle, use the point
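$$\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right).$$
At that point the slope is
$$\frac{dy}{dx} = -1.$$
So the tangent line is
$$y - \frac{\sqrt{2}}{2} = -\left(x - \frac{\sqrt{2}}{2}\right),$$
or equivalently
$$y = -x + \sqrt{2}.$$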
Unit circle with the tangent line y = -x + sqrt(2) at t = pi/4
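The circle example can be reproduced in a few lines. This is a minimal sketch with hypothetical function names: it evaluates the implicit slope at the chosen point, recovers the intercept of the tangent line, and cross-checks the slope against the explicit upper semicircle y = sqrt(1 - x^2).

```python
import math

def F_x(x, y):
    # Partial derivative of F(x, y) = x^2 + y^2 - 1 with respect to x
    return 2 * x

def F_y(x, y):
    # Partial derivative of F(x, y) = x^2 + y^2 - 1 with respect to y
    return 2 * y

def implicit_slope(x, y):
    # dy/dx = -F_x / F_y, valid only where F_y is nonzero
    fy = F_y(x, y)
    if fy == 0:
        raise ValueError("F_y is zero here, so the formula does not apply")
    return -F_x(x, y) / fy

x0 = y0 = math.sqrt(2) / 2      # point on the circle at t = pi/4
slope = implicit_slope(x0, y0)  # expected to be -1
intercept = y0 - slope * x0     # tangent line: y = slope * x + intercept

def upper_semicircle(x):
    # Explicit local solution y(x) on the upper half of the circle
    return math.sqrt(1 - x**2)

h = 1e-6
explicit_slope = (upper_semicircle(x0 + h) - upper_semicircle(x0 - h)) / (2 * h)

print(slope, explicit_slope)
print(intercept, math.sqrt(2))
```

The guard on `F_y` mirrors the hypothesis of the formula: at the points (1, 0) and (-1, 0) the tangent is vertical and y is not locally a function of x.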