# proof of chain rule (several variables)

We first consider the case $m=1$, i.e. $G:I\to {\mathbb{R}}^{n}$, where $I\subset \mathbb{R}$ is a neighbourhood of a point ${x}_{0}\in \mathbb{R}$, and $F:U\subset {\mathbb{R}}^{n}\to \mathbb{R}$ is defined on a neighbourhood $U$ of ${y}_{0}=G({x}_{0})$ such that $G(I)\subset U$. We suppose that $G$ is differentiable at the point ${x}_{0}$ and that $F$ is differentiable at ${y}_{0}$. We want to compute the derivative of the composite function $H(x)=F(G(x))$ at $x={x}_{0}$.

By the definition of the derivative (using Landau notation) we have

$$F({y}_{0}+k)=F({y}_{0})+DF({y}_{0})k+o(|k|)\quad \text{as } k\to 0.$$
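The meaning of the $o(|k|)$ remainder can be illustrated numerically. The following is only a sketch: the function `F`, its hand-computed gradient `DF`, the base point `y0` and the direction are hypothetical choices made for illustration and are not part of the proof. It evaluates the ratio $|F({y}_{0}+k)-F({y}_{0})-DF({y}_{0})k|/|k|$ for shrinking $k$.

```python
import math

def F(y):
    # Hypothetical example function F: R^2 -> R (an assumption, not from the proof).
    y1, y2 = y
    return y1 * y1 * y2 + math.sin(y2)

def DF(y):
    # Gradient of F, computed by hand; it acts on k as a row vector.
    y1, y2 = y
    return (2 * y1 * y2, y1 * y1 + math.cos(y2))

def remainder_ratio(y0, direction, t):
    # |F(y0 + k) - F(y0) - DF(y0) k| / |k| with k = t * direction.
    k = tuple(t * d for d in direction)
    shifted = tuple(a + b for a, b in zip(y0, k))
    linear = sum(g * ki for g, ki in zip(DF(y0), k))
    norm_k = math.sqrt(sum(ki * ki for ki in k))
    return abs(F(shifted) - F(y0) - linear) / norm_k

y0 = (1.0, 2.0)
direction = (0.6, -0.8)  # unit vector, so |k| = t
ratios = [remainder_ratio(y0, direction, 10.0 ** (-p)) for p in range(1, 6)]
for p, r in zip(range(1, 6), ratios):
    print(f"|k| = 1e-{p}: ratio = {r:.3e}")
```

For this smooth example the ratio shrinks roughly in proportion to $|k|$, which is stronger than what differentiability alone guarantees: the definition only asks that the ratio tend to $0$.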

Choose any $h\ne 0$ such that ${x}_{0}+h\in I$ and set $k=G({x}_{0}+h)-G({x}_{0})$ to obtain

$$\begin{aligned}
\frac{H({x}_{0}+h)-H({x}_{0})}{h} &= \frac{F(G({x}_{0}+h))-F(G({x}_{0}))}{h}\\
&= \frac{F(G({x}_{0})+k)-F(G({x}_{0}))}{h}=\frac{F({y}_{0}+k)-F({y}_{0})}{h}\\
&= \frac{DF({y}_{0})(G({x}_{0}+h)-G({x}_{0}))+o(|G({x}_{0}+h)-G({x}_{0})|)}{h}\\
&= DF({y}_{0})\frac{G({x}_{0}+h)-G({x}_{0})}{h}+\frac{o(|G({x}_{0}+h)-G({x}_{0})|)}{h}.
\end{aligned}$$

Letting $h\to 0$, the first term of the sum converges to $DF({y}_{0}){G}^{\prime}({x}_{0})$, so it remains to prove that the second term converges to $0$. For those $h$ with $G({x}_{0}+h)=G({x}_{0})$ the second term is $0$, because the remainder vanishes at $k=0$. For all other $h$ we have

$$\left|\frac{o(|G({x}_{0}+h)-G({x}_{0})|)}{h}\right|=\left|\frac{o(|G({x}_{0}+h)-G({x}_{0})|)}{|G({x}_{0}+h)-G({x}_{0})|}\right|\cdot \left|\frac{G({x}_{0}+h)-G({x}_{0})}{h}\right|.$$

Since $G$ is continuous at ${x}_{0}$, we have $G({x}_{0}+h)-G({x}_{0})\to 0$ as $h\to 0$, so by the definition of $o(\cdot )$ the first fraction tends to $0$, while the second fraction tends to the norm $|{G}^{\prime}({x}_{0})|$. Thus the product tends to $0$, as needed.
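The case $m=1$ can be checked numerically on a concrete example. This is a minimal sketch under stated assumptions: the curve `G`, the function `F`, their hand-computed derivatives `dG` and `DF`, and the point `x0` are all hypothetical choices, not taken from the proof. It compares the difference quotient of $H$ with the value $DF({y}_{0}){G}^{\prime}({x}_{0})$.

```python
import math

def G(x):
    # Hypothetical curve G: R -> R^2 (an assumption for illustration).
    return (math.cos(x), x * x)

def dG(x):
    # G'(x), computed by hand.
    return (-math.sin(x), 2 * x)

def F(y):
    # Hypothetical scalar function F: R^2 -> R.
    return y[0] * y[1] + math.exp(y[0])

def DF(y):
    # Gradient of F, computed by hand.
    return (y[1] + math.exp(y[0]), y[0])

def H(x):
    # The composite function H(x) = F(G(x)).
    return F(G(x))

x0 = 0.7
# Value predicted by the chain rule: DF(G(x0)) applied to G'(x0).
chain = sum(a * b for a, b in zip(DF(G(x0)), dG(x0)))
# Difference quotients (H(x0 + h) - H(x0)) / h for shrinking h.
steps = (1e-2, 1e-4, 1e-6)
quotients = [(H(x0 + h) - H(x0)) / h for h in steps]
for h, q in zip(steps, quotients):
    print(f"h = {h:g}: quotient = {q:.8f}, chain rule = {chain:.8f}")
```

The quotients approach the chain rule value as $h$ shrinks. Very small $h$ would eventually be dominated by floating-point rounding, so the step sizes here stop at $10^{-6}$.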

Consider now the general case $G:V\subset {\mathbb{R}}^{m}\to U\subset {\mathbb{R}}^{n}$, with $G$ differentiable at ${x}_{0}\in V$ and $F$ differentiable at $G({x}_{0})$.
Given $v\in {\mathbb{R}}^{m}$ we are going to compute the directional derivative

$$\frac{\partial (F\circ G)}{\partial v}({x}_{0})=\frac{d(F\circ g)}{dt}(0)$$

where $g(t)=G({x}_{0}+tv)$ is a function of a single variable $t\in \mathbb{R}$. Thus we fall back to the previous case and we find that

$$\frac{\partial (F\circ G)}{\partial v}({x}_{0})=DF(G({x}_{0})){g}^{\prime}(0)=DF(G({x}_{0}))\frac{\partial G}{\partial v}({x}_{0}).$$

In particular when $v={e}_{k}$ is the $k$-th coordinate vector, we find

$${D}_{{x}_{k}}(F\circ G)({x}_{0})=DF(G({x}_{0})){D}_{{x}_{k}}G({x}_{0})=\sum _{i=1}^{n}{D}_{{y}_{i}}F(G({x}_{0})){D}_{{x}_{k}}{G}^{i}({x}_{0}),$$

which can be written compactly as

$$D(F\circ G)({x}_{0})=DF(G({x}_{0}))DG({x}_{0}).$$
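The coordinate formula can also be checked numerically. The sketch below is illustrative only: the maps `G` and `F`, their hand-computed derivatives `DG` and `DF`, and the point `x0` are hypothetical choices. It evaluates $\sum_{i}{D}_{{y}_{i}}F(G({x}_{0})){D}_{{x}_{k}}{G}^{i}({x}_{0})$ for each $k$ and compares the result with central differences of $F\circ G$ along the coordinate vectors.

```python
import math

def G(x):
    # Hypothetical map G: R^2 -> R^3 (an assumption for illustration).
    x1, x2 = x
    return (x1 * x2, math.sin(x1), x2 * x2)

def DG(x):
    # Jacobian of G (3 rows, 2 columns), computed by hand.
    x1, x2 = x
    return [[x2, x1],
            [math.cos(x1), 0.0],
            [0.0, 2 * x2]]

def F(y):
    # Hypothetical scalar function F: R^3 -> R.
    return y[0] * y[1] + y[2] ** 2

def DF(y):
    # Gradient of F, computed by hand.
    return [y[1], y[0], 2 * y[2]]

def chain_rule_gradient(x):
    # k-th entry: sum over i of D_{y_i}F(G(x)) * D_{x_k}G^i(x).
    grad_F = DF(G(x))
    jac_G = DG(x)
    return [sum(grad_F[i] * jac_G[i][k] for i in range(3)) for k in range(2)]

def numeric_gradient(x, h=1e-6):
    # Central differences of F(G(x)) along each coordinate vector e_k.
    grad = []
    for k in range(2):
        plus = list(x)
        minus = list(x)
        plus[k] += h
        minus[k] -= h
        grad.append((F(G(plus)) - F(G(minus))) / (2 * h))
    return grad

x0 = (0.5, -1.2)
exact = chain_rule_gradient(x0)
approx = numeric_gradient(x0)
for k in range(2):
    print(f"k = {k + 1}: chain rule = {exact[k]:.8f}, finite difference = {approx[k]:.8f}")
```

The two columns should agree up to the accuracy of the finite-difference approximation, which is consistent with the matrix identity above but is of course not a substitute for the proof.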

| Title | proof of chain rule (several variables) |
|---|---|
| Canonical name | ProofOfChainRuleseveralVariables |
| Date of creation | 2013-03-22 16:05:07 |
| Last modified on | 2013-03-22 16:05:07 |
| Owner | paolini (1187) |
| Last modified by | paolini (1187) |
| Numerical id | 6 |
| Author | paolini (1187) |
| Entry type | Proof |
| Classification | msc 26B12 |