# proof of chain rule (several variables)

We first consider the case $m=1$, i.e. $G\colon I\to\mathbb{R}^{n}$, where $I\subset\mathbb{R}$ is a neighbourhood of a point $x_{0}\in\mathbb{R}$, and $F\colon U\subset\mathbb{R}^{n}\to\mathbb{R}$ is defined on a neighbourhood $U$ of $y_{0}=G(x_{0})$ such that $G(I)\subset U$. We suppose that $G$ is differentiable at the point $x_{0}$ and that $F$ is differentiable at $y_{0}$. We want to compute the derivative of the compound function $H(x)=F(G(x))$ at $x=x_{0}$.

Since $F$ is differentiable at $y_{0}$, as $k\to 0$ in $\mathbb{R}^{n}$ we have

 $F(y_{0}+k)=F(y_{0})+DF(y_{0})k+o(|k|).$

Choose any $h\neq 0$ such that $x_{0}+h\in I$ and set $k=G(x_{0}+h)-G(x_{0})$ to obtain

 $\begin{aligned}\frac{H(x_{0}+h)-H(x_{0})}{h}&=\frac{F(G(x_{0}+h))-F(G(x_{0}))}{h}\\ &=\frac{F(G(x_{0})+k)-F(G(x_{0}))}{h}=\frac{F(y_{0}+k)-F(y_{0})}{h}\\ &=\frac{DF(y_{0})(G(x_{0}+h)-G(x_{0}))+o(|G(x_{0}+h)-G(x_{0})|)}{h}\\ &=DF(y_{0})\frac{G(x_{0}+h)-G(x_{0})}{h}+\frac{o(|G(x_{0}+h)-G(x_{0})|)}{h}.\end{aligned}$

Letting $h\to 0$, the first term of the sum converges to $DF(y_{0})G^{\prime}(x_{0})$, so it remains to prove that the second term converges to $0$. For those $h$ with $G(x_{0}+h)=G(x_{0})$ we have $k=0$, and the second term vanishes because the remainder $F(y_{0}+k)-F(y_{0})-DF(y_{0})k$ is zero. For all other $h$ we have

 $\left|\frac{o(|G(x_{0}+h)-G(x_{0})|)}{h}\right|=\left|\frac{o(|G(x_{0}+h)-G(x_{0})|)}{|G(x_{0}+h)-G(x_{0})|}\right|\cdot\left|\frac{G(x_{0}+h)-G(x_{0})}{h}\right|.$

Since $G$ is differentiable, hence continuous, at $x_{0}$, we have $G(x_{0}+h)-G(x_{0})\to 0$ as $h\to 0$. So by the definition of $o(\cdot)$ the first fraction tends to $0$, while the second fraction tends to the norm $|G^{\prime}(x_{0})|$. Thus the product tends to $0$, as needed.
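As a numerical sanity check of the case $m=1$, the following Python sketch compares the difference quotient of $H$ with the value $DF(y_{0})G^{\prime}(x_{0})$ obtained above. The particular functions `G` and `F`, the point `x0` and the step sizes are illustrative assumptions chosen for this example; they are not part of the proof.

```python
import math

# Hypothetical example data: G maps R to R^2 and F maps R^2 to R.
def G(x):
    return (x * x, math.sin(x))

def G_prime(x):
    # Derivative of G, computed by hand.
    return (2.0 * x, math.cos(x))

def F(y):
    y1, y2 = y
    return y1 * y2 + math.exp(y2)

def DF(y):
    # Gradient of F, computed by hand.
    y1, y2 = y
    return (y2, y1 + math.exp(y2))

def chain_rule_derivative(x0):
    # DF(y0) applied to G'(x0), with y0 = G(x0).
    grad = DF(G(x0))
    vel = G_prime(x0)
    return sum(a * b for a, b in zip(grad, vel))

def difference_quotient(x0, h):
    # (H(x0 + h) - H(x0)) / h with H = F o G.
    return (F(G(x0 + h)) - F(G(x0))) / h

x0 = 0.7
exact = chain_rule_derivative(x0)
steps = (1e-1, 1e-2, 1e-3, 1e-4)
errors = [abs(difference_quotient(x0, h) - exact) for h in steps]

for h, err in zip(steps, errors):
    print(f"h = {h:g}, |quotient - DF(y0) G'(x0)| = {err:.3e}")
```

For these smooth functions the printed error should shrink roughly in proportion to $h$, which is what the limit in the proof predicts.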

Consider now the general case $G\colon V\subset\mathbb{R}^{m}\to U\subset\mathbb{R}^{n}$. Given $v\in\mathbb{R}^{m}$ we are going to compute the directional derivative

 $\frac{\partial(F\circ G)}{\partial v}(x_{0})=\frac{d(F\circ g)}{dt}(0)$

where $g(t)=G(x_{0}+tv)$ is a function of a single variable $t\in\mathbb{R}$. Thus we fall back to the previous case and we find that

 $\frac{\partial(F\circ G)}{\partial v}(x_{0})=DF(G(x_{0}))g^{\prime}(0)=DF(G(x_{0}))\frac{\partial G}{\partial v}(x_{0}).$

In particular when $v=e_{k}$ is the $k$-th coordinate vector, we find

 $(F\circ g)^{\prime}(0)=D_{x_{k}}(F\circ G)(x_{0})=DF(G(x_{0}))D_{x_{k}}G(x_{0})=\sum_{i=1}^{n}D_{y_{i}}F(G(x_{0}))D_{x_{k}}G^{i}(x_{0})$

which can be compactly written as

 $D(F\circ G)(x_{0})=DF(G(x_{0}))DG(x_{0}).$
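
The compact formula can be checked numerically on a concrete example. The sketch below multiplies the row vector $DF(G(x_{0}))$ by the Jacobian matrix $DG(x_{0})$ and compares the result with central-difference approximations of the partial derivatives of $F\circ G$. The functions `G` and `F`, the point `x0` and the step `h` are hypothetical choices made for illustration only.

```python
import math

# Hypothetical example data: G maps R^2 to R^3 and F maps R^3 to R.
def G(x):
    x1, x2 = x
    return (x1 * x2, x1 + x2 * x2, math.sin(x1))

def DG(x):
    # Jacobian matrix of G (3 rows, 2 columns), computed by hand.
    x1, x2 = x
    return ((x2, x1),
            (1.0, 2.0 * x2),
            (math.cos(x1), 0.0))

def F(y):
    y1, y2, y3 = y
    return y1 * y1 + y2 * y3

def DF(y):
    # Gradient of F as a row vector, computed by hand.
    y1, y2, y3 = y
    return (2.0 * y1, y3, y2)

def chain_rule_gradient(x0):
    # k-th entry: sum over i of D_{y_i}F(G(x0)) * D_{x_k}G^i(x0).
    row = DF(G(x0))
    jac = DG(x0)
    return tuple(
        sum(row[i] * jac[i][k] for i in range(len(row)))
        for k in range(len(x0))
    )

def central_difference_gradient(x0, h=1e-6):
    # Approximate each partial derivative of F o G along e_k.
    grad = []
    for k in range(len(x0)):
        plus = list(x0)
        minus = list(x0)
        plus[k] += h
        minus[k] -= h
        grad.append((F(G(plus)) - F(G(minus))) / (2.0 * h))
    return tuple(grad)

x0 = (0.5, -1.2)
exact = chain_rule_gradient(x0)
approx = central_difference_gradient(x0)
max_error = max(abs(a - b) for a, b in zip(exact, approx))
print("chain rule:        ", exact)
print("finite differences:", approx)
print("max abs difference:", max_error)
```

The two gradients should agree up to the discretisation and rounding error of the finite differences.
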
Title proof of chain rule (several variables) ProofOfChainRuleseveralVariables 2013-03-22 16:05:07 2013-03-22 16:05:07 paolini (1187) paolini (1187) 6 paolini (1187) Proof msc 26B12