# Newton’s method

Let ${\mathbf{f}}$ be a differentiable function from $\mathbb{R}^{n}$ to $\mathbb{R}^{n}$. Newton’s method consists of starting at an $\mathbf{a_{0}}$ for the equation ${\mathbf{f}}(\mathbf{x})=\mathbf{0}$. Then the function is linearized at $\mathbf{a_{0}}$ by replacing the increment ${\mathbf{f}}(\mathbf{x})-{\mathbf{f}}(\mathbf{a_{0}})$ by a linear function of the increment $[\mathbf{D}(\mathbf{a_{0}})](\mathbf{x}-\mathbf{a_{0}})$.

Now we can solve the linear equation ${\mathbf{f}}(\mathbf{a_{0}})+[\mathbf{D}(\mathbf{a_{0}})](\mathbf{x}-\mathbf{a% _{0}})=\mathbf{0}$. Since this is a system of n linear equations in n unknowns, $[\mathbf{D}(\mathbf{a_{0}})](\mathbf{x}-\mathbf{a_{0}})=-{\mathbf{f}}(\mathbf{% a_{0}})$ can be likened to the general linear system $A\mathbf{x}=\mathbf{b}$.

Therefore, if $[\mathbf{D}(\mathbf{a_{0}})]$ is invertible, then $\mathbf{x}=\mathbf{a_{0}}-[\mathbf{D}(\mathbf{a_{0}})]^{-1}{\mathbf{f}}(% \mathbf{a_{0}})$. By renaming $\mathbf{x}$ to $\mathbf{a_{1}}$, you can reiterate Newton’s method to get an $\mathbf{a_{2}}$. Thus, Newton’s method states

 $\mathbf{a}_{n+1}=\mathbf{a}_{n}-[\mathbf{D}{\mathbf{f}}(\mathbf{a}_{n})]^{-1}{% \mathbf{f}}(\mathbf{a}_{n})$

Thus we get a series of $\mathbf{a}$’s that we hope will converge to $\mathbf{x}$ such that ${\mathbf{f}}(\mathbf{x})=\mathbf{0}$. When we solve an equation of the form ${\mathbf{f}}(\mathbf{x})=\mathbf{0}$, we call the solution a root of the equation. Thus, Newton’s method is used to find roots of nonlinear equations.

Unfortunately, Newton’s method does not always converge. There are tests for neighborhoods of $\mathbf{a_{0}}$’s where Newton’s method will converge however. One such test is Kantorovitch’s theorem, which combines what is needed into a concise mathematical equation.

Corollary 1: Newton’s Method in one dimension - The above equation is simplified in one dimension to the well-used

 $a_{1}=a_{0}-\frac{f(a_{0})}{f^{\prime}(a_{0})}$

This intuitively cute equation is pretty much the equation of first year calculus. :)

Corollary 2: Finding a square root - So now that you know the equation, you need to know how to use it, as it is an algorithm. The construction of the primary equation, of course is the important part. Let’s see how you do it if you want to find a square root of a number b.

We want to find a number x (x for unknown), such that $x^{2}=b$. You might think “why not find a number such that $x=\sqrt{b}$ ?” Well, the problem with that approach is that we don’t have a value for $\sqrt{b}$, so we’d be right back where we started. However, squaring both sides of the equation to get $x^{2}=b$ lets us work with the number we do know, $b$.) Back to $x^{2}=b$. With some manipulation, we see this means that $x^{2}-b=0$ ! Thus we have our $f(x)=0$ scenario.

We can see that $f^{\prime}(x)=2x$ thus, $f^{\prime}(a_{0})=2a_{0}$ and $f(a_{0})=a_{0}^{2}-b$. Now we have all we need to carry out Newton’s method. By renaming $x$ to be $a_{1}$, we have

 $a_{1}=a_{0}-\frac{1}{2a_{0}}(a_{0}^{2}-b)=\frac{1}{2}\left(a_{0}+\frac{b}{a_{0% }}\right)$

.

The equation on the far right is also known as the divide and average method, for those who have not learned the full Newton’s method, and just want a fast way to find square roots. Let’s see how this works out to find the square root of a number like 2:

Let $x^{2}=2$

 $x^{2}-2=0=f(x)$

Thus, by Newton’s method,…

 $a_{1}=a_{0}-\frac{a_{0}^{2}-2}{2a_{0}}$

All we did was plug in the expressions $f(a_{0})$ and $f^{\prime}(a_{0})$ where Newton’s method asks for them. Now we have to pick an $a_{0}$. Hmm, since

 $\sqrt{1}<\sqrt{2}<\sqrt{4}$
 $1<\sqrt{2}<2$

let’s pick a reasonable number between 1 and 2 like 1.5

 $a_{1}=1.5-\frac{1.5^{2}-2}{2(1.5)}$
 $a_{1}=1.41\bar{6}$

Looks like our guess was too high. Let’s see what the next iteration says

 $a_{2}=1.41\bar{6}-\frac{1.41\bar{6}^{2}-2}{2(1.41\bar{6})}$
 $a_{2}=1.414215686\dots$

getting better =) You can use your calculator to find that

 $\sqrt{2}=1.414213562\dots$

Not bad for only two iterations! Of course, the more you iterate, the more decimal places your $a_{n}$ will be accurate to. By the way, this is also how your calculator/computer finds square roots!

Geometric Interpretation: Consider an arbitrary function $f\colon\mathbb{R}\rightarrow\mathbb{R}$ such as $f(x)=x^{2}-b$. Say you wanted to find a root of this function. You know that in the neighborhood of $x=a_{0}$, there is a root (Maybe you used Kantorovitch’s theorem or tested and saw that the function changed signs in this neighborhood). We want to use our knowledge $a_{0}$ to find an $a_{1}$ that is a better approximation to $x_{0}$ (in this case, closer to it on the x-axis).

So we know that $x_{0}\leq a_{1}\leq a_{0}$ or in another case $a_{0}\leq a_{1}\leq x_{0}$. What is an efficient algorithm to bridge the gap between $a_{0}$ and $x_{0}$ ? Let’s look at a tangent line to the graph.

Note that the line intercepts the x-axis between $a_{0}$ and $x_{0}$, which is exactly what we want. The slope of this tangent line is $f^{\prime}(a_{0})$ by definition of the derivative at $a_{0}$, and we know one point on the line is $(a_{1},0)$, since that is the x-intercept. That is all we need to find the formula of the line, and solve for $a_{1}$.

 $\displaystyle y-y_{1}$ $\displaystyle=m(x-x_{1})$ $\displaystyle f(a_{0})-0$ $\displaystyle=f^{\prime}(a_{0})(a_{0}-a_{1})$ Substituting $\displaystyle\frac{f(a_{0})}{f^{\prime}(a_{0})}$ $\displaystyle=a_{0}-a_{1}$ $\displaystyle-a_{1}$ $\displaystyle=-a_{0}+\frac{f(a_{0})}{f^{\prime}(a_{0})}$ Aesthetic change. Flipped the equation around. $\displaystyle a_{1}$ $\displaystyle=a_{0}-\frac{f(a_{0})}{f^{\prime}(a_{0})}$ Newton’s method!
 Title Newton’s method Canonical name NewtonsMethod Date of creation 2013-03-22 11:58:05 Last modified on 2013-03-22 11:58:05 Owner alozano (2414) Last modified by alozano (2414) Numerical id 28 Author alozano (2414) Entry type Algorithm Classification msc 49M15 Synonym Newton-Raphson method Related topic LipschitzCondition Related topic KantorovitchsTheorem Related topic Derivative Related topic Superconvergence Related topic FixedPointsOfNormalFunctions