# Newton’s method works for convex real functions

###### Theorem 1.

Let $f\colon I\to\mathbb{R}$ be a convex differentiable function on an interval   $I\subseteq\mathbb{R}$, with at least one root. Then the following sequence $\{x_{n}\}$ obtained from Newton’s method,

 $x_{n+1}=x_{n}-\frac{f(x_{n})}{f^{\prime}(x_{n})}\,,$

will converge  to a root of $f$, provided that $f^{\prime}(x_{0})\neq 0$ and $x_{1}\in I$ for the given starting point $x_{0}\in I$.

###### Theorem 2.

Let $f\colon I\to\mathbb{R}$ be a differentiable  convex function, and $\{x_{n}\}$ be the sequence in Theorem 1. If it is convergent to a number $a\in I$, then $a$ is necessarily a root of $f$.

###### Proof.

We have

 $0=\lim_{n\to\infty}f^{\prime}(x_{n})\cdot(x_{n}-x_{n+1})=\lim_{n\to\infty}f(x_% {n})=f(a)\,.$

(The first limit is zero by local boundedness of $f^{\prime}$ at $a$, which follows from $f^{\prime}$ being finite and monotone   11 Actually, a differentiable convex function must necessarily have a continuous   derivative  , for the derivative is increasing, and it cannot have any jump discontinuities by Darboux’s Theorem (http://planetmath.org/DarbouxsTheorem)..) ∎

## Proof of Theorem 1

### Case A: when $I=\mathbb{R}$, $f^{\prime}>0$, and $f(x_{0})\geq 0$

We claim that whenever $f(x_{n})\geq 0$, then $f(x_{n+1})\geq 0$ too. Recall that the graph of a convex function $f$ always lies above any of its tangent lines  . So in particular, the point $(x_{n+1},f(x_{n+1}))$ is higher than the point $(x_{n+1},0)$, which is on the tangent line by definition. By induction  , we have $f(x_{n})\geq 0$ for all $n$.

We also have $x_{n+1}\leq x_{n}$ for all $n$. By hypothesis  the slope $f^{\prime}(x_{n})$ of the tangent line is positive, and $(x_{n},f(x_{n}))$ lies above the horizontal axis. Then the intersection  point of the tangent line and the horizontal axis, which is $(x_{n+1},0)$, must be to the left of $(x_{n},0)$.

So the sequence $\{x_{n}\}$ is decreasing. It converges to a real number or diverges to $-\infty$. If the limit is a real number, by Theorem 2 that number is a root of $f$, and we are done.

If the limit is $-\infty$, then we must have $f(x_{n})>0$ for all $x_{n}$. Hence $f(x)>0$ for all $x$, as $f$ is monotone and $x_{n}$ can be arbitrarily large negative. In other words, $f$ has no root to begin with.

### Case B: when $I=\mathbb{R}$, $f^{\prime}<0$, and $f(x_{0})\geq 0$

This situation is the (left-right) mirror of Case A, except that because the slope of the curve is negative, the sequence $\{x_{n}\}$ is increasing this time. We still have $x_{n}\geq 0$, so the sequence either converges to a root of $f$, or diverges to $+\infty$ (in which case $f$ has no root).

### Case C: when $I=\mathbb{R}$, $f^{\prime}(x_{0})>0$, and $f(x_{0})\geq 0$

To begin, note that the first part of Case A, asserting that $f(x_{n})\geq 0$ for all $n$, still goes through only assuming $f^{\prime}(x_{n})\neq 0$ — in which case the sequence would not be defined after the point $x_{n}$. We show that, in fact, $f^{\prime}(x_{n})>0$ for all $n$, so the rest of Case A goes through.

Suppose $n$ is the smallest integer for which $f^{\prime}(x_{n+1})\leq 0$. Also assume $f(x_{n}),f(x_{n+1})\neq 0$, otherwise we would be done anyway. The reasoning in Case A shows $x_{n+1}. Figure 3: Case C: a tangent line separates f from the horizontal axis

The function $f$ is strictly positive on $[x_{n+1},x_{n}]$, for the graph of $f$ lies above the tangent line segment extending from $(x_{n+1},0)$ to $(x_{n},f(x_{n}))$, which in turn lies strictly above the horizontal axis. Since $f$ is decreasing to the left of $[x_{n+1},x_{n}]$ and increasing to the right, $f$ never touches the horizontal axis anywhere. This contradicts our hypothesis that $f$ has roots.

### Case D: when $I=\mathbb{R}$, $f^{\prime}(x_{0})<0$, and $f(x_{0})\geq 0$

This situation is the (left-right) mirror of Case C. The same argument  shows that $f^{\prime}(x_{n})<0$ for all $n$ (or $f(x_{n})=0$ for some $n$), and the argument of Case B applies.

### Case E: for general intervals $I$

The only concern is that the iteration $\{x_{n}\}$ might go outside the interval $I$. We show that it does not.

Suppose $f(x_{0})\geq 0$. Then as we have proved, $f(x_{n})\geq 0$ for all $n$. Suppose $x_{n}\in I$ but $x_{n+1}\notin I$. Then $f$ has no root, because the graph of $f$ lies above the tangent line from $(x_{n},f(x_{n}))$ to $(x_{n+1},0)$ which lies above the horizontal axis.

The limit of $x_{n}$ could conceivably go just outside the interval $I$. But this case is similar   to the case $x_{n}\to\pm\infty$ already considered in Case A: $f$ possesses no root then.

Finally, suppose $f(x_{0})<0$, the case we have ignored all along. Our hypotheses assume that $f(x_{1})$ is defined. We immediately have $f(x_{1})\geq 0$, since the graph of $f$ lies above the tangent line from $(x_{0},f(x_{0}))$ to $(x_{1},0)$. But this just brings us back to the other cases.

Newton’s method for finding square roots of positive numbers $\alpha$ (which reduces to the ancient divide-and-average method) always converges, for the target function $f(x)=x^{2}-\alpha$ is convex.