# Newton’s method works for convex real functions

###### Theorem 1.

Let $f\colon I\to\mathbb{R}$ be a convex differentiable function
on an interval $I\subseteq\mathbb{R}$, with at least one root.
Then the sequence $\{x_n\}$
obtained from Newton's method,

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)},$$

converges to a root of $f$, provided that
$f'(x_0)\neq 0$ and $x_1\in I$ for the given starting point $x_0\in I$.
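In floating-point terms, the iteration of Theorem 1 can be sketched as follows (a minimal sketch; the function name, tolerance, and iteration cap are my own choices, not part of the theorem):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until |f(x_n)| <= tol."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            return x
        d = fprime(x)
        if d == 0:
            # Theorem 1 only excludes f'(x_0) = 0; if a later derivative
            # vanishes, the iteration is undefined past this point.
            raise ZeroDivisionError("f'(x_n) = 0: iteration undefined")
        x = x - fx / d
    return x
```

For a convex $f$ such as $f(x) = x^2 - 2$, the theorem guarantees this loop converges to a root for any admissible starting point.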

The proof will proceed in several steps. First, a simple converse result:

###### Theorem 2.

Let $f\colon I\to\mathbb{R}$ be a differentiable convex function,
and let $\{x_n\}$ be the sequence in Theorem 1.
If it converges to a number $a\in I$,
then $a$ is necessarily a root of $f$.

###### Proof.

We have

$$0 = \lim_{n\to\infty} f'(x_n)\,(x_n - x_{n+1}) = \lim_{n\to\infty} f(x_n) = f(a).$$

(The first limit is zero by local boundedness of $f'$ at $a$,
which follows from $f'$ being finite and monotone.
In fact, a differentiable convex function must have a continuous derivative:
the derivative is increasing, and by
Darboux's Theorem (http://planetmath.org/DarbouxsTheorem) it cannot have any jump discontinuities.)
∎

## Proof of Theorem 1

### Case A: when $I=\mathbb{R}$, ${f}^{\prime}>0$, and $f({x}_{0})\ge 0$

We claim that whenever $f(x_n)\ge 0$, then $f(x_{n+1})\ge 0$ too.
Recall that the graph of a convex function $f$ always lies
above any of its tangent lines.
In particular, the point $(x_{n+1}, f(x_{n+1}))$ lies above the point
$(x_{n+1}, 0)$, which is on the tangent line at $x_n$ by the definition of the iteration.
By induction, we have $f(x_n)\ge 0$ for all $n$.

We also have $x_{n+1}\le x_n$ for all $n$:
by hypothesis the slope $f'(x_n)$ of the tangent line is positive,
and $(x_n, f(x_n))$ lies on or above the horizontal axis.
Hence the intersection point of the tangent line with the horizontal axis,
which is $(x_{n+1}, 0)$, must lie to the left of $(x_n, 0)$, or coincide with it when $f(x_n)=0$.

So the sequence $\{{x}_{n}\}$ is decreasing. It converges to a real number or diverges to $-\mathrm{\infty}$. If the limit is a real number, by Theorem 2 that number is a root of $f$, and we are done.

If the limit is $-\infty$, then we must have $f(x_n)>0$ for all $n$ (if some $f(x_n)=0$, the sequence would be constant from then on and hence convergent). Since $f$ is increasing and the $x_n$ become arbitrarily large negative, $f(x)>0$ for all $x$. In other words, $f$ has no root to begin with.
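Case A can be checked numerically (an illustration; the particular function $f(x) = e^x - 2$, which is convex with $f' = e^x > 0$ everywhere and root $\ln 2$, is my own choice):

```python
import math

f = lambda x: math.exp(x) - 2.0   # convex, with f'(x) = e^x > 0 everywhere
fp = lambda x: math.exp(x)

xs = [2.0]                        # starting point with f(x0) = e^2 - 2 > 0
for _ in range(20):
    xs.append(xs[-1] - f(xs[-1]) / fp(xs[-1]))

# As Case A predicts: the iterates decrease (up to rounding) and stay
# on or above the root, so f(x_n) >= 0 throughout.
assert all(a >= b - 1e-9 for a, b in zip(xs, xs[1:]))
assert all(f(x) >= -1e-12 for x in xs)
```

The sequence settles at the root $\ln 2 \approx 0.6931$, approaching it from the right.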

### Case B: when $I=\mathbb{R}$, ${f}^{\prime}<0$, and $f({x}_{0})\ge 0$

This situation is the (left-right) mirror of Case A, except that because the slope of the curve is negative, the sequence $\{x_n\}$ is increasing this time. We still have $f(x_n)\ge 0$, so the sequence either converges to a root of $f$, or diverges to $+\infty$ (in which case $f$ has no root).

### Case C: when $I=\mathbb{R}$, ${f}^{\prime}({x}_{0})>0$, and $f({x}_{0})\ge 0$

To begin, note that the first part of Case A, asserting that $f(x_n)\ge 0$ for all $n$, goes through assuming only that $f'(x_n)\ne 0$ for each $n$; otherwise the sequence would not be defined past the point $x_n$. We show that, in fact, $f'(x_n)>0$ for all $n$, so the rest of Case A goes through as well.

Suppose, for contradiction, that $n$ is the smallest integer for which $f'(x_{n+1})\le 0$. Also assume $f(x_n), f(x_{n+1})\ne 0$, otherwise we would be done anyway. The reasoning in Case A shows $x_{n+1} < x_n$.

The function $f$ is strictly positive on $[x_{n+1}, x_n]$: we have $f(x_{n+1})>0$, and on $(x_{n+1}, x_n]$ the graph of $f$ lies above the tangent line segment from $(x_{n+1}, 0)$ to $(x_n, f(x_n))$, which has positive slope and so lies strictly above the horizontal axis there. Moreover, since $f'$ is increasing, $f'(x_{n+1})\le 0$ makes $f$ decreasing to the left of $x_{n+1}$, and $f'(x_n)>0$ makes $f$ increasing to the right of $x_n$. So $f$ never touches the horizontal axis anywhere. This contradicts our hypothesis that $f$ has a root.

### Case D: when $I=\mathbb{R}$, ${f}^{\prime}({x}_{0})<0$, and $f({x}_{0})\ge 0$

This situation is the (left-right) mirror
of Case C. The same argument shows that $f'(x_n)<0$
for all $n$ (or $f(x_n)=0$ for some $n$), and the
argument of Case B applies.

### Case E: for general intervals $I$

The only concern is that the iteration $\{{x}_{n}\}$ might go outside the interval $I$. We show that it does not.

Suppose $f(x_0)\ge 0$. Then, as we have proved, $f(x_n)\ge 0$ for all $n$. Suppose $x_n\in I$ but $x_{n+1}\notin I$. Then $f$ has no root: the graph of $f$ lies above its tangent line at $x_n$, which vanishes only at $x_{n+1}\notin I$ and is therefore strictly positive on all of $I$.

The *limit* of $\{x_n\}$ could conceivably lie just outside
the interval $I$, at an open endpoint.
But this case is similar to the case $x_n\to\pm\infty$
already considered in Case A:
$f$ then possesses no root.

Finally, suppose $f(x_0)<0$, the case we have ignored all along. Our hypotheses ensure that $x_1\in I$, so $f(x_1)$ is defined. We immediately have $f(x_1)\ge 0$, since the graph of $f$ lies above the tangent line from $(x_0, f(x_0))$ to $(x_1, 0)$. But this just brings us back to the other cases, with $x_1$ as the starting point.
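The one-step recovery from a negative starting value can be seen concretely (an illustration with a function of my choosing):

```python
f = lambda x: x * x - 2.0     # convex
fp = lambda x: 2.0 * x

x0 = 0.5                      # f(x0) = -1.75 < 0, f'(x0) = 1 > 0
x1 = x0 - f(x0) / fp(x0)      # one Newton step: x1 = 2.25

# Convexity puts the graph above the tangent line, so f(x1) >= 0
# after a single step, landing us back in the cases already handled.
assert f(x1) >= 0
```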

## Examples

Newton’s method for finding the square root of a positive number $\alpha$ (which reduces to the ancient divide-and-average method) always converges, for the target function $f(x)={x}^{2}-\alpha $ is convex.
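Indeed, for $f(x) = x^2 - \alpha$ the Newton step simplifies to $x_{n+1} = x_n - (x_n^2 - \alpha)/(2x_n) = \tfrac{1}{2}(x_n + \alpha/x_n)$, which is exactly divide-and-average. A minimal sketch (the function name and fixed iteration count are my own choices):

```python
def sqrt_newton(alpha, x0=None, iters=60):
    """Divide-and-average: Newton's method applied to f(x) = x^2 - alpha."""
    x = float(alpha) if x0 is None else float(x0)  # any positive start works
    for _ in range(iters):
        x = 0.5 * (x + alpha / x)   # x - (x^2 - alpha)/(2x), simplified
    return x
```

By Theorem 1 (with $I=\mathbb{R}$ restricted to the branch containing $x_0$), the iterates converge to $\sqrt{\alpha}$ for any positive starting point.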


| Title | Newton’s method works for convex real functions |
|---|---|
| Canonical name | NewtonsMethodWorksForConvexRealFunctions |
| Date of creation | 2013-03-22 15:41:28 |
| Last modified on | 2013-03-22 15:41:28 |
| Owner | stevecheng (10074) |
| Last modified by | stevecheng (10074) |
| Numerical id | 7 |
| Author | stevecheng (10074) |
| Entry type | Theorem |
| Classification | msc 49M15 |
| Classification | msc 65H05 |
| Classification | msc 26A06 |
| Synonym | Newton’s method works for concave real functions |