Newton’s method works for convex real functions


Theorem 1.

Let f:IR be a convex differentiable function on an intervalMathworldPlanetmathPlanetmath IR, with at least one root. Then the following sequence {xn} obtained from Newton’s method,

xn+1=xn-f(xn)f(xn),

will convergePlanetmathPlanetmath to a root of f, provided that f(x0)0 and x1I for the given starting point x0I.

Obviously, “convex” can be replaced by “concavePlanetmathPlanetmath” in Theorem 1.

The proof will proceed in several steps. First, a simple converseMathworldPlanetmath result:

Theorem 2.

Let f:IR be a differentiableMathworldPlanetmath convex function, and {xn} be the sequence in Theorem 1. If it is convergent to a number aI, then a is necessarily a root of f.

Proof.

We have

0=limnf(xn)(xn-xn+1)=limnf(xn)=f(a).

(The first limit is zero by local boundedness of f at a, which follows from f being finite and monotoneMathworldPlanetmathPlanetmath11 Actually, a differentiable convex function must necessarily have a continuousMathworldPlanetmathPlanetmath derivativePlanetmathPlanetmath, for the derivative is increasing, and it cannot have any jump discontinuities by Darboux’s Theorem (http://planetmath.org/DarbouxsTheorem)..) ∎

Proof of Theorem 1

Case A: when I=, f>0, and f(x0)0

We claim that whenever f(xn)0, then f(xn+1)0 too. Recall that the graph of a convex function f always lies above any of its tangent linesMathworldPlanetmath. So in particular, the point (xn+1,f(xn+1)) is higher than the point (xn+1,0), which is on the tangent line by definition. By inductionMathworldPlanetmath, we have f(xn)0 for all n.

We also have xn+1xn for all n. By hypothesisMathworldPlanetmath the slope f(xn) of the tangent line is positive, and (xn,f(xn)) lies above the horizontal axis. Then the intersectionMathworldPlanetmath point of the tangent line and the horizontal axis, which is (xn+1,0), must be to the left of (xn,0).

So the sequence {xn} is decreasing. It converges to a real number or diverges to -. If the limit is a real number, by Theorem 2 that number is a root of f, and we are done.

If the limit is -, then we must have f(xn)>0 for all xn. Hence f(x)>0 for all x, as f is monotone and xn can be arbitrarily large negative. In other words, f has no root to begin with.

Figure 1: Case A: {xn} is monotonically decreasing

Case B: when I=, f<0, and f(x0)0

This situation is the (left-right) mirror of Case A, except that because the slope of the curve is negative, the sequence {xn} is increasing this time. We still have xn0, so the sequence either converges to a root of f, or diverges to + (in which case f has no root).

Figure 2: Case B: {xn} is monotonically increasing

Case C: when I=, f(x0)>0, and f(x0)0

To begin, note that the first part of Case A, asserting that f(xn)0 for all n, still goes through only assuming f(xn)0 — in which case the sequence would not be defined after the point xn. We show that, in fact, f(xn)>0 for all n, so the rest of Case A goes through.

Suppose n is the smallest integer for which f(xn+1)0. Also assume f(xn),f(xn+1)0, otherwise we would be done anyway. The reasoning in Case A shows xn+1<xn.

Figure 3: Case C: a tangent line separates f from the horizontal axis

The function f is strictly positive on [xn+1,xn], for the graph of f lies above the tangent line segment extending from (xn+1,0) to (xn,f(xn)), which in turn lies strictly above the horizontal axis. Since f is decreasing to the left of [xn+1,xn] and increasing to the right, f never touches the horizontal axis anywhere. This contradicts our hypothesis that f has roots.

Case D: when I=, f(x0)<0, and f(x0)0

This situation is the (left-right) mirror of Case C. The same argumentMathworldPlanetmath shows that f(xn)<0 for all n (or f(xn)=0 for some n), and the argument of Case B applies.

Case E: for general intervals I

The only concern is that the iteration {xn} might go outside the interval I. We show that it does not.

Suppose f(x0)0. Then as we have proved, f(xn)0 for all n. Suppose xnI but xn+1I. Then f has no root, because the graph of f lies above the tangent line from (xn,f(xn)) to (xn+1,0) which lies above the horizontal axis.

The limit of xn could conceivably go just outside the interval I. But this case is similarMathworldPlanetmathPlanetmath to the case xn± already considered in Case A: f possesses no root then.

Finally, suppose f(x0)<0, the case we have ignored all along. Our hypotheses assume that f(x1) is defined. We immediately have f(x1)0, since the graph of f lies above the tangent line from (x0,f(x0)) to (x1,0). But this just brings us back to the other cases.

Figure 4: Case E: f(x1) is always 0

Examples

Newton’s method for finding square roots of positive numbers α (which reduces to the ancient divide-and-average method) always converges, for the target function f(x)=x2-α is convex.

References

  • 1 Michael Spivak. Calculus, third edition. Publish or Perish, 1994.
Title Newton’s method works for convex real functions
Canonical name NewtonsMethodWorksForConvexRealFunctions
Date of creation 2013-03-22 15:41:28
Last modified on 2013-03-22 15:41:28
Owner stevecheng (10074)
Last modified by stevecheng (10074)
Numerical id 7
Author stevecheng (10074)
Entry type Theorem
Classification msc 49M15
Classification msc 65H05
Classification msc 26A06
Synonym Newton’s method works for concave real functions