Newton's Method
Suppose we wish to solve the equation F(x,y) = (1,1) where F: R^2->R^2
is given by:
F(x,y)= (x^3 - 2*x*y^2 +2x, x^2*y -y^3 +2y)
Newton's method consists of taking an intial guess and then using
repeated linear approximation and evaluation to try and improve this guess.
> with(plots):
> with(linalg): Digits:=10;
A very natural way to enter an R^n valued function of several variables
is as a list of expressions.
> F := [x^3 - 2*x*y^2 + 2*x, x^2*y -y^3 +2*y];
# The derivative of F.
> DF := jacobian(F,[x,y]);
> q := [1,1];
A first step:
Suppose we take p[0] = (1,1) as an initial guess to the solution.
We are trying to solve F(p) = q.
> p[0] := [1,1];
Newton iteration is based on the linear solution suggested by
F(p[1]) - F(p[0]) approx DF(p[0])(p[1] - p[0]),
so
q - F(p[0]) is approx DF(p[0])(p[1] - p[0])
leading to the choice
p[1] = p[0] + (Df(p[0])^(-1) (q - F(p[0])).
The following helper function enables us to more easily use map to plug values into F and DF.
>
subs2 := proc(c,a,b)
subs(a,b,c)
end;
To be sure of staying numerical
map(evalf,map(subs2,F,x=p[0][1],y=p[0][2]))
would be useful.
> F_val := map(subs2,F,x=p[0][1],y=p[0][2]);
> DF_val := map(evalf,map(subs2,DF,x=p[0][1],y=p[0][2]));
This is one Newton step, producing the attempted improvement in
the solution
> p[1] := evalm(p[0]+ DF_val^(-1) &* (q- F_val));
Technically, p[1] is now a vector rather than a list.
This is relevant for graphing, since there lists may be required.
p_as_list[1] : = convert(p[1],list)
would fix this
> whattype(p[0]);
> whattype(p[1]);
This can all be done in a loop:
> Num_Steps := 10;
Repeating the earlier computations to have them in one place.
> F := [x^3 - 2*x*y^2 + 2*x, x^2*y -y^3 +2*y];
# The derivative of F.
> DF := jacobian(F,[x,y]);
> p[0] := [1,1];
> q := [1,1];
The workhorse loop:
(Changing od: to od; would print more output. There is also a simpler to
use print command for unformattet output.)
>
for i from 1 to Num_Steps do
F_val := map(subs2,F,x=p[i-1][1],y=p[i-1][2]);
printf(`Old function value is: %a \n`,F_val);
DF_val := map(evalf,map(subs2,DF,x=p[i-1][1],y=p[i-1][2]));
p[i] := evalm(p[i-1]+ DF_val^(-1) &* (q- F_val));
printf(`\tPoint %d is: %a \n`,i,convert(p[i],list));
printf(`\n`);
od:
Old function value is: [1, 2]
Point 1 is: [.5000000000, .6250000000]
Old function value is: [.7343750000, 1.162109375]
Point 2 is: [.5288380696, .4579199597]
Old function value is: [.9837912861, .9478847069]
Point 3 is: [.5461440145, .4844155067]
Old function value is: [.9988736552, .9996470584]
Point 4 is: [.5466340234, .4844742582]
Old function value is: [1.000000334, 1.000000143]
Point 5 is: [.5466338690, .4844742198]
Old function value is: [1.000000000, .9999999999]
Point 6 is: [.5466338690, .4844742199]
Old function value is: [1.000000000, .9999999998]
Point 7 is: [.5466338690, .4844742200]
Old function value is: [.9999999999, .9999999997]
Point 8 is: [.5466338691, .4844742202]
Old function value is: [.9999999997, 1.000000001]
Point 9 is: [.5466338690, .4844742196]
Old function value is: [1.000000000, .9999999990]
Point 10 is: [.5466338692, .4844742201]
We can look at the sequence of points we are producing:
> add_an_o := u -> [op(convert(u,list)),`o`];
> add_an_o(p[1]);;
> points := [seq(convert(p[i],list),i=1..10)]:
> points_with_labels:= map(add_an_o,points):
> textplot(points_with_labels);
>
> textplot(points_with_labels,view=[0.52 .. 0.55, 0.44 .. 0.49]);
> textplot(points_with_labels,view=[0.546 .. 0.547, 0.484 .. 0.485]);
>
We can also look at the distance from p[i] to our estimated root, say p[Num_Steps].
>
for i from 0 to Num_Steps -1 do
vec[i] := evalm(p[i]-p[Num_Steps]):
d[i] := sqrt(dotprod(vec[i],vec[i])):
od:
The vertical axis is the distance of p[i] from the expected root.
> plot([seq([i,d[i]],i=0..Num_Steps-1)],labels=["i","d"],thickness=2);
This is more interesting when we look at the (natural) logarithm of the distance.
> i := 'i';
undo previous assignments to i
> plot([seq([i,ln(d[i])],i=0..Num_Steps-1)],labels=["i","ln(d)"],thickness=2);
Can you see quadratic convergence in this picture ?
Things level off after i = 4 because the default accuracy is 10 digits.
It is fun to redo this with more accuracy, e.g. 50.
> Digits;
> Digits:= 50;
> Num_Steps := 10;
Repeating the earlier computations to have them in one place.
> F := [x^3 - 2*x*y^2 + 2*x, x^2*y -y^3 +2*y];
# The derivative of F.
> DF := jacobian(F,[x,y]);
> p[0] := [1,1];
> q := [1,1];
The workhorse loop:
(Changing od: to od; would print more output. There is also a simpler to
use print command for unformattet output.)
>
for i from 1 to Num_Steps do
F_val := map(subs2,F,x=p[i-1][1],y=p[i-1][2]);
printf(`Old function value is: %a \n`,F_val);
DF_val := map(evalf,map(subs2,DF,x=p[i-1][1],y=p[i-1][2]));
p[i] := evalm(p[i-1]+ DF_val^(-1) &* (q- F_val));
printf(`\tPoint %d is: %a \n`,i,convert(p[i],list));
printf(`\n`);
od:
Old function value is: [1, 2]
Point 1 is: [.50000000000000000000000000000000000000000000000000, .62500000000000000000000000000000000000000000000000]
Old function value is: [.73437500000000000000000000000000000000000000000000, 1.1621093750000000000000000000000000000000000000000]
Point 2 is: [.52883806961493189843618631242643349588027576929544, .45791995964351774003699344207163275601143433664033]
Old function value is: [.98379128671713082338257206299592831572466752982600, .94788470675260389701710797809733824868694639608340]
Point 3 is: [.54614401432875539984496762416796908421192803883056, .48441550679366712803660859351855396850370017074133]
Old function value is: [.99887365474202153820453970199560700611585405581127, .99964705817478857214725329627826969283662232765578]
Point 4 is: [.54663402347646002694182482875519917880435516684255, .48447425839478414761499505515823763394694548328260]
Old function value is: [1.0000003339631815314546048382792146619822694492716, 1.0000001427558508967352530325442049928398579235871]
Point 5 is: [.54663386916963587282470124003511834664415723898141, .48447422012609049019554916513377337472956049039027]
Old function value is: [1.0000000000000260024434559781611233132498940645060, 1.0000000000000158629722653710968376441094364905172]
Point 6 is: [.54663386916962272344392053596729498871271491654691, .48447422012608491016263556919540380610739113698221]
Old function value is: [1.0000000000000000000000000001073171544014677712035, 1.0000000000000000000000000001187311229259298598796]
Point 7 is: [.54663386916962272344392053590029259729095452309377, .48447422012608491016263556914320303040414417997173]
Old function value is: [.99999999999999999999999999999999999999999999999994, 1.0000000000000000000000000000000000000000000000000]
Point 8 is: [.54663386916962272344392053590029259729095452309379, .48447422012608491016263556914320303040414417997172]
Old function value is: [1.0000000000000000000000000000000000000000000000001, .99999999999999999999999999999999999999999999999993]
Point 9 is: [.54663386916962272344392053590029259729095452309377, .48447422012608491016263556914320303040414417997177]
Old function value is: [.99999999999999999999999999999999999999999999999990, 1.0000000000000000000000000000000000000000000000000]
Point 10 is: [.54663386916962272344392053590029259729095452309381, .48447422012608491016263556914320303040414417997176]
>
We can also look at the distance from p[i] to our estimated root, say p[Num_Steps].
>
for i from 0 to Num_Steps -1 do
vec[i] := evalm(p[i]-p[Num_Steps]):
d[i] := sqrt(dotprod(vec[i],vec[i])):
od:
>
plot([seq([i,ln(d[i])],i=0..Num_Steps-1)],labels=["i","ln(d)"],thickness=2);
>
>
>
>
>