
After notes on Chebyshev’s iterative method



Introduction

One of the most classical problems in Numerical Analysis is the approximation of the zeros of a given function f, that is, finding the values x* for which f(x*) = 0.

To solve these equations we usually employ iterative methods. The most widely used scheme is the second-order Newton's method,

$$x_{n+1}=x_{n}-\frac{f(x_{n})}{f'(x_{n})}.$$
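For reference, a minimal implementation of this scheme (a sketch; the test equation, tolerance and iteration cap are our own choices) reads:

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:        # stop when the correction is negligible
            break
    return x

# Example: approximate the real cube root of 2, a zero of f(x) = x^3 - 2.
root = newton(lambda x: x**3 - 2, lambda x: 3 * x**2, x0=1.0)
```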

This paper is devoted to the analysis of Chebyshev's method, a third-order extension of Newton's method. We present the geometric interpretation of the method and its global convergence. We introduce its extension to Banach spaces and review some applications where this high-order method is a good alternative to Newton's method.

Chebyshev is one of the most famous mathematicians of the nineteenth century, creator of several mathematical schools in Russia: number theory, probability theory, function approximation theory, theory of mechanisms and machines, etc. He received his primary education at home: his mother taught him to read and write, while his cousin taught him arithmetic and French, which would later be very useful in his relations with Europe. He also completed his secondary education at home, with Prof. Pogorelsky, known in his day as the best teacher of elementary mathematics in Moscow, as his tutor in mathematics. Prof. Brashman practically directed Chebyshev's university studies, which ended in 1841. The department of physics and mathematics in which Chebyshev studied announced a prize competition in the 1840–41 academic year. Chebyshev submitted a paper on the calculation of roots of equations using the series expansion of the inverse function. The work, unpublished at the time, was awarded the silver medal. Chebyshev worked as a professor at the University of St. Petersburg for 35 years. He is recognized as the founder of the mathematical school of St. Petersburg, whose echo and influence have reached our time in many branches of mathematics. This school was distinguished by its tendency to relate theoretical problems of mathematics to problems in the arts and in nature.

Geometry, dynamics and convergence of Chebyshev’s method in the scalar case

The geometric interpretation of Newton's method is well known: given an iterate xn, the next iterate is the zero of the tangent line

$$y(x)-f(x_n)=f'(x_n)(x-x_n),$$

to the graph of f at (xn, f(xn)).

The following well-known theorem, giving sufficient conditions for the global convergence of Newton's method, follows easily from its geometric interpretation.

Theorem 1

Let f″ be continuous on an interval J containing a root x* of f; let f′ ≠ 0 and f″ ≥ 0 or f″ ≤ 0 on J. Then Newton's method converges monotonically to x* from any point x0 ∈ J such that f(x0) f″(x0) ≥ 0.

Chebyshev's method is obtained by quadratic interpolation of the inverse function of f, in order to approximate f−1(0) [22]. But it also admits a geometric derivation, from a parabola of the form

$$a\,y(x)^2+y(x)+bx+c=0,\tag{2}$$

which, after imposing the super-tangency conditions y(xn) = f(xn), y′(xn) = f′(xn) and y″(xn) = f″(xn), can be written as

$$-\frac{f''(x_n)}{2f'(x_n)^2}\,(y(x)-f(x_n))^2+y(x)-f(x_n)-f'(x_n)(x-x_n)=0.$$

By calculating the intersection of this parabola with the OX-axis we obtain the next step of Chebyshev’s method:

$$x_{n+1}=x_n-\left(1+\frac{1}{2}L_f(x_n)\right)\frac{f(x_n)}{f'(x_n)},$$

where $L_f(x_n)=\frac{f(x_n)\,f''(x_n)}{(f'(x_n))^2}$.
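A scalar implementation only needs one more derivative than Newton's method. A minimal sketch, with the same illustrative test equation as above:

```python
def chebyshev(f, df, d2f, x0, tol=1e-12, max_iter=50):
    """Chebyshev's method: x_{n+1} = x_n - (1 + L_f(x_n)/2) f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        L = fx * d2f(x) / dfx**2             # L_f(x_n) = f f'' / f'^2
        step = (1 + 0.5 * L) * fx / dfx
        x -= step
        if abs(step) < tol:
            break
    return x

# Same test equation as before: x^3 - 2 = 0.
root = chebyshev(lambda x: x**3 - 2, lambda x: 3 * x**2, lambda x: 6 * x, x0=1.0)
```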

We refer to [1] for the geometric interpretation of other third order methods.

By using the geometric interpretation of Chebyshev’s method, we can obtain the following global convergence theorem.

Theorem 2

Let f‴ be continuous on an interval J containing a root x* of f; let f′ ≠ 0, Lf(x) > −2 and $\left(\left(\frac{\eta}{f'(x)}\right)^2\right)''\geq 0$ in J, with η = sgn(f′). Then Chebyshev's method converges monotonically to x* from any point of the interval.

Proof. We suppose f′ > 0 (for f′ < 0 the proof is similar).

First, we begin from a point on the left of x*, $\overline{x}\leq x^*$.

We would like to show that the intersection $\widehat{x}$ of the parabola y(x) given in (2) with the OX-axis lies in $[\overline{x},x^*]$. By hypothesis $L_f(\overline{x})>-2$; in particular, $\overline{x}\leq\widehat{x}$.

Thus, it will be enough to prove that, for $x\geq\overline{x}$,

$$y(x)=\frac{-1+\sqrt{1-4a(bx+c)}}{2a}\geq f(x).\tag{3}$$

In this case, we obtain a monotonically increasing sequence, bounded above by x*, which therefore converges to a limit γ ≤ x*. Then, by the construction of the method and the continuity of f, we obtain γ = x*.

Inequality (3) is equivalent to

$$\frac{-1+\sqrt{1-4a(bx+c)}}{2a}-\frac{-1+\sqrt{1-4a(b\overline{x}+c)}}{2a}\geq f(x)-f(\overline{x}),$$

or

$$\int_{\overline{x}}^{x}\frac{-b}{\sqrt{1-4a(bt+c)}}\,dt\geq\int_{\overline{x}}^{x}f'(t)\,dt.\tag{4}$$

As f′ > 0, the hypothesis gives $\left(\left(\frac{1}{f'}\right)^2\right)''\geq 0$ in J, i.e., $\left(\frac{1}{f'}\right)^2$ is convex, and therefore

$$\left(\frac{1}{f'(x)}\right)^2\geq\frac{1-4a(bx+c)}{(-b)^2},$$

because $\frac{1-4a(bx+c)}{(-b)^2}$ approximates $\left(\frac{1}{f'(x)}\right)^2$ up to second order at $\overline{x}$.

Thus,

$$\frac{-b}{\sqrt{1-4a(bx+c)}}\geq f'(x)>0,$$

and consequently the relation (4) holds.

Finally, if we begin from a point on the right of the root, we obtain $\frac{-1+\sqrt{1-4a(bx+c)}}{2a}\leq f(x)$ and the convergence is monotonic from the right.

We refer to [3] for the global convergence of other third order schemes and some comparisons.

In general, the method does not converge globally. In such cases we can only ask for local or semilocal convergence, looking for regions of convergence around the solutions.

Consider for instance the problem of finding the zeros of a polynomial, p(z) = 0. Let $R(z)=\frac{P(z)}{Q(z)}$, where P(z) and Q(z) are complex polynomials with no common factors, be a rational map on the Riemann sphere. We say that z0 is a fixed point of R(z) if R(z0) = z0. For $z\in\overline{\mathbb{C}}$ we define its orbit as the set orb(z) = {z, R(z), R2(z), ..., Rk(z), ...}, where Rk means the k-fold iterate of R. A periodic point of period n is a point z0 such that Rn(z0) = z0 and Rj(z0) ≠ z0 for 0 < j < n. Observe that if $z_0\in\overline{\mathbb{C}}$ is a periodic point of period n ≥ 1, then z0 is a fixed point of Rn. Also, recall that a fixed point z0 is respectively attracting, repelling or indifferent when |R′(z0)| is less than, greater than or equal to 1. A periodic point of period n is said to be attracting, repelling or indifferent if, as a fixed point of Rn(z), it is respectively attracting, repelling or indifferent. A superattracting fixed point of R(z) is a fixed point which is also a zero of the derivative R′(z). A periodic point of period n is said to be a superattracting periodic point of R(z) if, as a fixed point of Rn(z), it is superattracting.

Let ζ be an attracting fixed point of R(z). The basin of attraction of ζ is the set $B(\zeta)=\{z\in\overline{\mathbb{C}}:R^n(z)\to\zeta\ \text{as}\ n\to\infty\}$. The immediate basin of attraction of an attracting fixed point ζ of R(z), denoted by B*(ζ), is the connected component of B(ζ) containing ζ. Finally, if z0 is an attracting periodic point of period n of R(z), the basin of attraction of the orbit orb(z0) is the set $B(\mathrm{orb}(z_0))=\cup_{j=0}^{n-1}R^j(B(z_0))$, where B(z0) is the attraction basin of z0 as a fixed point of Rn. The Julia set of a rational map R(z), denoted by $\mathcal{J}(R)$, is the closure of the set of repelling periodic points. Its complement is the Fatou set $\mathcal{F}(R)$. If R(z) has an attracting fixed point z0, then the basin of attraction B(z0) is contained in the Fatou set and $\mathcal{J}(R)=\partial B(z_0)$. Therefore, the chaotic dynamics of R(z) is contained in its Julia set.

The iterative rational function for Chebyshev’s method is given by

$$Ch_p(z)=z-\left(1+\frac{1}{2}L_p(z)\right)\frac{p(z)}{p'(z)}.$$

We recall the definition of conjugacy.

Definition 1

Let $R_1,R_2:\overline{\mathbb{C}}\to\overline{\mathbb{C}}$ be two rational maps. We say that R1 and R2 are conjugated if there exists a Möbius transformation $\varphi:\overline{\mathbb{C}}\to\overline{\mathbb{C}}$ such that $R_2\circ\varphi(z)=\varphi\circ R_1(z)$ for all z.

An important feature of conjugation of rational maps is given by the following classical result.

Theorem 3

Let R1 and R2 be two rational maps and let φ be a Möbius transformation conjugating R1 and R2, that is, $R_2=\varphi\circ R_1\circ\varphi^{-1}$. Then $\mathcal{F}(R_2)=\varphi(\mathcal{F}(R_1))$ and $\mathcal{J}(R_2)=\varphi(\mathcal{J}(R_1))$.

Conjugacy plays a central role in understanding the behavior of classes of maps from the point of view of dynamical systems in the following sense. Suppose that one wishes to describe the quantitative as well as the qualitative behavior of the map z → Φf(z), where Φf(z) is some iteration function. Since conjugacy preserves fixed and periodic points and their type, as well as attraction basins, the dynamical data concerning f are carried by the fixed points of Φf(z) and by the nature of such fixed points, which may be (super)attracting, repelling or indifferent. Therefore, it is worthwhile to build up, for polynomials of degree two and three, a parametrized family consisting of polynomials as simple as possible such that a conjugacy exists between the corresponding iteration functions.

The result that follows, due to A. Cayley [8,21], has great historical importance. In an attempt to understand the dynamics of Newton’s method in the complex plane, Cayley investigated the dynamics of Newton’s method applied to polynomials of a particularly simple form. He realized that major difficulties would arise when attempting to extend the following result for quadratics to cubics and beyond. It is believed that this circumstance motivated further work of P. Fatou and G. Julia along these lines.

Theorem 4

Let $N_p(z)=\frac{z^2-ab}{2z-(b+a)}$ be the rational map obtained from Newton's method applied to a generic quadratic polynomial p(z) = (z − a)(z − b). Then Np is conjugated to the map z → z2 by the Möbius transformation $M(z)=\frac{z-a}{z-b}$, and $\mathcal{J}(N_p)$ is the straight line in the complex plane corresponding to the locus of points equidistant from a and b.

For Chebyshev's method, also known as the super-Newton method, the following result holds.

Theorem 5

Let
$$Ch_p(z)=\frac{3z^4-2(a+b)z^3-6abz^2+6ab(a+b)z-ab(a^2+3ab+b^2)}{(2z-a-b)^3}$$

be the rational map obtained from Chebyshev's method applied to a generic quadratic polynomial p(z) = (z − a)(z − b). Then Chp(z) is conjugated to the map $S_3(z)=\frac{z^4+2z^3}{2z+1}$ via the Möbius transformation $M(z)=\frac{z-a}{z-b}$.
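This conjugacy can be checked symbolically. The following sketch (using sympy; the roots a = 0 and b = 1 are an illustrative choice of ours) verifies the identity S3 ∘ M = M ∘ Chp of Definition 1:

```python
import sympy as sp

z = sp.symbols('z')
a, b = 0, 1                                  # illustrative roots of p(z) = (z-a)(z-b)
Ch = (3*z**4 - 2*(a + b)*z**3 - 6*a*b*z**2
      + 6*a*b*(a + b)*z - a*b*(a**2 + 3*a*b + b**2)) / (2*z - a - b)**3
M = (z - a) / (z - b)                        # Moebius transformation
S3 = (z**4 + 2*z**3) / (2*z + 1)

lhs = S3.subs(z, M)                          # S3(M(z))
rhs = M.subs(z, Ch)                          # M(Ch_p(z))
assert sp.cancel(lhs - rhs) == 0             # the two rational maps coincide
```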

For cubic polynomials we have the following result.

Theorem 6

Let p(z) = (z − z0)(z − z1)(z − z2) be a generic cubic polynomial with roots ordered as follows: 0 ≤ |z0| ≤ |z1| ≤ |z2|. Let T(z) = (z2 − z0)z + z0. Then

p(z) reduces to a polynomial belonging to the parametrized family $q_{\lambda,\rho}(z)=p\circ T(z)=\lambda^3 z(z-1)(z-\rho)$, where λ = z2 − z0 and $\rho=\frac{z_1-z_0}{z_2-z_0}$,

T is a conjugacy between Chp and $Ch_{q_{\lambda,\rho}}$, that is, $T^{-1}\circ Ch_{q_{\lambda,\rho}}\circ T=Ch_p$, and

if $\tilde{q}_{\rho}(z)=z(z-1)(z-\rho)$, then $Ch_{q_{\lambda,\rho}}$ is conjugated to $Ch_{\tilde{q}_{\rho}}$. Consequently, Chp is conjugated to $Ch_{\tilde{q}_{\rho}}$.

It is possible to see that the one-parameter family $\tilde{q}_{\rho}(z)=z(z-\rho)(z-1)$ reduces to the well-known one-parameter family pA(z) = z3 + (A − 1)z − A (see [10]).

In the two pictures that follow we show the attraction basins of Newton's and Chebyshev's iterations applied to z3 − 1 = 0. We can see a richer dynamics in Chebyshev's method.

Fig. 1

Basins of attraction for the polynomial p(z) = z3 − 1. Newton’s method.

Fig. 2

Basins of attraction for the polynomial p(z) = z3 − 1. Chebyshev’s method.
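Pictures of this kind can be generated by iterating each map on a grid of initial points and coloring every point according to the root its orbit approaches. A minimal sketch (grid extent, resolution and iteration budget are arbitrary choices of ours):

```python
import numpy as np

def basins(step, roots, extent=1.5, n=400, iters=40):
    """Label each grid point by the root its orbit converges to (0 = no root)."""
    t = np.linspace(-extent, extent, n)
    z = t[None, :] + 1j * t[:, None]         # grid of starting points
    with np.errstate(all='ignore'):          # tolerate divisions by ~0
        for _ in range(iters):
            z = step(z)
    labels = np.zeros(z.shape, dtype=int)
    for k, r in enumerate(roots, start=1):
        labels[np.abs(z - r) < 1e-6] = k
    return labels

p, dp, d2p = (lambda z: z**3 - 1), (lambda z: 3 * z**2), (lambda z: 6 * z)
newton_step = lambda z: z - p(z) / dp(z)
cheb_step = lambda z: z - (1 + 0.5 * p(z) * d2p(z) / dp(z)**2) * p(z) / dp(z)

roots = [np.exp(2j * np.pi * k / 3) for k in range(3)]
img_newton = basins(newton_step, roots)      # three sectors, fractal boundaries
img_cheb = basins(cheb_step, roots)          # richer, more intricate structure
```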

In [4], we study the dynamics of a family of third-order iterative algorithms which includes Chebyshev's iteration function, Halley's iterative method, the super-Halley iterative method and the c-iterative methods. Using results on conjugation of rational maps and the definition of the universal Julia set, we find the conjugacy classes of these iterative methods explicitly.

Extension to Banach spaces and some applications

To approximate a solution of the nonlinear equation

$$F(x)=0,$$

F : XY, X, Y Banach spaces, Chebyshev’s method can be written as

$$x_{n+1}=x_{n}-\left(I+\frac{1}{2}L_{F}(x_{n})\right)F'(x_{n})^{-1}F(x_{n}),$$

where

$$L_F(x_n)=F'(x_n)^{-1}F''(x_n)F'(x_n)^{-1}F(x_n).$$

Next, we review some applications where Chebyshev’s method can be considered a good alternative to Newton’s method.

Quadratic equations

In this case, F″(x) is a constant bilinear operator that we denote by B.

Using Taylor expansions, for the Newton step $y_n=x_n-F'(x_n)^{-1}F(x_n)$ we have

$$F(y_n)=F(x_n)+F'(x_n)(y_n-x_n)+\frac{1}{2}F''(x_n)(y_n-x_n)^2=\frac{1}{2}B(y_n-x_n)^2.$$

Thus, $F'(x_n)^{-1}F(y_n)=-\frac{1}{2}L_F(x_n)(y_n-x_n)$, and the method becomes

$$\begin{aligned} y_n&=x_n-F'(x_n)^{-1}F(x_n),\\ x_{n+1}&=y_n-F'(x_n)^{-1}F(y_n), \end{aligned}$$

equivalently,

$$\begin{aligned} F'(x_n)(y_n-x_n)&=-F(x_n),\\ F'(x_n)(x_{n+1}-y_n)&=-F(y_n). \end{aligned}$$

Notice that we only need one LU decomposition and three evaluations (F(xn), F(yn) and F′(xn)) per iteration. In particular, for this problem Chebyshev's method is more efficient than Newton's method.
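In floating-point practice this structure translates into one LU factorization of F′(xn) reused for two solves per iteration. A sketch with numpy/scipy (the helper name is ours):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def chebyshev_quadratic(F, dF, x0, tol=1e-12, max_iter=25):
    """One LU factorization and two triangular solves per iteration."""
    x = x0
    for _ in range(max_iter):
        lu = lu_factor(dF(x))                # single LU of F'(x_n)
        y = x - lu_solve(lu, F(x))           # y_n     = x_n - F'(x_n)^{-1} F(x_n)
        x_new = y - lu_solve(lu, F(y))       # x_{n+1} = y_n - F'(x_n)^{-1} F(y_n)
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x
```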

Moreover, we can obtain a very simple semilocal convergence result:

Theorem 7

Given x0 such that F′(x0)−1 exists, if the condition

$$\|F'(x_0)^{-1}B\|\,\|F'(x_0)^{-1}F(x_0)\|\leq\frac{1}{2}$$

holds, then Chebyshev's method is well defined and converges to x*, a solution of F(x) = 0.

We refer to [12] and its references for a general convergence analysis of this type of methods.

Let us consider the equation

$$F(x)=x^TBx+Cx+D=0,$$

where

$$B=\text{rand}(N,N,N),\qquad C=\text{rand}(N,N),$$

and D is computed in order to obtain $x_i^*=1$, i = 1, ..., N, as the solution.

In Table 1 we take N = 30. Chebyshev's method exhibits its third order of convergence.

Table 1. Error in the max-norm, for x0 such that ||x* − x0|| = 0.1

Iteration   Chebyshev
1           8.41e−03
2           1.78e−09
3           2.90e−13
4           0.00e+00
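For completeness, a sketch of how this test problem can be set up with numpy (the indexing convention chosen for xTBx is our own, and the exact error values depend on the random draw):

```python
import numpy as np

N = 30
rng = np.random.default_rng(0)
B = rng.random((N, N, N))                    # trilinear coefficient tensor
C = rng.random((N, N))
x_star = np.ones(N)

quad = lambda x: np.einsum('i,ijk,k->j', x, B, x)      # (x^T B x)_j
D = -(quad(x_star) + C @ x_star)             # forces x* = (1, ..., 1)

F = lambda x: quad(x) + C @ x + D
dF = lambda x: (np.einsum('mjk,k->jm', B, x)           # Jacobian of the
                + np.einsum('i,ijm->jm', x, B) + C)    # quadratic part, plus C

x = chebyshev_quadratic(F, dF, x0=x_star + 0.1)        # solver sketched above
print(np.linalg.norm(x - x_star, np.inf))
```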

For quadratic equations we can find efficient higher order methods [2].

The inverse of a matrix

Approximating inverse operators is a very common task in several areas of interest, such as physics, chemistry, engineering, etc. In a general context, we can formulate the following problem: given g, we are interested in calculating f ∈ Ω such that H(f) = g, where H : Ω → Y is an operator defined in a domain Ω of a Banach space X with values in a Banach space Y. It is clear that we have to calculate or approximate the inverse operator H−1 in order to solve the previous equation. In this case, if g is in the domain of H−1, there is a solution f = H−1(g).

To approximate the inverse operator H−1, we use Newton-type methods, so that better and better approximations to H−1 are constructed from an initial approximation. The methods considered in this paper, when applied to compute an inverse operator, are based not on solving linear systems but on products of matrices. Notice that the formulation of the problem in this manner is very interesting.

Let X and Y be two Banach spaces and $GL(X,Y)=\{H\in\mathcal{L}(X,Y):H^{-1}\ \text{exists}\}$, where $\mathcal{L}(X,Y)$ is the set of bounded linear operators from the Banach space X into the Banach space Y. The problem that we are thinking about is the following: given an operator H ∈ GL(X,Y), approximate H−1. To do this, we first consider

$$\mathcal{F}:GL(Y,X)\rightarrow\mathcal{L}(X,Y)\quad\text{and}\quad\mathcal{F}(G)=G^{-1}-H,$$

so that H−1 is the solution of the equation $\mathcal{F}(G)=0$.

If we observe Newton’s method,

$$\left\{\begin{aligned}&G_0\ \text{given},\\&G_{n+1}=G_n-[\mathcal{F}'(G_n)]^{-1}\mathcal{F}(G_n),\quad n\geq 0,\end{aligned}\right.$$

it seems that inverse operators are used; but if we take into account the definition of $\mathcal{F}$, calculate $\mathcal{F}'(G_n)$ and write

$$\mathcal{F}'(G_n)(G_{n+1}-G_n)=-\mathcal{F}(G_n),\quad n\geq 0,$$

then we can avoid the use of inverse operators for approximating Gn+1.

Indeed, to obtain the corresponding algorithm, we only need to compute $\mathcal{F}'(G_n)$. So, given G ∈ GL(Y,X), as G−1 exists, if

$$0<\varepsilon<\frac{1}{\|\alpha\|\,\|G^{-1}\|},$$

we have $\|\varepsilon\alpha\|<\frac{1}{\|G^{-1}\|}$ for α ∈ GL(Y,X). Therefore, it is known that G + εα ∈ GL(Y,X) and then

$$\mathcal{F}'(G)\alpha=\lim_{\varepsilon\rightarrow 0}\frac{1}{\varepsilon}[\mathcal{F}(G+\varepsilon\alpha)-\mathcal{F}(G)]=-G^{-1}\alpha G^{-1}.$$

In consequence, Newton’s method is now given by the following algorithm:

$$\left\{\begin{aligned}&G_0\ \text{given},\\&G_{n+1}=2G_n-G_nHG_n,\quad n\geq 0.\end{aligned}\right.$$

Moreover, Newton's method does not use inverse operators for approximating an inverse operator, and its order of convergence is two.

If we now consider Chebyshev’s method,

$$\left\{\begin{aligned}&G_0\ \text{given},\\&G_{n+1}=G_n-\left[I+\frac{1}{2}L_{\mathcal{F}}(G_n)\right][\mathcal{F}'(G_n)]^{-1}\mathcal{F}(G_n),\quad n\geq 0,\end{aligned}\right.$$

where $L_{\mathcal{F}}(G_n)=[\mathcal{F}'(G_n)]^{-1}\mathcal{F}''(G_n)[\mathcal{F}'(G_n)]^{-1}\mathcal{F}(G_n)$, one might think that inverse operators are needed, but we can proceed as for Newton's method to see that Chebyshev's method does not use them:

$$\left\{\begin{aligned}\mathcal{F}'(G_n)(P_n-G_n)&=-\mathcal{F}(G_n),\quad n\geq 0,\\\mathcal{F}'(G_n)(G_{n+1}-P_n)&=-\frac{1}{2}\mathcal{F}''(G_n)(P_n-G_n)^2,\end{aligned}\right.$$

so that we can also avoid the use of inverse operators for approximating Gn+1.

Let α, β ∈ GL(Y,X) and take

$$0<\varepsilon<\frac{1}{\|\beta\|\,\|G^{-1}\|},$$

so that G + εβ ε GL(Y,X) and

$$\mathcal{F}''(G)\alpha\beta=\lim_{\varepsilon\rightarrow 0}\frac{1}{\varepsilon}[\mathcal{F}'(G+\varepsilon\beta)\alpha-\mathcal{F}'(G)\alpha]=G^{-1}\alpha G^{-1}\beta G^{-1}+G^{-1}\beta G^{-1}\alpha G^{-1}.$$

In consequence, we write Chebyshev’s method as

$$\left\{\begin{aligned}&G_0\ \text{given},\\&G_{n+1}=3G_n-3G_nHG_n+G_nHG_nHG_n,\quad n\geq 0.\end{aligned}\right.$$

Observe that Chebyshev's method does not use inverse operators for approximating an inverse operator, and its order of convergence is three.

Theorem 8

If ||I − HG0|| < 1, the Newton and Chebyshev iterative methods are convergent. Moreover, if HG0 = G0H, then limn→∞ Gn = H−1.
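A sketch of the inverse-free Chebyshev iteration in numpy; the symmetric positive definite test matrix and the choice G0 = I/||H||2, which commutes with H and satisfies ||I − HG0|| < 1, are our own:

```python
import numpy as np

def chebyshev_inverse(H, G0, tol=1e-12, max_iter=100):
    """Chebyshev iteration G <- 3G - 3GHG + GHGHG; no inverses are computed."""
    G = G0
    for _ in range(max_iter):
        GH = G @ H
        G_new = 3 * G - GH @ (3 * G - GH @ G)    # = 3G - 3GHG + GHGHG
        if np.linalg.norm(G_new - G) < tol:
            return G_new
        G = G_new
    return G

rng = np.random.default_rng(1)
M = rng.random((50, 50))
H = M @ M.T + 50 * np.eye(50)                    # symmetric positive definite
G0 = np.eye(50) / np.linalg.norm(H, 2)           # H G0 = G0 H, ||I - H G0|| < 1
G = chebyshev_inverse(H, G0)                     # G approximates H^{-1}
```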

On the other hand, if we look at Newton's and Chebyshev's methods, two well-known iterative methods, we conclude that they can approximate inverse operators without using any inverse operator in their application. Starting from this feature of both methods, in [5] we pay attention to the construction of iterative methods of any prefixed order of convergence. Moreover, the final formulation of the methods uses only powers of matrices that are close to the identity. This fact is crucial to avoid stability problems in the implementation of the methods. The best method of the family will depend on the particular problem to solve. In any case, we find iterative methods with better behavior than Newton's method. Finally, we would like to emphasize that in applications where we are interested in the computation of an inverse operator, our methods use matrix-matrix multiplications with a computational cost similar to that of Gauss-type methods. But if we are interested only in the application of the inverse operator, we will be able to implement our methods using only matrix-vector multiplications, so that we reduce the computational cost considerably.

The pth root of a matrix

If we consider the complex equation f(x) = xp − a with $a\in\mathbb{C}$, then $L_f(x)=\left(\frac{p-1}{p}\right)\frac{x^p-a}{x^p}$ and Chebyshev's method reduces to

$$x_0\in D,\qquad x_{n+1}=\frac{2p^2-3p+1}{2p^2}\,x_n+\frac{2p-1}{p^2}\,a\,x_n^{1-p}-\frac{p-1}{2p^2}\,a^2\,x_n^{1-2p},\quad n\geq 0.$$

If we extend Chebyshev’s method for the computation of the pth root of a matrix, we then consider the space

$$\Theta=\left\{A\in\mathbb{C}^{r\times r}\ \text{such that}\ A\ \text{has no nonpositive real eigenvalues}\right\}$$

and approximate $A^{\frac{1}{p}}$ for a given matrix A ∈ Θ. For this, we consider

$$\mathcal{F}:\Theta\rightarrow\mathbb{C}^{r\times r}\quad\text{and}\quad\mathcal{F}(X)=X^{p}-A,$$

so that $A^{\frac{1}{p}}$ is a solution of the equation $\mathcal{F}(X)=0$. In this setting, Chebyshev's method reduces to

$$X_0\in\Theta,\quad X_{n+1}=\frac{2p^2-3p+1}{2p^2}\,X_n+\frac{2p-1}{p^2}\,AX_n^{1-p}-\frac{p-1}{2p^2}\,A^2X_n^{1-2p},\quad n\geq 0.\tag{16}$$

Chebyshev's method, like Newton's method, cannot be used directly to approximate the principal pth root. Using the same idea as in [15], one can prove that it is not stable in a neighborhood of $A^{\frac{1}{p}}$: a small perturbation of the value of Xn is amplified in the following steps, and in finite arithmetic the algorithm diverges. This problem can be overcome using another algorithm which provides the same sequence but is stable in a neighborhood of $A^{\frac{1}{p}}$.

The following iteration, given by Denman and Beavers in [9],

$$\left\{\begin{aligned}&X_0=A,\quad Y_0=I,\\&X_{n+1}=\frac{1}{2}(X_n+Y_n^{-1}),\\&Y_{n+1}=\frac{1}{2}(Y_n+X_n^{-1}),\quad n\geq 0,\end{aligned}\right.$$

is a stable variant of Newton's method for the matrix square root.
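A sketch of the Denman–Beavers iteration in numpy (tolerance and iteration cap are our choices); Xn approaches A1/2 and Yn approaches A−1/2:

```python
import numpy as np

def denman_beavers_sqrt(A, tol=1e-12, max_iter=60):
    """Coupled, stable form of Newton's iteration for the matrix square root."""
    X, Y = A.copy(), np.eye(A.shape[0])
    for _ in range(max_iter):
        # Simultaneous update: both right-hand sides use the current X_n, Y_n.
        X, Y = 0.5 * (X + np.linalg.inv(Y)), 0.5 * (Y + np.linalg.inv(X))
        if np.linalg.norm(X @ X - A) <= tol * np.linalg.norm(A):
            break
    return X                                 # Y approximates A^{-1/2}
```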

The instability of the simplified Newton iterations $X_{n+1}=(X_n+AX_n^{-1})/2$ and $X_{n+1}=(X_n+X_n^{-1}A)/2$, shown by Higham [15] for the matrix square root, is mainly due to the one-sided multiplication by A. On the other hand, since Xn and A commute if AX0 = X0A, the iteration can be rewritten as

$$X_{n+1}=\frac{X_n+A^{\frac{1}{2}}X_n^{-1}A^{\frac{1}{2}}}{2},\quad n\geq 0,$$

and the iteration becomes stable in this form. However, it is useless as it stands, since it involves the square root of A, but it helps us to stabilize the iteration by introducing the variable $Y_n=A^{\frac{1}{2}}X_n^{-1}A^{\frac{1}{2}}=A^{-1}X_n=X_nA^{-1}$. The resulting iteration is that of Denman and Beavers, see [16].

Following these ideas, Iannazzo [17] proposes the following two stable versions of Newton's method for the matrix pth root:

$$X_{n+1}=\frac{(p-1)X_n+\left(A^{\frac{1}{p}}X_n^{-1}\right)^{p-1}A^{\frac{1}{p}}}{p},\quad n\geq 0,$$

and

$$\left\{\begin{aligned}&X_0=I,\quad N_0=A,\\&X_{n+1}=X_n\left(\frac{(p-1)I+N_n}{p}\right),\\&N_{n+1}=\left(\frac{(p-1)I+N_n}{p}\right)^{-p}N_n,\quad n\geq 0.\end{aligned}\right.$$

Observe that in the second case, the matrix A does not explicitly appear in the iteration.

Similarly, for Chebyshev’s method, we propose

$$X_{n+1}=\frac{\left(p-\frac{3}{2}+\frac{1}{2p}\right)X_n+\left(2-\frac{1}{p}\right)\left(A^{\frac{1}{p}}X_n^{-1}\right)^{p-1}A^{\frac{1}{p}}}{p}-\frac{(p-1)\left(A^{\frac{1}{p}}X_n^{-1}\right)^{2p-1}A^{\frac{1}{p}}}{2p^2},\quad n\geq 0,\tag{17}$$

and

$$\left\{\begin{aligned}&X_0=I,\quad N_0=A,\\&X_{n+1}=X_n\left(\frac{\left(p-\frac{3}{2}+\frac{1}{2p}\right)I+\left(2-\frac{1}{p}\right)N_n}{p}-\frac{(p-1)N_n^2}{2p^2}\right),\quad n\geq 0,\\&N_{n+1}=\left(\frac{\left(p-\frac{3}{2}+\frac{1}{2p}\right)I+\left(2-\frac{1}{p}\right)N_n}{p}-\frac{(p-1)N_n^2}{2p^2}\right)^{-p}N_n.\end{aligned}\right.\tag{18}$$

Note that (17) is useless, since it involves the matrix pth root of A.

Observe that we need to calculate $X_n^{-1}$ at each step of Chebyshev's method given by (16). We are interested in obtaining another expression of Chebyshev's method which avoids the computation of these inverses. So, if we consider the complex function $f(x)=\frac{1}{x^{p}}-\frac{1}{a}=0$, where $f:D\subseteq\mathbb{C}\rightarrow\mathbb{C}$, then Chebyshev's method reduces to

$$x_0\in D,\qquad x_{n+1}=x_n\left(1+\frac{1}{p}\left(1-\frac{x_n^p}{a}\right)+\frac{p+1}{2p^2}\left(1-\frac{x_n^p}{a}\right)^2\right),\quad n\geq 0.\tag{19}$$

As a consequence, the algorithm of Chebyshev's method for solving $\mathcal{F}(X)=0$, where $\mathcal{F}:\Theta\rightarrow\mathbb{C}^{r\times r}$, is then

$$X_0\in\Theta,\quad X_{n+1}=X_n\left(I+\frac{1}{p}\left(I-A^{-1}X_n^p\right)+\frac{p+1}{2p^2}\left(I-A^{-1}X_n^p\right)^2\right),\quad n\geq 0.\tag{20}$$

Observe that we only need to calculate A−1 once, and not $X_n^{-1}$ at each step, so that algorithm (20) is more efficient than algorithm (16).
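A sketch of algorithm (20) in numpy; A is inverted once, outside the loop, and the test matrix is our own choice, taken close to the identity so that ||R(X0)|| = ||I − A−1|| < 1 for X0 = I (which also guarantees AXn = XnA):

```python
import numpy as np

def chebyshev_pth_root(A, p, tol=1e-12, max_iter=100):
    """Chebyshev iteration (20): A is inverted once, X_n never is."""
    I = np.eye(A.shape[0])
    A_inv = np.linalg.inv(A)                 # the only inverse needed
    X = I.copy()                             # X0 = I, so A X_n = X_n A
    for _ in range(max_iter):
        R = I - A_inv @ np.linalg.matrix_power(X, p)     # residual R(X_n)
        X = X @ (I + R / p + (p + 1) / (2 * p**2) * (R @ R))
        if np.linalg.norm(R) < tol:
            break
    return X

rng = np.random.default_rng(2)
E = 0.1 * rng.random((20, 20))
A = np.eye(20) + E @ E.T                     # close to I: ||I - A^{-1}|| < 1
X = chebyshev_pth_root(A, p=5)               # X approximates A^{1/5}
print(np.linalg.norm(np.linalg.matrix_power(X, 5) - A))
```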

For Halley's method, a similar strategy to avoid the use of inverses is not possible. Remember that the algorithm of Halley's method,

$$x_0\in D,\qquad x_{n+1}=x_n-\left(\frac{1}{1+\frac{1}{2}L_f(x_n)}\right)\frac{f(x_n)}{f'(x_n)},\quad n\geq 0,$$

always involves inverses (the quotient includes $L_f(x_n)$). This is the main advantage of Chebyshev's method with respect to Halley's method.

Next, we analyze the convergence of Chebyshev's method. For this, we suppose AX0 = X0A and consider the following residual of algorithm (20):

$$R(X_n)=I-A^{-1}X_n^p,\quad n\geq 0.$$

To prove the convergence of algorithm (20), we consider a submultiplicative matrix norm || ⋅ || defined on $\mathbb{C}^{r\times r}$ and prove that {||R(Xn)||} is a decreasing scalar sequence converging to zero. First of all, from AX0 = X0A, it follows that AXn = XnA, $AX_n^{p}=X_n^{p}A$ and $A^{-1}X_n^{p}=X_n^{p}A^{-1}$. As a consequence, we have

$$R(X_n)=I-(I-R(X_{n-1}))\left(I+\frac{1}{p}R(X_{n-1})+\frac{p+1}{2p^2}R(X_{n-1})^2\right)^p.$$

Next, taking into account Villareal’s formula,

$$P(y)^p=\left(\sum_{i=0}^m a_{i}y^{i}\right)^p=\sum_{i=0}^{mp}P_{i}y^{i},$$

where $P_0=a_0^p$, $P_i=\sum_{j=0}^{i-1}P_j\,\frac{a_{i-j}}{a_0}\,\frac{(i-j)(p+1)-i}{i}$, for i = 1, 2, ..., mp, and ai = 0 for all i ≥ m + 1, it follows that

$$R(X_n)=I-(I-R(X_{n-1}))\sum_{i=0}^{2p}P_{i}R(X_{n-1})^{i},$$

where a0 = 1, $a_1=\frac{1}{p}$, $a_2=\frac{p+1}{2p^2}$, P0 = 1 and $P_i=\sum_{j=0}^{i-1}P_j\,\frac{a_{i-j}}{a_0}\,\frac{(i-j)(p+1)-i}{i}$, for i = 1, 2, ..., 2p, and ai = 0 for all i ≥ 3. In addition, we have P1 = P2 = 1 and P3 < 1.

Note that we can always obtain XnA = AXn: it is enough to choose $X_0=I\in\mathbb{C}^{r\times r}$, as Guo does in [11]. Now, we establish the semilocal convergence of Chebyshev's method in the following theorem.

Theorem 9

Let $A\in\mathbb{C}^{r\times r}$ and X0 ∈ Θ be such that AX0 = X0A and $\|R(X_0)\|=\|I-A^{-1}X_0^p\|<1$. Suppose that $\{P_i\}_{i=3}^{2p}$ is a nonincreasing sequence such that the Pi are nonnegative for all i = 2, 3, ..., 2p. Then, Chebyshev's method defined in (20) converges to $A^{\frac{1}{p}}$. Moreover, $\|R(X_n)\|\leq\|R(X_0)\|^{3^n}$, $n\in\mathbb{N}$.

In addition, Table 2 collects some representative values of p, which help to see that the conditions of Theorem 9 are satisfied.

Table 2. Sequence $\{P_i\}_{i=3}^{2p}$ for p = 2, 3, 5, 7

p   $\{P_i\}_{i=3}^{2p}$
2   0.375, 0.1406...
3   0.4814..., 0.2222..., 0.0493..., 0.0109...
5   0.56, 0.296, 0.1059..., 0.0355..., 0.0080..., 0.0017..., 2.0736... × 10−4, 2.4883... × 10−5
7   0.5918..., 0.3294..., 0.1345..., 0.0512..., 0.0151..., 0.0041..., 0.0008..., 0.0001..., 2.6281... × 10−5, 3.6251... × 10−6, 2.9592... × 10−7, 2.4157... × 10−8
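The rows of Table 2 can be reproduced directly from the recurrence; a short sketch in exact rational arithmetic:

```python
from fractions import Fraction

def P_sequence(p):
    """Coefficients of (1 + y/p + (p+1)y^2/(2p^2))^p via the recurrence above."""
    a = {0: Fraction(1), 1: Fraction(1, p), 2: Fraction(p + 1, 2 * p * p)}
    P = [Fraction(1)]                                    # P_0 = a_0^p = 1
    for i in range(1, 2 * p + 1):
        P.append(sum(P[j] * a.get(i - j, Fraction(0))
                     * Fraction((i - j) * (p + 1) - i, i) for j in range(i)))
    return P

for p in (2, 3, 5, 7):
    print(p, [float(v) for v in P_sequence(p)[3:]])      # rows of Table 2
```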

From Theorem 9, we also deduce that Chebyshev’s method has R-order of convergence at least three.

In [6], we present a family of high-order iterative methods including both the Newton and Chebyshev methods. We find algorithms in the family with better numerical behavior than the Newton and Halley methods, which are basically the iterative methods proposed in the literature to solve this problem.

Conclusions

In this paper, we have reviewed some important properties of the classical Chebyshev method. We have pointed out the possibility of proving its global convergence from its geometric interpretation. We have mentioned the richer dynamics of this scheme in comparison with Newton's method. Finally, we have presented several applications where this high-order method is a good alternative to Newton's method. These applications include the solution of quadratic equations and the approximation of the inverse of a matrix or of the pth root of a matrix.
