
After notes on Chebyshev’s iterative method



Introduction

One of the most classical problems in Numerical Analysis is the approximation of the zeros of a given function f, that is, finding the values x* for which f(x*) = 0.

To solve these equations we usually employ iterative methods. The most widely used scheme is the second-order Newton's method,

$$x_{n+1}=x_{n}-\frac{f(x_{n})}{f'(x_{n})}.$$
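For reference, a minimal implementation of this scheme (a sketch; the test equation, tolerance and iteration cap are our own choices) reads:

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:        # stop when the correction is negligible
            break
    return x

# Example: approximate the real cube root of 2, a zero of f(x) = x^3 - 2.
root = newton(lambda x: x**3 - 2, lambda x: 3 * x**2, x0=1.0)
```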

This paper is devoted to the analysis of Chebyshev's method, a third-order extension of Newton's method. We present the geometric interpretation of the method and its global convergence. We introduce its extension to Banach spaces and review some applications where this high-order method is a good alternative to Newton's method.

Chebyshev is one of the most famous mathematicians of the nineteenth century, creator of several mathematical schools in Russia: number theory, probability theory, function approximation theory, theory of mechanisms and machines, etc. He received his primary education at home: his mother taught him to read and write, while his cousin taught him arithmetic and French, which would later be very useful in his relations with Europe. He also completed his secondary education at home, with Prof. Pogorelsky, known in his day as the best teacher of elementary mathematics in Moscow, as his tutor in mathematics. Prof. Brashman practically directed Chebyshev's university studies, which ended in 1841. The department of physics and mathematics in which Chebyshev studied announced a prize competition in the 1840–41 academic year. Chebyshev submitted a paper on the calculation of roots of equations using the series expansion of the inverse function. The work, unpublished at the time, was awarded the silver medal. Chebyshev worked as a professor at the University of St. Petersburg for 35 years. He is recognized as the founder of the mathematical school of St. Petersburg, whose echo and influence have reached our time in many branches of mathematics. This school was distinguished by its tendency to relate theoretical problems of mathematics to problems in the arts and in nature.

Geometry, dynamics and convergence of Chebyshev’s method in the scalar case

The geometric interpretation of Newton's method is well known: given an iterate xn, the next iterate is the zero of the tangent line

$$y(x)-f(x_n)=f'(x_n)(x-x_n),$$

to the graph of f at (xn, f(xn)).

The following well-known theorem, giving sufficient conditions for the global convergence of Newton's method, follows easily from its geometric interpretation.

Theorem 1

Let f″ be continuous on an interval J containing a root x* of f; let f′ ≠ 0 and f″ ≥ 0 or f″ ≤ 0 on J. Then Newton's method converges monotonically to x* from any point x0 ∈ J such that f(x0) f″(x0) ≥ 0.

Chebyshev's method is obtained by quadratic interpolation of the inverse function of f, in order to approximate f−1(0) [22]. But it also admits a geometric derivation, from a parabola of the form

$$a\,y(x)^2+y(x)+bx+c=0,\tag{2}$$

which, after imposing the super-tangency conditions y(xn) = f(xn), y′(xn) = f′(xn) and y″(xn) = f″(xn), can be written as

$$-\frac{f''(x_n)}{2f'(x_n)^2}\,(y(x)-f(x_n))^2+y(x)-f(x_n)-f'(x_n)(x-x_n)=0.$$

By calculating the intersection of this parabola with the OX-axis we obtain the next step of Chebyshev’s method:

$$x_{n+1}=x_n-\left(1+\frac{1}{2}L_f(x_n)\right)\frac{f(x_n)}{f'(x_n)},$$

where $L_f(x_n)=\frac{f(x_n)\,f''(x_n)}{(f'(x_n))^2}$.
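A scalar implementation only needs one more derivative than Newton's method. A minimal sketch, with the same illustrative test equation as above:

```python
def chebyshev(f, df, d2f, x0, tol=1e-12, max_iter=50):
    """Chebyshev's method: x_{n+1} = x_n - (1 + L_f(x_n)/2) f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        L = fx * d2f(x) / dfx**2             # L_f(x_n) = f f'' / f'^2
        step = (1 + 0.5 * L) * fx / dfx
        x -= step
        if abs(step) < tol:
            break
    return x

# Same test equation as before: x^3 - 2 = 0.
root = chebyshev(lambda x: x**3 - 2, lambda x: 3 * x**2, lambda x: 6 * x, x0=1.0)
```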

We refer to [1] for the geometric interpretation of other third order methods.

By using the geometric interpretation of Chebyshev’s method, we can obtain the following global convergence theorem.

Theorem 2

Let f‴ be continuous on an interval J containing a root x* of f; let f′ ≠ 0, Lf(x) > −2 and $\left(\left(\frac{\eta}{f'(x)}\right)^2\right)''\geq 0$ in J, with η = sgn(f′). Then Chebyshev's method converges monotonically to x* from any point of the interval.

Proof. We suppose f′ > 0 (for f′ < 0 the proof is similar).

First, we begin from a point on the left of x*, $\overline{x}\leq x^*$.

We would like to show that the intersection $\widehat{x}$ of the parabola y(x) given in (2) with the OX-axis lies in $[\overline{x},x^*]$. By hypothesis $L_f(\overline{x})>-2$; in particular, $\overline{x}\leq\widehat{x}$.

Thus, it will be enough to prove that, for $x\geq\overline{x}$,

$$y(x)=\frac{-1+\sqrt{1-4a(bx+c)}}{2a}\geq f(x).\tag{3}$$

In this case, we obtain a monotonically increasing sequence, bounded above by x*, which therefore converges to a limit γ ≤ x*. Then, by the construction of the method and the continuity of f, we obtain γ = x*.

Inequality (3) is equivalent to

$$\frac{-1+\sqrt{1-4a(bx+c)}}{2a}-\frac{-1+\sqrt{1-4a(b\overline{x}+c)}}{2a}\geq f(x)-f(\overline{x}),$$

or

$$\int_{\overline{x}}^{x}\frac{-b}{\sqrt{1-4a(bt+c)}}\,dt\geq\int_{\overline{x}}^{x}f'(t)\,dt.\tag{4}$$

As f′ > 0, the hypothesis gives $\left(\left(\frac{1}{f'}\right)^2\right)''\geq 0$ in J, i.e., $\left(\frac{1}{f'}\right)^2$ is convex, and therefore

$$\left(\frac{1}{f'(x)}\right)^2\geq\frac{1-4a(bx+c)}{(-b)^2},$$

because $\frac{1-4a(bx+c)}{(-b)^2}$ approximates $\left(\frac{1}{f'(x)}\right)^2$ up to second order at $\overline{x}$.

Thus,

$$\frac{-b}{\sqrt{1-4a(bx+c)}}\geq f'(x)>0,$$

and consequently the relation (4) holds.

Finally, if we begin from a point on the right of the root, we obtain $\frac{-1+\sqrt{1-4a(bx+c)}}{2a}\leq f(x)$ and the convergence is monotonic from the right.

We refer to [3] for the global convergence of other third order schemes and some comparisons.

In general, the method does not converge globally. In such cases we can only ask for local or semilocal convergence, looking for regions of convergence around the solutions.

Consider for instance the problem of finding the zeros of a polynomial, p(z) = 0. Let $R(z)=\frac{P(z)}{Q(z)}$, where P(z) and Q(z) are complex polynomials with no common factors, be a rational map on the Riemann sphere. We say that z0 is a fixed point of R(z) if R(z0) = z0. For $z\in\overline{\mathbb{C}}$ we define its orbit as the set orb(z) = {z, R(z), R2(z), ..., Rk(z), ...}, where Rk means the k-fold iterate of R. A periodic point of period n is a point z0 such that Rn(z0) = z0 and Rj(z0) ≠ z0 for 0 < j < n. Observe that if $z_0\in\overline{\mathbb{C}}$ is a periodic point of period n ≥ 1, then z0 is a fixed point of Rn. Also, recall that a fixed point z0 is respectively attracting, repelling or indifferent when |R′(z0)| is less than, greater than or equal to 1. A periodic point of period n is said to be attracting, repelling or indifferent if, as a fixed point of Rn(z), it is respectively attracting, repelling or indifferent. A superattracting fixed point of R(z) is a fixed point which is also a zero of the derivative R′(z). A periodic point of period n is said to be a superattracting periodic point of R(z) if, as a fixed point of Rn(z), it is superattracting.

Let ζ be an attracting fixed point of R(z). The basin of attraction of ζ is the set $B(\zeta)=\{z\in\overline{\mathbb{C}}:R^n(z)\to\zeta\ \text{as}\ n\to\infty\}$. The immediate basin of attraction of an attracting fixed point ζ of R(z), denoted by B*(ζ), is the connected component of B(ζ) containing ζ. Finally, if z0 is an attracting periodic point of period n of R(z), the basin of attraction of the orbit orb(z0) is the set $B(\mathrm{orb}(z_0))=\cup_{j=0}^{n-1}R^j(B(z_0))$, where B(z0) is the attraction basin of z0 as a fixed point of Rn. The Julia set of a rational map R(z), denoted by $\mathcal{J}(R)$, is the closure of the set of repelling periodic points. Its complement is the Fatou set $\mathcal{F}(R)$. If R(z) has an attracting fixed point z0, then the basin of attraction B(z0) is contained in the Fatou set and $\mathcal{J}(R)=\partial B(z_0)$. Therefore, the chaotic dynamics of R(z) is contained in its Julia set.

The iterative rational function for Chebyshev’s method is given by

$$Ch_p(z)=z-\left(1+\frac{1}{2}L_p(z)\right)\frac{p(z)}{p'(z)}.$$

We recall the definition of conjugacy.

Definition 1

Let $R_1,R_2:\overline{\mathbb{C}}\to\overline{\mathbb{C}}$ be two rational maps. We say that R1 and R2 are conjugated if there exists a Möbius transformation $\varphi:\overline{\mathbb{C}}\to\overline{\mathbb{C}}$ such that $R_2\circ\varphi(z)=\varphi\circ R_1(z)$ for all z.

An important feature of conjugation of rational maps is given by the following classical result.

Theorem 3

Let R1 and R2 be two rational maps and let φ be a Möbius transformation conjugating R1 and R2, that is, $R_2=\varphi\circ R_1\circ\varphi^{-1}$. Then $\mathcal{F}(R_2)=\varphi(\mathcal{F}(R_1))$ and $\mathcal{J}(R_2)=\varphi(\mathcal{J}(R_1))$.

Conjugacy plays a central role in understanding the behavior of classes of maps from the point of view of dynamical systems in the following sense. Suppose that one wishes to describe the quantitative as well as the qualitative behavior of the map z → Φf(z), where Φf(z) is some iteration function. Since conjugacy preserves fixed and periodic points and their type, as well as attraction basins, the dynamical data concerning f are carried by the fixed points of Φf(z) and by the nature of such fixed points, which may be (super)attracting, repelling or indifferent. Therefore, it is worthwhile to build up, for polynomials of degree two and three, a parametrized family consisting of polynomials as simple as possible such that a conjugacy exists between the corresponding iteration functions.

The result that follows, due to A. Cayley [8,21], has great historical importance. In an attempt to understand the dynamics of Newton’s method in the complex plane, Cayley investigated the dynamics of Newton’s method applied to polynomials of a particularly simple form. He realized that major difficulties would arise when attempting to extend the following result for quadratics to cubics and beyond. It is believed that this circumstance motivated further work of P. Fatou and G. Julia along these lines.

Theorem 4

Let $N_p(z)=\frac{z^2-ab}{2z-(b+a)}$ be the rational map obtained from Newton's method applied to a generic quadratic polynomial p(z) = (z − a)(z − b). Then Np is conjugated to the map z → z2 by the Möbius transformation $M(z)=\frac{z-a}{z-b}$, and $\mathcal{J}(N_p)$ is the straight line in the complex plane corresponding to the locus of points equidistant from a and b.

For Chebyshev's method, also known as the super-Newton method, the following result holds.

Theorem 5

Let
$$Ch_p(z)=\frac{3z^4-2(a+b)z^3-6abz^2+6ab(a+b)z-ab(a^2+3ab+b^2)}{(2z-a-b)^3}$$

be the rational map obtained from Chebyshev's method applied to a generic quadratic polynomial p(z) = (z − a)(z − b). Then Chp(z) is conjugated to the map $S_3(z)=\frac{z^4+2z^3}{2z+1}$ via the Möbius transformation $M(z)=\frac{z-a}{z-b}$.
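This conjugacy can be checked symbolically. The following sketch (using sympy; the roots a = 0 and b = 1 are an illustrative choice of ours) verifies the identity S3 ∘ M = M ∘ Chp of Definition 1:

```python
import sympy as sp

z = sp.symbols('z')
a, b = 0, 1                                  # illustrative roots of p(z) = (z-a)(z-b)
Ch = (3*z**4 - 2*(a + b)*z**3 - 6*a*b*z**2
      + 6*a*b*(a + b)*z - a*b*(a**2 + 3*a*b + b**2)) / (2*z - a - b)**3
M = (z - a) / (z - b)                        # Moebius transformation
S3 = (z**4 + 2*z**3) / (2*z + 1)

lhs = S3.subs(z, M)                          # S3(M(z))
rhs = M.subs(z, Ch)                          # M(Ch_p(z))
assert sp.cancel(lhs - rhs) == 0             # the two rational maps coincide
```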

For cubic polynomials we have the following result.

Theorem 6

Let p(z) = (z − z0)(z − z1)(z − z2) be a generic cubic polynomial with roots ordered as follows: 0 ≤ |z0| ≤ |z1| ≤ |z2|. Let T(z) = (z2 − z0)z + z0. Then

p(z) reduces to a polynomial belonging to the parametrized family $q_{\lambda,\rho}(z)=p\circ T(z)=\lambda^3 z(z-1)(z-\rho)$, where λ = z2 − z0 and $\rho=\frac{z_1-z_0}{z_2-z_0}$,

T is a conjugacy between Chp and $Ch_{q_{\lambda,\rho}}$, that is, $T^{-1}\circ Ch_{q_{\lambda,\rho}}\circ T=Ch_p$, and

if $\tilde{q}_{\rho}(z)=z(z-1)(z-\rho)$, then $Ch_{q_{\lambda,\rho}}$ is conjugated to $Ch_{\tilde{q}_{\rho}}$. Consequently, Chp is conjugated to $Ch_{\tilde{q}_{\rho}}$.

It is possible to see that the one-parameter family $\tilde{q}_{\rho}(z)=z(z-\rho)(z-1)$ reduces to the well-known one-parameter family pA(z) = z3 + (A − 1)z − A (see [10]).

In the two pictures that follow we show the attraction basins of Newton's and Chebyshev's iterations applied to z3 − 1 = 0. We can see a richer dynamics in Chebyshev's method.

Fig. 1

Basins of attraction for the polynomial p(z) = z3 − 1. Newton’s method.

Fig. 2

Basins of attraction for the polynomial p(z) = z3 − 1. Chebyshev’s method.
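Pictures of this kind can be generated by iterating each map on a grid of initial points and coloring every point according to the root its orbit approaches. A minimal sketch (grid extent, resolution and iteration budget are arbitrary choices of ours):

```python
import numpy as np

def basins(step, roots, extent=1.5, n=400, iters=40):
    """Label each grid point by the root its orbit converges to (0 = no root)."""
    t = np.linspace(-extent, extent, n)
    z = t[None, :] + 1j * t[:, None]         # grid of starting points
    with np.errstate(all='ignore'):          # tolerate divisions by ~0
        for _ in range(iters):
            z = step(z)
    labels = np.zeros(z.shape, dtype=int)
    for k, r in enumerate(roots, start=1):
        labels[np.abs(z - r) < 1e-6] = k
    return labels

p, dp, d2p = (lambda z: z**3 - 1), (lambda z: 3 * z**2), (lambda z: 6 * z)
newton_step = lambda z: z - p(z) / dp(z)
cheb_step = lambda z: z - (1 + 0.5 * p(z) * d2p(z) / dp(z)**2) * p(z) / dp(z)

roots = [np.exp(2j * np.pi * k / 3) for k in range(3)]
img_newton = basins(newton_step, roots)      # three sectors, fractal boundaries
img_cheb = basins(cheb_step, roots)          # richer, more intricate structure
```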

In [4], we study the dynamics of a family of third-order iterative algorithms which includes Chebyshev's iteration function, Halley's iterative method, the super-Halley iterative method and the c-iterative methods. Using results on conjugation of rational maps and the definition of the universal Julia set, we find the conjugacy classes of these iterative methods explicitly.

Extension to Banach spaces and some applications

To approximate a solution of the nonlinear equation

$$F(x)=0,$$

F : XY, X, Y Banach spaces, Chebyshev’s method can be written as

$$x_{n+1}=x_{n}-\left(I+\frac{1}{2}L_{F}(x_{n})\right)F'(x_{n})^{-1}F(x_{n}),$$

where

$$L_F(x_n)=F'(x_n)^{-1}F''(x_n)F'(x_n)^{-1}F(x_n).$$

Next, we review some applications where Chebyshev’s method can be considered a good alternative to Newton’s method.

Quadratic equations

In this case, F″(x) is a constant bilinear operator that we denote by B.

Using Taylor expansions, for the Newton step $y_n=x_n-F'(x_n)^{-1}F(x_n)$ we have

$$F(y_n)=F(x_n)+F'(x_n)(y_n-x_n)+\frac{1}{2}F''(x_n)(y_n-x_n)^2=\frac{1}{2}B(y_n-x_n)^2.$$

Thus, $F'(x_n)^{-1}F(y_n)=-\frac{1}{2}L_F(x_n)(y_n-x_n)$, and the method becomes

$$\begin{aligned} y_n&=x_n-F'(x_n)^{-1}F(x_n),\\ x_{n+1}&=y_n-F'(x_n)^{-1}F(y_n), \end{aligned}$$

equivalently,

$$\begin{aligned} F'(x_n)(y_n-x_n)&=-F(x_n),\\ F'(x_n)(x_{n+1}-y_n)&=-F(y_n). \end{aligned}$$

Notice that we only need one LU decomposition and three evaluations (F(xn), F(yn) and F′(xn)) per iteration. In particular, for this problem Chebyshev's method is more efficient than Newton's method.
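In floating-point practice this structure translates into one LU factorization of F′(xn) reused for two solves per iteration. A sketch with numpy/scipy (the helper name is ours):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def chebyshev_quadratic(F, dF, x0, tol=1e-12, max_iter=25):
    """One LU factorization and two triangular solves per iteration."""
    x = x0
    for _ in range(max_iter):
        lu = lu_factor(dF(x))                # single LU of F'(x_n)
        y = x - lu_solve(lu, F(x))           # y_n     = x_n - F'(x_n)^{-1} F(x_n)
        x_new = y - lu_solve(lu, F(y))       # x_{n+1} = y_n - F'(x_n)^{-1} F(y_n)
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x
```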

Moreover, we can obtain a very simple semilocal convergence result:

Theorem 7

Given x0 such that F′(x0)−1 exists, if the condition

$$\|F'(x_0)^{-1}B\|\,\|F'(x_0)^{-1}F(x_0)\|\leq\frac{1}{2}$$

holds, then Chebyshev's method is well defined and converges to x*, a solution of F(x) = 0.

We refer to [12] and its references for a general convergence analysis of this type of methods.

Let us consider the equation

$$F(x)=x^TBx+Cx+D=0,$$

where

$$B=\text{rand}(N,N,N),\qquad C=\text{rand}(N,N),$$

and D is computed in order to obtain $x_i^*=1$, i = 1, ..., N, as the solution.

In Table 1 we take N = 30. Chebyshev's method exhibits its third order of convergence.

Table 1. Error in the max-norm, for x0 such that ||x* − x0|| = 0.1

Iteration   Chebyshev
1           8.41e−03
2           1.78e−09
3           2.90e−13
4           0.00e+00
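For completeness, a sketch of how this test problem can be set up with numpy (the indexing convention chosen for xTBx is our own, and the exact error values depend on the random draw):

```python
import numpy as np

N = 30
rng = np.random.default_rng(0)
B = rng.random((N, N, N))                    # trilinear coefficient tensor
C = rng.random((N, N))
x_star = np.ones(N)

quad = lambda x: np.einsum('i,ijk,k->j', x, B, x)      # (x^T B x)_j
D = -(quad(x_star) + C @ x_star)             # forces x* = (1, ..., 1)

F = lambda x: quad(x) + C @ x + D
dF = lambda x: (np.einsum('mjk,k->jm', B, x)           # Jacobian of the
                + np.einsum('i,ijm->jm', x, B) + C)    # quadratic part, plus C

x = chebyshev_quadratic(F, dF, x0=x_star + 0.1)        # solver sketched above
print(np.linalg.norm(x - x_star, np.inf))
```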

For quadratic equations we can find efficient higher order methods [2].

The inverse of a matrix

Approximating inverse operators is a very common task in several areas of interest, such as physics, chemistry, engineering, etc. In a general context, we can formulate the following problem: given g, we are interested in calculating f ∈ Ω such that H(f) = g, where H : Ω → Y is an operator defined in a domain Ω of a Banach space X with values in a Banach space Y. It is clear that we have to calculate or approximate the inverse operator H−1 in order to solve the previous equation. In this case, if g is in the domain of H−1, there is a solution f = H−1(g).

To approximate the inverse operator H−1, we use Newton-type methods, so that better and better approximations to H−1 are constructed from an initial approximation. The methods considered in this paper, when applied to compute an inverse operator, are based not on solving linear systems but on products of matrices. Notice that the formulation of the problem in this manner is very interesting.

Let X and Y be two Banach spaces and $GL(X,Y)=\{H\in\mathcal{L}(X,Y):H^{-1}\ \text{exists}\}$, where $\mathcal{L}(X,Y)$ is the set of bounded linear operators from the Banach space X into the Banach space Y. The problem that we are thinking about is the following: given an operator H ∈ GL(X,Y), approximate H−1. To do this, we first consider

$$\mathcal{F}:GL(Y,X)\rightarrow\mathcal{L}(X,Y)\quad\text{and}\quad\mathcal{F}(G)=G^{-1}-H,$$

so that H−1 is the solution of the equation $\mathcal{F}(G)=0$.

If we observe Newton’s method,

$$\left\{\begin{aligned}&G_0\ \text{given},\\&G_{n+1}=G_n-[\mathcal{F}'(G_n)]^{-1}\mathcal{F}(G_n),\quad n\geq 0,\end{aligned}\right.$$

it seems that inverse operators are used; but if we take into account the definition of $\mathcal{F}$, calculate $\mathcal{F}'(G_n)$ and write

$$\mathcal{F}'(G_n)(G_{n+1}-G_n)=-\mathcal{F}(G_n),\quad n\geq 0,$$

then we can avoid the use of inverse operators for approximating Gn+1.

Indeed, to obtain the corresponding algorithm, we only need to compute $\mathcal{F}'(G_n)$. So, given G ∈ GL(Y,X), as G−1 exists, if

$$0<\varepsilon<\frac{1}{\|\alpha\|\,\|G^{-1}\|},$$

we have $\|\varepsilon\alpha\|<\frac{1}{\|G^{-1}\|}$ for α ∈ GL(Y,X). Therefore, it is known that G + εα ∈ GL(Y,X) and then

$$\mathcal{F}'(G)\alpha=\lim_{\varepsilon\rightarrow 0}\frac{1}{\varepsilon}[\mathcal{F}(G+\varepsilon\alpha)-\mathcal{F}(G)]=-G^{-1}\alpha G^{-1}.$$

In consequence, Newton’s method is now given by the following algorithm:

$$\left\{\begin{aligned}&G_0\ \text{given},\\&G_{n+1}=2G_n-G_nHG_n,\quad n\geq 0.\end{aligned}\right.$$

Moreover, Newton's method does not use inverse operators for approximating an inverse operator, and its order of convergence is two.

If we now consider Chebyshev’s method,

$$\left\{\begin{aligned}&G_0\ \text{given},\\&G_{n+1}=G_n-\left[I+\frac{1}{2}L_{\mathcal{F}}(G_n)\right][\mathcal{F}'(G_n)]^{-1}\mathcal{F}(G_n),\quad n\geq 0,\end{aligned}\right.$$

where $L_{\mathcal{F}}(G_n)=[\mathcal{F}'(G_n)]^{-1}\mathcal{F}''(G_n)[\mathcal{F}'(G_n)]^{-1}\mathcal{F}(G_n)$, one might think that inverse operators are needed, but we can proceed as for Newton's method to see that Chebyshev's method does not use them:

$$\left\{\begin{aligned}\mathcal{F}'(G_n)(P_n-G_n)&=-\mathcal{F}(G_n),\quad n\geq 0,\\\mathcal{F}'(G_n)(G_{n+1}-P_n)&=-\frac{1}{2}\mathcal{F}''(G_n)(P_n-G_n)^2,\end{aligned}\right.$$

so that we can also avoid the use of inverse operators for approximating Gn+1.

Let α, β ∈ GL(Y,X) and take

$$0<\varepsilon<\frac{1}{\|\beta\|\,\|G^{-1}\|},$$

so that G + εβ ε GL(Y,X) and

$$\mathcal{F}''(G)\alpha\beta=\lim_{\varepsilon\rightarrow 0}\frac{1}{\varepsilon}[\mathcal{F}'(G+\varepsilon\beta)\alpha-\mathcal{F}'(G)\alpha]=G^{-1}\alpha G^{-1}\beta G^{-1}+G^{-1}\beta G^{-1}\alpha G^{-1}.$$

In consequence, we write Chebyshev’s method as

$$\left\{\begin{aligned}&G_0\ \text{given},\\&G_{n+1}=3G_n-3G_nHG_n+G_nHG_nHG_n,\quad n\geq 0.\end{aligned}\right.$$

Observe that Chebyshev's method does not use inverse operators for approximating an inverse operator, and its order of convergence is three.

Theorem 8

If ||I − HG0|| < 1, the Newton and Chebyshev iterative methods are convergent. Moreover, if HG0 = G0H, then limn→∞ Gn = H−1.
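A sketch of the inverse-free Chebyshev iteration in numpy; the symmetric positive definite test matrix and the choice G0 = I/||H||2, which commutes with H and satisfies ||I − HG0|| < 1, are our own:

```python
import numpy as np

def chebyshev_inverse(H, G0, tol=1e-12, max_iter=100):
    """Chebyshev iteration G <- 3G - 3GHG + GHGHG; no inverses are computed."""
    G = G0
    for _ in range(max_iter):
        GH = G @ H
        G_new = 3 * G - GH @ (3 * G - GH @ G)    # = 3G - 3GHG + GHGHG
        if np.linalg.norm(G_new - G) < tol:
            return G_new
        G = G_new
    return G

rng = np.random.default_rng(1)
M = rng.random((50, 50))
H = M @ M.T + 50 * np.eye(50)                    # symmetric positive definite
G0 = np.eye(50) / np.linalg.norm(H, 2)           # H G0 = G0 H, ||I - H G0|| < 1
G = chebyshev_inverse(H, G0)                     # G approximates H^{-1}
```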

On the other hand, if we look at Newton's and Chebyshev's methods, two well-known iterative methods, we conclude that they can approximate inverse operators without using any inverse operator in their application. Starting from this feature of both methods, in [5] we pay attention to the construction of iterative methods of any prefixed order of convergence. Moreover, the final formulation of the methods uses only powers of matrices that are close to the identity. This fact is crucial to avoid stability problems in the implementation of the methods. The best method of the family will depend on the particular problem to solve. In any case, we find iterative methods with better behavior than Newton's method. Finally, we would like to emphasize that in applications where we are interested in the computation of an inverse operator, our methods use matrix-matrix multiplications with a computational cost similar to that of Gauss-type methods. But if we are interested only in the application of the inverse operator, we will be able to implement our methods using only matrix-vector multiplications, so that we reduce the computational cost considerably.

The pth root of a matrix

If we consider the complex equation f(x) = xp − a with $a\in\mathbb{C}$, then $L_f(x)=\left(\frac{p-1}{p}\right)\frac{x^p-a}{x^p}$ and Chebyshev's method reduces to

$$x_0\in D,\qquad x_{n+1}=\frac{2p^2-3p+1}{2p^2}\,x_n+\frac{2p-1}{p^2}\,a\,x_n^{1-p}-\frac{p-1}{2p^2}\,a^2\,x_n^{1-2p},\quad n\geq 0.$$

If we extend Chebyshev’s method for the computation of the pth root of a matrix, we then consider the space

$$\Theta=\left\{A\in\mathbb{C}^{r\times r}\ \text{such that}\ A\ \text{has no nonpositive real eigenvalues}\right\}$$

and approximate $A^{\frac{1}{p}}$ for a given matrix A ∈ Θ. For this, we consider

$$\mathcal{F}:\Theta\rightarrow\mathbb{C}^{r\times r}\quad\text{and}\quad\mathcal{F}(X)=X^{p}-A,$$

so that $A^{\frac{1}{p}}$ is a solution of the equation $\mathcal{F}(X)=0$. In this setting, Chebyshev's method reduces to

$$X_0\in\Theta,\quad X_{n+1}=\frac{2p^2-3p+1}{2p^2}\,X_n+\frac{2p-1}{p^2}\,AX_n^{1-p}-\frac{p-1}{2p^2}\,A^2X_n^{1-2p},\quad n\geq 0.\tag{16}$$

Chebyshev's method, like Newton's method, cannot be used directly to approximate the principal pth root. Using the same idea as in [15], one can prove that it is not stable in a neighborhood of $A^{\frac{1}{p}}$: a small perturbation of the value of Xn is amplified in the following steps, and in finite arithmetic the algorithm diverges. This problem can be overcome using another algorithm which provides the same sequence but is stable in a neighborhood of $A^{\frac{1}{p}}$.

The following iteration, given by Denman and Beavers in [9],

$$\left\{\begin{aligned}&X_0=A,\quad Y_0=I,\\&X_{n+1}=\frac{1}{2}(X_n+Y_n^{-1}),\\&Y_{n+1}=\frac{1}{2}(Y_n+X_n^{-1}),\quad n\geq 0,\end{aligned}\right.$$

is a stable variant of Newton's method for the matrix square root.
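A sketch of the Denman–Beavers iteration in numpy (tolerance and iteration cap are our choices); Xn approaches A1/2 and Yn approaches A−1/2:

```python
import numpy as np

def denman_beavers_sqrt(A, tol=1e-12, max_iter=60):
    """Coupled, stable form of Newton's iteration for the matrix square root."""
    X, Y = A.copy(), np.eye(A.shape[0])
    for _ in range(max_iter):
        # Simultaneous update: both right-hand sides use the current X_n, Y_n.
        X, Y = 0.5 * (X + np.linalg.inv(Y)), 0.5 * (Y + np.linalg.inv(X))
        if np.linalg.norm(X @ X - A) <= tol * np.linalg.norm(A):
            break
    return X                                 # Y approximates A^{-1/2}
```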

The instability of the simplified Newton iterations $X_{n+1}=(X_n+AX_n^{-1})/2$ and $X_{n+1}=(X_n+X_n^{-1}A)/2$, shown by Higham [15] for the matrix square root, is mainly due to the one-sided multiplication by A. On the other hand, since Xn and A commute if AX0 = X0A, the iteration can be rewritten as

$$X_{n+1}=\frac{X_n+A^{\frac{1}{2}}X_n^{-1}A^{\frac{1}{2}}}{2},\quad n\geq 0,$$

and the iteration becomes stable in this form. However, it is useless as it stands, since it involves the square root of A, but it helps us to stabilize the iteration by introducing the variable $Y_n=A^{\frac{1}{2}}X_n^{-1}A^{\frac{1}{2}}=A^{-1}X_n=X_nA^{-1}$. The resulting iteration is that of Denman and Beavers, see [16].

Following these ideas, Iannazzo [17] proposes the following two stable versions of Newton's method for the matrix pth root:

$$X_{n+1}=\frac{(p-1)X_n+\left(A^{\frac{1}{p}}X_n^{-1}\right)^{p-1}A^{\frac{1}{p}}}{p},\quad n\geq 0,$$

and

$$\left\{\begin{aligned}&X_0=I,\quad N_0=A,\\&X_{n+1}=X_n\left(\frac{(p-1)I+N_n}{p}\right),\\&N_{n+1}=\left(\frac{(p-1)I+N_n}{p}\right)^{-p}N_n,\quad n\geq 0.\end{aligned}\right.$$

Observe that in the second case, the matrix A does not explicitly appear in the iteration.

Similarly, for Chebyshev’s method, we propose

$$X_{n+1}=\frac{\left(p-\frac{3}{2}+\frac{1}{2p}\right)X_n+\left(2-\frac{1}{p}\right)\left(A^{\frac{1}{p}}X_n^{-1}\right)^{p-1}A^{\frac{1}{p}}}{p}-\frac{(p-1)\left(A^{\frac{1}{p}}X_n^{-1}\right)^{2p-1}A^{\frac{1}{p}}}{2p^2},\quad n\geq 0,\tag{17}$$

and

$$\left\{\begin{aligned}&X_0=I,\quad N_0=A,\\&X_{n+1}=X_n\left(\frac{\left(p-\frac{3}{2}+\frac{1}{2p}\right)I+\left(2-\frac{1}{p}\right)N_n}{p}-\frac{(p-1)N_n^2}{2p^2}\right),\quad n\geq 0,\\&N_{n+1}=\left(\frac{\left(p-\frac{3}{2}+\frac{1}{2p}\right)I+\left(2-\frac{1}{p}\right)N_n}{p}-\frac{(p-1)N_n^2}{2p^2}\right)^{-p}N_n.\end{aligned}\right.\tag{18}$$

Note that (17) is useless, since it involves the matrix pth root of A.

Observe that we need to calculate $X_n^{-1}$ at each step of Chebyshev's method given by (16). We are interested in obtaining another expression of Chebyshev's method which avoids the computation of these inverses. So, if we consider the complex function $f(x)=\frac{1}{x^{p}}-\frac{1}{a}=0$, where $f:D\subseteq\mathbb{C}\rightarrow\mathbb{C}$, then Chebyshev's method reduces to

$$x_0\in D,\qquad x_{n+1}=x_n\left(1+\frac{1}{p}\left(1-\frac{x_n^p}{a}\right)+\frac{p+1}{2p^2}\left(1-\frac{x_n^p}{a}\right)^2\right),\quad n\geq 0.\tag{19}$$

As a consequence, the algorithm of Chebyshev's method for solving $\mathcal{F}(X)=0$, where $\mathcal{F}:\Theta\rightarrow\mathbb{C}^{r\times r}$, is then

$$X_0\in\Theta,\quad X_{n+1}=X_n\left(I+\frac{1}{p}\left(I-A^{-1}X_n^p\right)+\frac{p+1}{2p^2}\left(I-A^{-1}X_n^p\right)^2\right),\quad n\geq 0.\tag{20}$$

Observe that we only need to calculate A−1 once, and not $X_n^{-1}$ at each step, so that algorithm (20) is more efficient than algorithm (16).
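A sketch of algorithm (20) in numpy; A is inverted once, outside the loop, and the test matrix is our own choice, taken close to the identity so that ||R(X0)|| = ||I − A−1|| < 1 for X0 = I (which also guarantees AXn = XnA):

```python
import numpy as np

def chebyshev_pth_root(A, p, tol=1e-12, max_iter=100):
    """Chebyshev iteration (20): A is inverted once, X_n never is."""
    I = np.eye(A.shape[0])
    A_inv = np.linalg.inv(A)                 # the only inverse needed
    X = I.copy()                             # X0 = I, so A X_n = X_n A
    for _ in range(max_iter):
        R = I - A_inv @ np.linalg.matrix_power(X, p)     # residual R(X_n)
        X = X @ (I + R / p + (p + 1) / (2 * p**2) * (R @ R))
        if np.linalg.norm(R) < tol:
            break
    return X

rng = np.random.default_rng(2)
E = 0.1 * rng.random((20, 20))
A = np.eye(20) + E @ E.T                     # close to I: ||I - A^{-1}|| < 1
X = chebyshev_pth_root(A, p=5)               # X approximates A^{1/5}
print(np.linalg.norm(np.linalg.matrix_power(X, 5) - A))
```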

For Halley's method, a similar strategy to avoid the use of inverses is not possible. Remember that the algorithm of Halley's method,

$$x_0\in D,\qquad x_{n+1}=x_n-\left(\frac{1}{1+\frac{1}{2}L_f(x_n)}\right)\frac{f(x_n)}{f'(x_n)},\quad n\geq 0,$$

always involves inverses (the quotient includes $L_f(x_n)$). This is the main advantage of Chebyshev's method with respect to Halley's method.

Next, we analyze the convergence of Chebyshev's method. For this, we suppose AX0 = X0A and consider the following residual of algorithm (20):

$$R(X_n)=I-A^{-1}X_n^p,\quad n\geq 0.$$

To prove the convergence of algorithm (20), we consider a submultiplicative matrix norm || ⋅ || defined on $\mathbb{C}^{r\times r}$ and prove that {||R(Xn)||} is a decreasing scalar sequence converging to zero. First of all, from AX0 = X0A, it follows that AXn = XnA, $AX_n^{p}=X_n^{p}A$ and $A^{-1}X_n^{p}=X_n^{p}A^{-1}$. As a consequence, we have

$$R(X_n)=I-(I-R(X_{n-1}))\left(I+\frac{1}{p}R(X_{n-1})+\frac{p+1}{2p^2}R(X_{n-1})^2\right)^p.$$

Next, taking into account Villareal’s formula,

$$P(y)^p=\left(\sum_{i=0}^m a_{i}y^{i}\right)^p=\sum_{i=0}^{mp}P_{i}y^{i},$$

where $P_0=a_0^p$, $P_i=\sum_{j=0}^{i-1}P_j\,\frac{a_{i-j}}{a_0}\,\frac{(i-j)(p+1)-i}{i}$, for i = 1, 2, ..., mp, and ai = 0 for all i ≥ m + 1, it follows that

$$R(X_n)=I-(I-R(X_{n-1}))\sum_{i=0}^{2p}P_{i}R(X_{n-1})^{i},$$

where a0 = 1, $a_1=\frac{1}{p}$, $a_2=\frac{p+1}{2p^2}$, P0 = 1 and $P_i=\sum_{j=0}^{i-1}P_j\,\frac{a_{i-j}}{a_0}\,\frac{(i-j)(p+1)-i}{i}$, for i = 1, 2, ..., 2p, and ai = 0 for all i ≥ 3. In addition, we have P1 = P2 = 1 and P3 < 1.

Note that we can always obtain XnA = AXn: it is enough to choose $X_0=I\in\mathbb{C}^{r\times r}$, as Guo does in [11]. Now, we establish the semilocal convergence of Chebyshev's method in the following theorem.

Theorem 9

Let $A\in\mathbb{C}^{r\times r}$ and X0 ∈ Θ be such that AX0 = X0A and $\|R(X_0)\|=\|I-A^{-1}X_0^p\|<1$. Suppose that $\{P_i\}_{i=3}^{2p}$ is a nonincreasing sequence such that the Pi are nonnegative for all i = 2, 3, ..., 2p. Then, Chebyshev's method defined in (20) converges to $A^{\frac{1}{p}}$. Moreover, $\|R(X_n)\|\leq\|R(X_0)\|^{3^n}$, $n\in\mathbb{N}$.

In addition, Table 2 collects some representative values of p, which help to see that the conditions of Theorem 9 are satisfied.

Table 2. Sequence $\{P_i\}_{i=3}^{2p}$ for p = 2, 3, 5, 7

p   $\{P_i\}_{i=3}^{2p}$
2   0.375, 0.1406...
3   0.4814..., 0.2222..., 0.0493..., 0.0109...
5   0.56, 0.296, 0.1059..., 0.0355..., 0.0080..., 0.0017..., 2.0736... × 10−4, 2.4883... × 10−5
7   0.5918..., 0.3294..., 0.1345..., 0.0512..., 0.0151..., 0.0041..., 0.0008..., 0.0001..., 2.6281... × 10−5, 3.6251... × 10−6, 2.9592... × 10−7, 2.4157... × 10−8
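The rows of Table 2 can be reproduced directly from the recurrence; a short sketch in exact rational arithmetic:

```python
from fractions import Fraction

def P_sequence(p):
    """Coefficients of (1 + y/p + (p+1)y^2/(2p^2))^p via the recurrence above."""
    a = {0: Fraction(1), 1: Fraction(1, p), 2: Fraction(p + 1, 2 * p * p)}
    P = [Fraction(1)]                                    # P_0 = a_0^p = 1
    for i in range(1, 2 * p + 1):
        P.append(sum(P[j] * a.get(i - j, Fraction(0))
                     * Fraction((i - j) * (p + 1) - i, i) for j in range(i)))
    return P

for p in (2, 3, 5, 7):
    print(p, [float(v) for v in P_sequence(p)[3:]])      # rows of Table 2
```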

From Theorem 9, we also deduce that Chebyshev’s method has R-order of convergence at least three.

In [6], we present a family of high-order iterative methods including both the Newton and Chebyshev methods. We find algorithms in the family with better numerical behavior than the Newton and Halley methods, which are basically the iterative methods proposed in the literature to solve this problem.

Conclusions

In this paper, we have reviewed some important properties of the classical Chebyshev method. We have pointed out the possibility of proving its global convergence from its geometric interpretation. We have mentioned the richer dynamics of this scheme in comparison with Newton's method. Finally, we have presented several applications where this high-order method is a good alternative to Newton's method. These applications include the solution of quadratic equations and the approximation of the inverse of a matrix or of the pth root of a matrix.
