Received 10 May 2016; accepted 18 July 2016; published 21 July 2016
1. Introduction and Background
An important problem that arises in different areas of science and engineering is that of computing limits of sequences of vectors 1, where, the dimension N being very large in many applications. Such vector sequences arise, for example, in the numerical solution of very large systems of linear or nonlinear equations by fixed-point iterative methods, and are simply the required solutions to these systems. One common source of such systems is the finite-difference or finite-element discretization of continuum problems.
In most cases of interest, however, the sequences converge to their limits extremely slowly. That is, to approximate, with a reasonable prescribed level of accuracy, by, we need to consider very large values of m. Since the vectors are normally computed in the order it is clear that we have to compute many such vectors until we reach one that has acceptable accuracy. Thus, this way of app- roximating via the becomes very expensive computationally.
Nevertheless, we may ask whether we can do something with those that are already available, to somehow obtain new approximations to that are better than each individual available. The answer to this question is in the affirmative for at least a large class of sequences that arise from fixed-point iteration of linear and nonlinear systems of equations. One practical way of achieving this is by applying to the sequence a suitable convergence acceleration method (or extrapolation method).
Of course, in case does not exist, it seems that no use could be made of the. Now, if the sequence is generated by an iterative solution of a linear or nonlinear system of equations, it can be thought of as “diverging from” the solution of this system. We call the antilimit of in such a case. It turns out that vector extrapolation methods can be applied to such divergent sequences to obtain good approximations to the relevant antilimits, at least in some cases.
Two different types of vector extrapolation methods exist in the literature:
1) Polynomial type methods: The minimal polynomial extrapolation (MPE) of Cabay and Jackson  , the reduced rank extrapolation (RRE) of Kaniel and Stein  , Eddy  , and Mešina  , and the modified minimal polynomial extrapolation (MMPE) of Brezinski  , Pugachev  and Sidi, Ford, and Smith  .
2) Epsilon algorithms: The scalar epsilon algorithm (SEA) of Wynn  (which is actually a recursive procedure for implementing the transformation of Shanks  ), the vector epsilon algorithm (VEA) of Wynn  , and the topological epsilon algorithm (TEA) of Brezinski  .
The paper by Smith, Ford, and Sidi  gives a review of all these methods (except MMPE) that covers the developments in vector extrapolation methods until the end of the 1970s. For up-to-date reviews of MPE and RRE, see Sidi  and  . Numerically stable algorithms for implementing MPE and RRE are given in Sidi  , these algorithms being also economical both computationally and storagewise. Jbilou and Sadok  have developed an analogous algorithm for MMPE along the lines suggested in Sidi, Ford, and Smith  and Sidi  . For the convergence properties and error analyses of MPE, RRE, MMPE, and TEA, as these are applied to vector sequences generated by fixed-point iterative methods from linear systems, see the works by Sidi  -  , Sidi, Ford, and Smith  , Sidi and Bridger  , and Sidi and Shapira   . VEA has been studied by Brezinski   , Gekeler  , Wynn   , and Graves-Morris   .
Vector extrapolation methods are used effectively in various branches of science and engineering in acceler- ating the convergence of iterative methods that result from large sparse systems of equations.
All of these methods have the useful feature that their only input is the vector sequence whose convergence is to be accelerated, nothing else being needed. In most applications, however, the polynomial type methods, especially MPE and RRE, have proved to be superior convergence accelerators; they require much less computation than, and half as much storage as, the epsilon algorithms for the same accuracy.
In this work, we develop yet another polynomial type method, which is based on the singular value decomposition (SVD), as well as some ideas that lead to MPE. We denote this new method by SVD-MPE. We also design a numerically stable algorithm for its implementation, whose computational cost and storage requirements are minimal. The new method is described in the next section. In Section 3, we show how the error in the approximation produced by SVD-MPE can be estimated at zero cost in terms of the quantities already used in the construction of the approximation. In Section 4, we give a very efficient algorithm for implementing SVD-MPE. In Section 5, we derive determinant representations for the approximations produced by SVD-MPE, while in Section 6, we show that this method is a Krylov subspace method when applied to vector sequences that result from the solution of linear systems via fixed-point iterative schemes. Finally, in Section 7, we illustrate its use with two numerical examples.
Before closing, we state the (reduced version of) the well known singular value decomposition (SVD) theorem. For different proofs, we refer the reader to Golub and Van Loan  , Horn and Johnson  , Stoer and Bulirsch  , and Trefethen and Bau  , for example.
Theorem 1.1 Let,. Then there exist unitary matrices, , and a diagonal matrix, with, such that
Furthermore, if and, then
In case, there holds, and the rest of the are zero.
Remark: The are called the singular values of and the and are called the corresponding right and left singular vectors of, respectively. We also have
2. Development of SVD-MPE
In what follows, we use boldface lower case letters for vectors and boldface upper case letters for matrices. In addition, we will be working with general inner products and the norms induced by them: These are defined as follows:
・ In, with hermitian positive definite,
・ In, with hermitian positive definite,
Of course, the standard Euclidean inner product and the norm induced by it are obtained by letting in (2.1) and in (2.2); we will denote these norms by (we will denote by the identity matrix in every dimension).
2.1. Summary of MPE
We begin with a brief summary of minimal polynomial extrapolation (MPE). We use the ideas that follow to develop our new method.
Given the vector sequence in, we define
and, for some fixed n, define the matrices via
Clearly, there is an integer, such that the matrices, , are of full rank, but is not; that is,
(Of course, this is the same as saying that is a linearly independent set, but is not.) Following this, we pick a positive integer and let the vector be the solution to
This minimization problem can also be expressed as in
and, as is easily seen, is the standard least-squares solution to the linear system, which, when, is overdetermined, and generally inconsistent. With determined, set, and compute the scalars via
provided. Note that
as the approximation to, whether is the limit or antilimit of.
What we have so far is only the definition (or the theoretical development) of MPE as a method. It should not be taken as an efficient computational procedure (algorithm), however. For this topic, see  , where numerically stable and computationally and storagewise economical algorithms for MPE and RRE are designed for the case in which. A well documented FORTRAN 77 code for implementing MPE and RRE in a unified manner is also provided in  , Appendix B.
2.2. Development of SVD-MPE
We start by observing that the unconstrained minimization problem for MPE given in (2.7) can also be expressed as a superficially “constrained” minimization problem as in
For the SVD-MPE method, we replace this “constrained” minimization problem by the following actual constrained minimization problem:
With determined, we again compute via
provided, noting again that
Finally, we set
as the SVD-MPE approximation to, whether is the limit or antilimit of.
Of course, the minimization problem in (2.12) has a solution for. Let for this (optimal). Lemma 2.1 that follows next gives a complete characterization of and the (optimal).
Lemma 2.1 Let be the singular values of the matrix
ordered as in
and let be the corresponding right singular vectors of, that is,
Assuming that, the smallest singular value of, is simple, the (optimal) solution to the minimi- zation problem in (2.12) is unique (up to a multiplicative constant,), and is given as in
Proof. The proof is achieved by observing that, with,
so that the problem in (2.12) becomes
The (optimal) solution to this problem is and. We leave the details to the reader. □
In view of the nature of the solution for the (optimal) involving singular values and vectors, as described in Lemma 2.1, we call this new method SVD-MPE.
1) Recall that there exists a positive integer, such that, for, but
. Therefore, we have for all.
2) Of course, exists if and only if the (optimal) satisfies. In addition, by (2.13), the are unique when is simple.
Before we go on to the development of our algorithm in the next section, we state the following result concerning the finite termination property of SVD-MPE, whose proof is very similar to that pertaining to MPE and RRE given in  :
Theorem 2.2 Let be the solution to the nonsingular linear system, and let be the se- quence obtained via the fixed-point iterative scheme, with chosen arbitrarily. If k is the degree of the minimal polynomial of with respect to (equivalently, with respect to)2, then produced by SVD-MPE satisfies.
3. Error Estimation
We now turn to the problem of estimating at zero cost the error, whether is the limit or antilimit of. Here we assume that is the solution to the system of equations
and that the vector sequence is obtained via the fixed-point iterative scheme
being the initial approximation to the solution.
Now, if is some approximation to, then a good measure of the error in is the residual vector of, namely,
This is justified since. We consider two cases:
1) is linear; that is, , where and is nonsingular.
In this case, we have
and, therefore, by, satisfies
2) is nonlinear.
In this case, assuming that and expanding about, we have
where is the Jacobian matrix of the vector-valued function evaluated at. Recalling that, we rewrite this in the form
from which, we conclude that the vectors and satisfy the approximate equality
That is, for all large m, the sequence behaves as if it were being generated by an N-dimensional approximate linear system of the form through
where and In view of what we already know about for linear systems [from (3.1)], for nonlinear systems, close to convergence, we have
Remark: That the vector is the exact residual vector for from linear systems and a true app- roximate residual vector for from nonlinear systems was proved originally in Sidi  . was adopted in a subsequent paper by Jbilou and Sadok  and named the “generalized residual.” Despite sounding interesting, this name has no meaning and is also misleading. By expanding about the solution and retaining first order terms only, it can be shown that is actually a genuine approximation to the true residual vector when is nonlinear. There is nothing “generalized” about it.
Now, we can compute at no cost in terms of the quantities that result from our algorithm, without having to actually compute itself. Indeed, we have the following result on, which can be incor- porated in the algorithm for SVD-MPE that we discuss in the next section:
Theorem 3.1 Let be the smallest singular value of and let be the corresponding right singular vector. Then the vector resulting from satisfies
Proof. First, the solution to (2.12) is by (2.19). Next, letting, we have by (2.13). Consequently,
Thus, by Lemma 2.1, we have
which is the required result. □
4. Algorithm for SVD-MPE
We now turn to the design of a good algorithm for constructing numerically the approximation that results from SVD-MPE. We note that matrix computations in floating-point arithmetic must be done with care, and this is what we would like to achieve here.
In this section, we let and for simplicity. Thus,. Since there is no room for confu- sion, we will also use, , and to denote, , and, respectively.
As we have seen in Section 2, to determine, we need, the right singular vector of corresponding to its smallest singular value. Now, and can be obtained from the singular value decomposition (SVD) of. Of course, the SVD of can be computed by applying directly to the algori- thm of Golub and Kahan  , for example. Here we choose to apply SVD to in an indirect way, which will result in a very efficient algorithm for SVD-MPE that is economical both computationally and storagewise in an optimal way. Here are the details of the computation of the SVD of, assuming that:
1) We first compute the QR factorization of in the form
where is unitary (that is,) and is upper triangular with positive diagonal elements, that is,
Of course, we can carry out the QR factorizations in different ways. Here we do this by the modified Gram- Schmidt process (MGS) in a numerically stable way as follows:
1. Compute and.
2. For do
end do (i)
end do (j)
Note that the matrices and are obtained from and, respectively, as follows:
For MGS, see  , for example.
2) We next compute the SVD of: By Theorem 1.1, there exist unitary matrices,
and a square diagonal matrix,
In addition, since is nonsingular by our assumption that, we have that for all i. Consequently,.
3) Substituting (4.7) in (4.1), we obtain the following true singular value decomposition of:
Thus, , the singular values of, are also the singular values of, and, the corresponding right singular vectors of, are also the corresponding right singular vectors of. (Of course, the are corres- ponding left singular vectors of. Note that, unlike, , and, which we must compute for our alg- orithm, we do not need to actually compute because we do not need in our computations. The mere knowledge that the SVD of is as given in (4.8) suffices to conclude that is the required optimal solution to (2.12); we continue with the development of our algorithm from this point.)
Remark: Computing the SVD of a matrix by first forming its QR factorization, next com- puting the SVD of as, and finally setting, with was first suggested by Chan  .
With already determined, we next compute the as in (2.13); that is,
Next, by the fact that
and by (2.14), we can re-express in (2.15) in the form
where the are computed from the as in
Making use of the fact that, with
where the and the are exactly those that feature in (4.2) and (4.3), we next rewrite (4.10) as in
Thus, the computation of can be carried out economically as in
Of course, is best computed as a linear combination of the columns of, hence (4.14) is computed as in
It is clear that, for the computation in (4.14) and (4.15), we need to save both and.
This completes the design of our algorithm for implementing SVD-MPE. For convenience, we provide a systematic description of this algorithm in Table 1, where we also include the computation of the norm of
the exact or approximate residual vector, namely, , which is given in Theorem 3.1.
Note that the input vectors, need not be saved; actually, they are overwritten by, as the latter are being computed. As is clear from the description of MGS given above, we can overwrite the matrix simultaneously with the computation of and, the vector overwriting as soon as it is computed, that is, at any stage of the QR factorization, we store N-dimensional vectors in the memory. Since in our applications, the storage requirement of the matrix is negligible. So is the cost of computing the SVD of, and so is the cost of computing the -dimensional vector. Thus, for all practical purposes, the computational and storage requirements of SVD-MPE are the same as those of MPE.
Remark: If we were to compute the SVD of, namely, , directly- and not by 1) first carrying out the QR factorization of as, and 2) next computing the SVD of as, and 3) noting that without actually computing―then we would need to waste extra resources in carrying out the computation of This direct strategy will have either of the following consequences:
1) If we have storage limitations, then we would have to overwrite with. (Recall that both of these matrices are and hence they are large.) As a result, we would have to compute the, a second time in order to compute.
2) If we do not want to compute the vectors, a second time, then we would have to save, while computing the matrix in its singular value decomposition. Thus, we would need to save two matrices, namely, and in the core memory simultaneously. Clearly, this limits the size of k, the order of extrapolation hence the rate of acceleration, severely.
Clearly, the indirect approach we have taken here for carrying out the singular value decomposition of
Table 1. Algorithm for SVD-MPE.
enables us to save extra computing and storage very conveniently when N is very large.
5. Determinant Representations for SVD-MPE
In  and  , determinant representations were derived for the vectors that are produced by the vector extrapolation methods MPE, RRE, MMPE, and TEA. These representations have turned out to be very useful in the analysis of the algebraic and analytic properties of these methods. In particular, they were used for obtaining interesting recursion relations among the and in proving sharp convergence and stability theorems for them. We now derive two analogous determinant representations for produced by SVD-MPE.
The following lemma, whose proof can be found in  , Section 3, will be used in this derivation in Theorem 5.2.
Lemma 5.1 Let and be scalars and let the satisfy the linear system
Then, whether are scalars or vectors, there holds
provided. In case the are vectors, the determinant is defined via its ex- pansion with respect to its first row.
For convenience of notation, we will write
Then is the principal submatrix of obtained by deleting the last row and the last column of. In addition, is hermitian positive definite just like.
Theorem 5.2 that follows gives our first determinant representation for resulting from SVD-MPE and is based only on the smallest singular value of and the corresponding right singular vector.
Theorem 5.2 Define
and assume that
Then, provided, exists and has the determinant representation
where is the determinant defined as in (5.3) in Lemma 5.1 with the as in (5.4).
Proof. With as in (2.16), we start by rewriting (2.18) in the form
Invoking here, which follows from (2.19), and multiplying the resulting equality on the left by, we obtain
Dividing both sides of this equality by, and invoking (2.13), we have
which, by the fact that
is the same as
where we have invoked (5.4). We will be able to apply Lemma 5.1 to prove the validity of (5.6) if we show that, in (5.10), the equations with are linearly independent, or, equivalently, the first k rows of the matrix
are linearly independent. By the fact that
Invoking, we obtain
Since is the smallest eigenvalue of and since, it turns out that is positive definite, which guarantees that the first k rows of are linearly independent. This completes the proof. □
Remark: We note that the condition that in Theorem 5.2 is equivalent to the condition that
, which we have already met in Section 2.
The determinant representation given in Theorem 5.3 that follows is based on the complete singular value decomposition of, hence is different from that given in Theorem 5.2. Since there is no room for confusion, we will denote the singular values and right and left singular vectors and of by, and respectively.
Theorem 5.3 Let be as in (2.16), and let
be the singular value decomposition of; that is,
Then, has the determinant representation
where is the determinant defined as in (5.3) in Lemma 5.1 with the as in (5.11).
Proof. By Theorem 1.1,
By (2.16) and by the fact that, which follows from (2.19), and by the fact that, , which follows from (2.13), and by (5.14), we then have
But, by (5.11), (5.15) is the same as
Therefore, Lemma 5.1 applies with as in (5.11), and the result follows. □
6. SVD-MPE as a Krylov Subspace Method
In Sidi  , we discussed the connection of the extrapolation methods MPE, RRE, and TEA with Krylov subspace methods for linear systems. We now want to extend the treatment of  to SVD-MPE. Here we recall that a Krylov subspace method is also a projection method and that a projection method is defined uniquely by its right and left subspaces4. In the next theorem, we show that SVD-MPE is a bona fide Krylov subspace method and we identify its right and left subspaces.
Since there is no room for confusion, we will use the notation of Theorem 5.3.
Theorem 6.1 Let be the unique solution to the linear system which we express in the form
and let the vector sequence be produced by the fixed-point iterative scheme
Define the residual vector of via Let also be the approximation to prod- uced by SVD-MPE. Then the following are true:
1) is of the form
2) The residual vector of, namely, , is orthogonal to the subspace
Consequently, SVD-MPE is a Krylov subspace method for the linear system, with the Krylov subspace as its right subspace and
as its left subspace.
Proof. With the generated as above, we have
Upon substituting in this equality, we obtain (6.1).
To prove (6.2), we first recall that by (3.1). By this and by (5.15), the result in (6.2) follows. □
Remark: We recall that (see  ), when applied to linearly generated sequences as in Theorem 6.1, MPE, RRE, and TEA are mathematically equivalent to, respectively, the full orthogonalization method (FOM) of Arnoldi  , the generalized minimum residual method (GMR), the best implementation of it being GMRES by Saad and Schultz  , and the method of Lanczos  . In all these methods the right subspace is the k-dimensional Krylov subspace. The left subspaces are also Krylov subspaces, with 1) for FOM, 2) for GMR, and 3) for the method of Lanczos. One important point about the left subspaces of these three methods is that they expand as k increases, that is, the left subspace of dimension contains the left subspace of dimension k. As for SVD-MPE, its right subspace is also the k-dimensional Krylov subspace, which makes SVD-MPE a bona fide Krylov subspace method, and the left subspace is. Thus, the k-dimensional left subspace is not a Krylov subspace, since it is not contained in the left subspace of dimension, as the left singular vectors of are different from the left singular vectors of.
7. Numerical Examples
We now provide two examples that show the performance of SVD-MPE and compare SVD-MPE with MPE. In both examples, SVD-MPE and MPE are implemented with the standard Euclidean inner product and the norm induced by it. Thus, and throughout.
As we have already mentioned, a major application area of vector extrapolation methods is that of numerical solution of large systems of linear or nonlinear equations by fixed-point iterations. [Here is a possibly preconditioned form of.] For SVD-MPE, as well as all other poly- nomial methods discussed in the literature, the computation of the approximation to, the solution of, requires of the vectors to be stored in the computer memory. For systems of very large dimension N, this means that we should keep k at a moderate size. In view of this limitation, a practical strategy for systems of equations is cycling, for which n and k are fixed. Here are the steps of cycling:
C0) Choose integers, and and an initial vector.
C1) Compute the vectors [via].
C2) Apply SVD-MPE (or MPE) to the vectors, with end result.
C3) If satisfies accuracy test, stop.
Otherwise, set, and go to Step C1.
We will call each application of steps C1 - C3 a cycle, and denote by the that is computed in the rth cycle. We will also denote the initial vector in step C0 by. Numerical examples suggest that the se-
quence has very good convergence properties. (A detailed study of errors and convergence properties for MPE and RRE in the cycling mode is given in  and  .)
Note that the remark at the end of Section 4 is relevant to the implementation of SVD-MPE in the cycling mode when N, the dimension of the, is very large and storage is limited and, therefore, the size of k is limited as well.
Example 7.1 Consider the vector sequence obtained from, where
and is symmetric with respect to both main diagonals, and and is hermitian. Therefore, is diago- nalizable with real eigenvalues. The vector is such that the exact solution to is. We have, so that converges to.
Figure 1 shows the norms of the errors in, with fixed. Here. Note that all of the approximations make use of the same (infinite) vector sequence, and, practically speaking, we are looking at how the methods behave as. It is interesting to see that SVD-MPE and MPE behave almost the same. Although we have a rigorous asymptotic theory confirming the behavior of MPE in this mode as observed in Figure 1 (see   and  ), we do not have any such theory for SVD-MPE at the present5.
Figure 2 shows the norm of the error in in the cycling mode with and. Now, a relatively large dimension.
Example 7.2 We now apply SVD-MPE and MPE to the nonlinear system that arises from finite-difference approximation of the two-dimensional convection-diffusion equation considered in Kelley (  , pp. 108-109), namely,
where satisfies homogeneous boundary conditions. is constructed by setting in the differential equation and by taking
as the exact solution.
The equation is discretized on a square grid by approximating, , , and by centered differe- nces with truncation errors. Thus, letting, and
Figure 1.norm of error in, with, from MPE and SVD-MPE, for Example 7.1 with.
Figure 2. norm of error in in the cycling mode, from SVD-MPE, for Example 7.1 with.
we replace the differential equation by the finite difference equations
Here is the approximation to, as usual.
We first write the finite difference equations in a way that is analogous to the PDE written in the form
and split the matrix representing to enable the use of the Jacobi and Gauss-Seidel methods as the iterative procedures to generate the sequences.
Figure 3 and Figure 4 show the norms of the errors in from SVD-MPE and MPE in the cycling mode with and, the iterative procedures being, respectively, that of Jacobi and that of Gauss- Seidel for the linear part of the PDE. Here, so that the number of unknowns (the dimension) is. Note that the convergence of cycling is much faster with Gauss-Seidel iteration than with
Figure 3. norm of error in in the cycling mode, from MPE and SVD-MPE, for Example 7.2 with hence. The underlying iteration method is that of Jacobi.
Figure 4. norm of error in in the cycling mode, from MPE and SVD-MPE, for Example 7.2 with hence. The underlying iteration method is that of Gauss- Seidel.
Jacobi iteration6. For both cycling computations, SVD-MPE and MPE seem to perform very similarly in this example.
Note that the Jacobi and Gauss-Seidel iterations converge extremely slowly. In view of this slow conve- rgence, the acceleration produced by SVD-MPE and MPE in the cycling mode is remarkable.
The author would like to thank Boaz Ophir for carrying out the computations reported in Section 7 of this work.