

Introduction

The term preconditioning generally refers to an extra step in a numerical method designed to accelerate convergence. Preconditioning is an important technique in iterative methods for solving large systems of linear equations. Preconditioning for eigenvalue problems is not as straightforward and has been slower to catch on. There is a large body of literature in Russian, some of which is discussed in §11.3. Ruhe and Wiberg [380,373,374] used SOR and the conjugate gradient method to find eigenvalues in the 1970s. Around the same time, quantum chemists, including Davidson [99], developed correction methods for their large symmetric matrices. Later, Scott recognized that Davidson's approach is a form of preconditioning, and generalizations were given in [335]. A more recent development, the Jacobi-Davidson method, has been presented in detail in earlier chapters.

In this introduction we attempt to point out some connections: first with some comparisons between preconditioning for eigenvalue problems and for linear equations, then with a brief discussion of preconditioned versus unpreconditioned methods, and finally with connections between the various preconditioned eigensolvers.

We first look at how the task of preconditioning compares for eigenvalue problems versus linear equations and give several reasons why preconditioning eigenvalue problems is more difficult. The matrix being preconditioned is generally nearly singular. Interior eigenvalues are difficult to compute, much as indefinite systems of linear equations are difficult to precondition. Unlike preconditioned methods for linear equations, preconditioned eigenvalue methods may need some sort of restarting, even in the symmetric case; however, restarting is often necessary even for unpreconditioned Krylov eigenvalue methods that compute eigenvectors. Finally, eigenvalue preconditioning methods tend to focus on computing one eigenvalue at a time, unlike Krylov eigenvalue methods; there is no comparable added difficulty in going from unpreconditioned to preconditioned methods for linear equations.

In spite of these difficulties, which no doubt contributed to the slow development of preconditioning for eigenvalue problems, there is much potential in situations where a good preconditioner can be found. A number of factors should go into the decision as to whether a preconditioning method is best. These include the effectiveness of the preconditioner, the number of eigenvalues desired, whether eigenvectors are needed, and the competing methods in the particular situation. Generally, preconditioning is more competitive if not too many eigenvalues are desired and if computing the eigenvectors is required. We also mention that finding eigenvalues far in the interior of the spectrum is so difficult for Krylov methods that either preconditioning or complete factorization of a shifted matrix must be considered. Mesh eigenproblems (see §11.3.1) are one important practical class of eigenproblems for which preconditioning seems particularly promising.

The rest of this introduction looks at similarities between the various preconditioning methods that will be described in this chapter. It was already mentioned that preconditioning methods generally find just one eigenvalue at a time. This is quite different from an unpreconditioned Krylov subspace method, which can find several eigenvalues simultaneously. Sometimes a block approach is used to alleviate this problem for a preconditioning method; e.g., see §11.3.9. Another similarity among preconditioning methods is that global convergence may be harder to achieve. Some methods, e.g., the locally optimal preconditioned conjugate gradient method of §11.3.8, are very robust in practice and typically converge from a random initial guess in numerical experiments. However, other methods, e.g., those based on inner-outer iterations, work better if reasonable approximations to eigenvectors are available as starting vectors. This leads us to the topic of hybrid methods, with a Krylov approach to start and a preconditioning approach to follow. In fact, a number of operators could be used for the Krylov method, including $A$, $M^{-1}$ (where $M$ is an approximation to a shifted matrix $A-\nu I$), and even the preconditioned operator $M^{-1}(A-\nu I)$, for $\nu$ and $M$ fixed. Hybrid methods could deliver more robustness, and they deserve further examination.

We look at one more similarity among eigenvalue preconditioning methods. For ease of presentation, we assume a standard symmetric eigenvalue problem. It appears that common to all methods is the operator

\begin{displaymath}
M^{-1}(A-\nu I),
\end{displaymath} (11.1)

where $\nu$ is an approximate eigenvalue and $M$ is a (possibly changing) approximation to $A-\nu I$. We will look at some examples of how different methods use this operator here; for a general discussion see §11.3.2. The four methods discussed are the preconditioned power method, the Rayleigh quotient iteration (RQI), the Davidson method, and the Jacobi-Davidson method. The preconditioned power method is Algorithm 11.5 in §11.3. With $B=I$, steps 4 and 5 of this algorithm perform

\begin{displaymath}
w = T(x^{(i)}-\mu^{(i)}Ax^{(i)}) =
-\mu^{(i)} T\left(A - \frac{1}{\mu^{(i)}} I\right) x^{(i)}.
\end{displaymath}

Letting $T=M^{-1}$ and $1/\mu^{(i)} = \nu$, we recognize the operator (11.1). The other algorithms in §11.3 use similar operators.
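
To make this concrete, here is a minimal sketch in Python/NumPy of one such step. It is only an illustration: the routine apply_T stands for the action of $T=M^{-1}$, $\mu^{(i)}$ is taken as the reciprocal Rayleigh quotient $(x,x)/(x,Ax)$, and the simple additive update is assumed; Algorithm 11.5 specifies the exact choices.

\begin{verbatim}
import numpy as np

def preconditioned_power_step(A, apply_T, x):
    # One illustrative step: w = T(x - mu*A*x) = -mu * T(A - (1/mu) I) x.
    Ax = A @ x
    mu = (x @ x) / (x @ Ax)       # reciprocal Rayleigh quotient (assumed choice)
    w = apply_T(x - mu * Ax)      # apply the preconditioner T = M^{-1}
    x_new = x + w                 # simple additive update (illustrative)
    return x_new / np.linalg.norm(x_new)
\end{verbatim}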

Next, the RQI with the preconditioned conjugate gradient method (PCG) of §11.3.7 has the operator $M^{-1}(A-\rho I)$ in the inner PCG iteration, where $\rho$ is the Rayleigh quotient and $M$ is the preconditioner. Davidson's original method, Algorithm 11.2 of §11.2.4 (see also §11.3.6) for standard eigenvalue problems, uses $(D-\nu I)^{-1}(A-\nu I)$, where $D$ is the diagonal of $A$ and $\nu$ is the current approximate eigenvalue found by the Rayleigh-Ritz procedure. Again this fits the form of (11.1). So if the same preconditioner is used, RQI (with PCG) and the Davidson method have the same operator at their core. The differences are that the Davidson method changes $\nu$ at every step while RQI keeps $\rho$ fixed during a run of PCG, and that the Davidson method applies the vectors it generates directly to the eigenvalue problem, while RQI uses them to solve linear equations; the linear equations in RQI are only an intermediate step toward the eventual goal of solving the eigenvalue problem.
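
As an illustration of how Davidson's method uses the operator $(D-\nu I)^{-1}(A-\nu I)$, the following sketch (Python/NumPy, real symmetric $A$, smallest eigenvalue) builds the subspace from diagonally preconditioned residuals. It is a bare-bones sketch, not Algorithm 11.2 itself; the name, tolerances, guard against small denominators, and absence of restarting are assumptions made for brevity.

\begin{verbatim}
import numpy as np

def davidson_smallest(A, max_iter=50, tol=1e-8):
    # Sketch of Davidson's method with the diagonal preconditioner (D - nu I)^{-1}.
    n = A.shape[0]
    D = np.diag(A)
    v = np.random.rand(n)
    V = (v / np.linalg.norm(v)).reshape(n, 1)
    for _ in range(max_iter):
        H = V.T @ (A @ V)                     # Rayleigh-Ritz projection
        vals, Y = np.linalg.eigh(H)
        nu, y = vals[0], Y[:, 0]              # smallest Ritz pair
        z = V @ y                             # Ritz vector
        r = A @ z - nu * z                    # eigenvalue residual (A - nu I) z
        if np.linalg.norm(r) < tol:
            break
        denom = D - nu
        denom[np.abs(denom) < 1e-12] = 1e-12  # guard against division by zero
        t = r / denom                         # correction (D - nu I)^{-1} (A - nu I) z
        t -= V @ (V.T @ t)                    # orthogonalize against the current basis
        nrm = np.linalg.norm(t)
        if nrm < 1e-12:
            break
        V = np.hstack([V, (t / nrm).reshape(n, 1)])
    return nu, z
\end{verbatim}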

Finally, we look at the Jacobi-Davidson method; see §11.2.5 and §11.3.7. What is interesting is that the operator (11.1) shows up twice. First, in the outer iteration, from the Jacobi-Davidson correction equation (4.49), letting $M = (I-zz^*)(A-\theta I)(I-zz^*)$, we have $t \approx -M^{-1}r = -M^{-1}(A-\theta I)z$. Second, and more importantly, in the inner iteration that generates the approximate solution to $t = -M^{-1}r$ with a preconditioned Krylov subspace iterative method, the operator is $P^{-1}(I-zz^*)(A-\theta I)(I-zz^*)$, where $P$ is the preconditioner for $(I-zz^*)(A-\theta I)(I-zz^*)$. This operator has the form of (11.1), with an added deflation against $z$. The subspace generated by this preconditioned operator is used for the solution of linear equations.
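
The following sketch (Python/SciPy, real arithmetic assumed) wraps this preconditioned, deflated operator as a LinearOperator so it can be handed to a Krylov solver such as GMRES. The names apply_Pinv, z, and theta are placeholders, and the commented-out inner solve is only indicative; practical Jacobi-Davidson codes typically apply a projected preconditioner in the correction equation (see §11.2.5).

\begin{verbatim}
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def jd_inner_operator(A, apply_Pinv, z, theta):
    # Applies P^{-1}(I - zz*)(A - theta I)(I - zz*) to a vector, i.e. the
    # preconditioned, deflated operator of the inner Krylov iteration.
    n = A.shape[0]
    def matvec(v):
        v = v - z * (z @ v)      # project out the Ritz vector z
        v = A @ v - theta * v    # apply A - theta I
        v = v - z * (z @ v)      # project again
        return apply_Pinv(v)     # apply the preconditioner P^{-1}
    return LinearOperator((n, n), matvec=matvec)

# Indicative inner solve of the correction equation (a few GMRES steps):
# op = jd_inner_operator(A, apply_Pinv, z, theta)
# rhs = apply_Pinv(-(r - z * (z @ r)))   # left-preconditioned, projected -r
# t, info = gmres(op, rhs, maxiter=10)
\end{verbatim}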

With many eigenvalue preconditioning methods using closely related operators at their core, results with the same preconditioner might be expected to be similar. This is not the case. The implementation particular to each method can make a big difference, both in convergence rate and in expense. For example, it will be discussed in the next section that inexact Cayley transforms (and the Jacobi-Davidson method) of §11.2.5 are much better than inexact shift-and-invert Krylov methods, even though they are similar. This is due partly to the fact that the $M^{-1}(A-\nu I)$ operator in the shift-and-invert method has $\nu$ fixed, instead of approaching an eigenvalue ($\nu$ is changed in the inexact rational Krylov method, also mentioned in the next section). All the preconditioning methods do share one limitation: they can only be as effective as $M^{-1}(A-\nu I)$ allows them to be, and this depends on how good a preconditioner $M$ is.

The Davidson method is perhaps the purest preconditioning method in the sense that with every application of the operator it uses the best information known (new approximations to the desired eigenpair), and it applies the vectors it generates directly to an eigenvalue problem. On the other hand, there is significant expense at every iteration. For some other methods, including RQI and Jacobi-Davidson, the inner iteration of a preconditioned iterative method for linear equations can be more efficient in both expense and storage. Jacobi-Davidson has this efficiency and also greater robustness than RQI, because it uses a subspace for the eigenvalue problem instead of a single vector, and because it uses the residual of an approximate eigenvector as the right-hand side of the linear equations, instead of the vector itself.

Finally, the simultaneous PCG method, Algorithm 11.11 of §11.3.9, the most promising method of §11.3, has the fast convergence of the Davidson method without its significant cost and applies PCG directly to the eigenvalue problem, thus removing any need for inner iterations. Numerical experiments show great practical robustness of the method with respect to the initial approximations, the particular choice of the preconditioner, and the condition number of the matrix $A$.

The two sections that follow both look at details of preconditioning eigenvalue problems, but with somewhat different insights.

