Eigenvector Computation with Spectral Transformation

Next: Software Availability Up: Implicitly Restarted Arnoldi Method Previous: Locking and Purging in Contents Index

Eigenvector Computation with Spectral Transformation

The SI was introduced in §3.3. Specifically, we use the matrix

$\begin{displaymath} C \equiv (A - \sigma I)^{-1} \end{displaymath}$

in place of

and rely on the fact that

$\begin{displaymath} C x = x \theta \iff A x = x \left(\sigma + \frac{1}{\theta}\right) \end{displaymath}$

to recover eigenvalues of the original matrix. The Arnoldi method is extremely effective when used in conjunction with SI. Convergence to eigenvectors corresponding to the eigenvalues of

that are nearest to $\sigma$ is generally very rapid. Hence, this approach should be used whenever it is possible to solve linear systems efficiently with $A-\sigma I$ as the coefficient matrix.

When a spectral transformation is used, additional considerations should be made with respect to stopping criteria to take advantage of the special nature of the transformed operator . Moreover, the quality of the approximate eigenvectors can be improved significantly with a minor amount of postprocessing. In order to compute eigenvalues of near to $\sigma$ we will compute the eigenvalues $\theta$ of that are of largest magnitude. It is not really necessary for $\sigma$ to be extremely close to a desired eigenvalue. However, for the following discussion it is worth keeping in mind that in practice, it is typical to take $\sigma$ near to desired eigenvalues, and hence it is typical for $\vert\theta\vert \gg 1$ to hold.

Let with $H_m y = y \theta$ , where $\Vert y \Vert _2=1$ . Since

$\begin{displaymath} (A-\sigma I)^{-1}x - x \theta = CV_my - V_mH_my = f_m \, e^\ast_my, \end{displaymath}$

(129)

it follows that $\Vert f_m\Vert\,\vert e^\ast_my\vert$ is the norm of the residual $\Vert Cx - x \theta \Vert$ . If $\Vert f_m\Vert\,\vert e^\ast_my\vert \leq \epsilon_U$ , where $\epsilon_U$ is a user-specified tolerance, then according to (7.17),

and $\theta$ are an exact eigenpair for a matrix near

. However, we are really interested in eigenvalue-eigenvector approximations for the original matrix

A simple rearrangement of (7.23) gives

$\begin{displaymath} Ax - x\left(\sigma + \frac{1}{\theta}\right) = -(A - \sigma I) f_m \frac{e^\ast_my}{\theta}, \end{displaymath}$

(130)

and this is the residual of the computed eigenpair $(x , \sigma + \frac{1}{\theta})$ for the matrix

. However, with a minor amount of additional arithmetic, the quality of the eigenvector can be improved and the corresponding residual norm can be reduced significantly.

Adding and subtracting $f_m \frac{e^\ast_my}{\theta^2}$ on the right-hand side of (7.24) and rearranging terms will result in

$\begin{displaymath} Az - z \left(\sigma + \frac{1}{\theta}\right) = -f_m \frac{e^\ast_my}{\theta^2} , \end{displaymath}$

(131)

where $z \equiv x + f_m ( e^\ast_m y/\theta)$ . In the case $\vert\theta\vert \gg 1$ , we see that

will be a much better approximate eigenvector than

. Moreover, if $\Vert A\Vert$ is large relative to $\vert\sigma\vert$ , then the potential magnification of error due to the term $(A - \sigma I) f_m$ is avoided.

A heuristic argument further supports the use of instead of . From (7.23) it follows that

$\begin{displaymath} (A - \sigma I)^{-1}x = x\theta + f_m e^\ast_m y, \end{displaymath}$

(132)

and hence $z = x + f_m ( e^\ast_m y/\theta)$ is the (normalized) vector that would result if we performed one step of inverse iteration with $A-\sigma I$ on

This vector has not yet been scaled to have unit norm. However, $\Vert z \Vert^2 = 1 + \left(\vert e^\ast_m y \vert\Vert f_m \Vert/\vert\theta\vert\right)^2$ , so the error bound will decrease after is normalized. Moreover, if ${\vert e^\ast_m y \vert\Vert f_m \Vert}/{\vert\theta\vert} < \sqrt{\epsilon_M}$ , then the floating point computation of the norm will already result in $\Vert z\Vert = 1$ without any rescaling.

From (7.23), the residual $C x - x \theta$ is orthogonal to the Krylov space spanned by the columns of . However, this Galerkin condition is lost upon transforming the computed eigenpair to the original system, regardless of whether we use or . This is because $\sigma +1/\theta$ is not the Rayleigh quotient associated with or . However, from (7.25)

$\begin{displaymath} \left\vert \frac{z^\ast A z}{ z^\ast z} - \left(\sigma +\fra... ...\vert e_m^* y \vert }{\vert\theta\vert\Vert z \Vert}\right)^2. \end{displaymath}$

(133)

In other words, the approximate eigenvalue $\sigma +1/\theta$ is nearly a Rayleigh quotient for

when the vector

is used as the approximate eigenvector. In fact, $\sigma +1/\theta$ will be within roundoff level of being a Rayleigh quotient when $\frac{\vert e^\ast_m y \vert\Vert f_m \Vert}{\vert\theta\vert} < \sqrt{\epsilon_M}$ .

On the other hand, from (7.24) we deduce that

$\begin{displaymath} \left\vert x^\ast A x - \left(\sigma + \frac{1}{\theta}\rig... ...\Vert f_m\Vert\vert e^\ast_my \vert}{\vert\theta\vert} \right) \end{displaymath}$

(134)

is the error in using $\sigma +1/\theta$ as a Rayleigh quotient for

when

is used.

Equations (7.25) and (7.27) imply that the vector is a better approximation than to the eigenvector associated with the approximate eigenvalue $\sigma +1/\theta$ provided that $\vert\theta\vert$ is greater than 1. Moreover, when $\vert\theta\vert \gg 1$ , only a moderately small Ritz estimate is needed to achieve an acceptably small direct residual and Rayleigh quotient error. If the $\sigma$ is near the desired eigenvalues, then these eigenvalues are mapped by to large eigenvalues and typically $\vert\theta\vert \gg 1$ .

The above analysis is based on that given by Ericsson and Ruhe [162] for the generalized symmetric definite eigenvalue problem.

Next: Software Availability Up: Implicitly Restarted Arnoldi Method Previous: Locking and Purging in Contents Index

Susan Blackford 2000-11-20