

Eigenvector Computation with Spectral Transformation

The shift-and-invert spectral transformation (SI) was introduced in §3.3. Specifically, we use the matrix

\begin{displaymath}
C \equiv (A - \sigma I)^{-1}
\end{displaymath}

in place of $A$ and rely on the fact that

\begin{displaymath}
C x = x \theta \iff A x = x \left(\sigma + \frac{1}{\theta}\right)
\end{displaymath}

to recover eigenvalues of the original matrix. The Arnoldi method is extremely effective when used in conjunction with SI. Convergence to eigenvectors corresponding to the eigenvalues of $A$ that are nearest to $\sigma$ is generally very rapid. Hence, this approach should be used whenever it is possible to solve linear systems efficiently with $A-\sigma I$ as the coefficient matrix.
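
To make the transformation concrete, here is a minimal Python sketch (dense linear algebra, an invented test matrix and shift, and simple power iteration in place of Arnoldi, purely for illustration; a production code would factor the shifted matrix once, sparsely, and use the full Arnoldi process):

\begin{verbatim}
import numpy as np

# Minimal dense sketch (invented test problem, not ARPACK): power
# iteration on C = (A - sigma I)^{-1} converges to the eigenvalue
# of A nearest the shift sigma.
rng = np.random.default_rng(0)
n = 100
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.arange(1.0, n + 1)) @ Q.T   # eigenvalues 1, 2, ..., n
sigma = 5.3                                    # nearest eigenvalue is 5

M = A - sigma * np.eye(n)        # in practice M is factored once and reused
v = rng.standard_normal(n)
for _ in range(40):
    v = np.linalg.solve(M, v)    # v <- C v, applied as a linear solve
    v = v / np.linalg.norm(v)

theta = v @ np.linalg.solve(M, v)   # dominant eigenvalue theta of C
print(sigma + 1.0 / theta)          # recovered eigenvalue of A: ~5.0
\end{verbatim}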

When a spectral transformation is used, the stopping criterion should be adapted to take advantage of the special nature of the transformed operator $C$. Moreover, the quality of the approximate eigenvectors can be improved significantly with a minor amount of postprocessing. To compute the eigenvalues of $A$ near $\sigma$, we compute the eigenvalues $\theta $ of $C$ that are of largest magnitude. It is not necessary for $\sigma$ to be extremely close to a desired eigenvalue. However, for the following discussion it is worth keeping in mind that in practice $\sigma$ is typically taken near the desired eigenvalues, and hence it is typical for $\vert\theta\vert \gg 1$ to hold.

Let $x = V_m y$ with $H_m y = y \theta$, where $\Vert y \Vert _2=1$ and where $V_m$, $H_m$, and $f_m$ are the quantities from the $m$-step Arnoldi factorization $C V_m = V_m H_m + f_m e^\ast_m$ built with the operator $C$. Since

\begin{displaymath}
(A-\sigma I)^{-1}x - x \theta = CV_my - V_mH_my = f_m \, e^\ast_my,
\end{displaymath} (7.23)

it follows that the residual norm satisfies $\Vert Cx - x \theta \Vert = \Vert f_m\Vert\,\vert e^\ast_my\vert$. If $\Vert f_m\Vert\,\vert e^\ast_my\vert \leq \epsilon_U$, where $\epsilon_U$ is a user-specified tolerance, then according to (7.17), $x$ and $\theta $ are an exact eigenpair for a matrix near $C$. However, we are really interested in eigenvalue-eigenvector approximations for the original matrix $A$.
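
As a concrete check, the following self-contained Python sketch (the test matrix, shift, and subspace dimension are invented for illustration) builds an $m$-step Arnoldi factorization of $C$ and verifies that $\Vert f_m\Vert\,\vert e^\ast_m y\vert$ matches the directly computed residual norm, so the stopping test costs no extra application of $C$:

\begin{verbatim}
import numpy as np

# Sketch: Arnoldi factorization  C V_m = V_m H_m + f_m e_m^*
# for C = (A - sigma I)^{-1}, with an invented test problem.
rng = np.random.default_rng(0)
n, m = 100, 8
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.arange(1.0, n + 1)) @ Q.T   # symmetric, eigenvalues 1..n
sigma = 5.3
M = A - sigma * np.eye(n)                      # factored once in practice
apply_C = lambda v: np.linalg.solve(M, v)      # C v via a linear solve

V = np.zeros((n, m + 1))
H = np.zeros((m + 1, m))
v0 = rng.standard_normal(n)
V[:, 0] = v0 / np.linalg.norm(v0)
for j in range(m):                             # modified Gram-Schmidt Arnoldi
    w = apply_C(V[:, j])
    for i in range(j + 1):
        H[i, j] = V[:, i] @ w
        w = w - H[i, j] * V[:, i]
    H[j + 1, j] = np.linalg.norm(w)
    V[:, j + 1] = w / H[j + 1, j]
f_m = V[:, m] * H[m, m - 1]                    # residual vector f_m

evals, Y = np.linalg.eig(H[:m, :m])
k = np.argmax(np.abs(evals))                   # Ritz value of largest modulus
theta = evals[k].real                          # real, since A is symmetric
y = Y[:, k].real
y = y / np.linalg.norm(y)
x = V[:, :m] @ y                               # Ritz vector for C

estimate = np.linalg.norm(f_m) * abs(y[-1])    # ||f_m|| |e_m^* y|
direct = np.linalg.norm(apply_C(x) - theta * x)
print(estimate, direct)                        # should agree to high accuracy
\end{verbatim}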

A simple rearrangement of (7.23) gives

\begin{displaymath}
Ax - x\left(\sigma + \frac{1}{\theta}\right) =
-(A - \sigma I) f_m \frac{e^\ast_my}{\theta},
\end{displaymath} (7.24)

and this is the residual of the computed eigenpair $(x , \sigma + \frac{1}{\theta})$ for the matrix $A$. However, with a minor amount of additional arithmetic, the quality of the eigenvector can be improved and the corresponding residual norm can be reduced significantly.

Adding and subtracting $f_m \frac{e^\ast_my}{\theta^2} $ on the right-hand side of (7.24) and rearranging terms will result in

\begin{displaymath}
Az - z \left(\sigma + \frac{1}{\theta}\right) =
-f_m \frac{e^\ast_my}{\theta^2} ,
\end{displaymath} (7.25)

where $z \equiv x + f_m ( e^\ast_m y/\theta)$. In the case $\vert\theta\vert \gg 1$, we see that $z$ will be a much better approximate eigenvector than $x$. Moreover, if $\Vert A\Vert$ is large relative to $\vert\sigma\vert$, then the potential magnification of error due to the term $(A - \sigma I) f_m$ is avoided.
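
The improvement is easy to observe numerically. The following self-contained Python sketch (same invented test problem as above) computes a Ritz pair of $C$, forms the purified vector $z$, and compares the direct residuals for $x$ and $z$:

\begin{verbatim}
import numpy as np

# Compare ||A x - lam x|| with ||A z - lam z|| for the purified
# vector z = x + f_m (e_m^* y / theta), lam = sigma + 1/theta.
rng = np.random.default_rng(0)
n, m = 100, 8
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.arange(1.0, n + 1)) @ Q.T
sigma = 5.3
M = A - sigma * np.eye(n)

V = np.zeros((n, m + 1))
H = np.zeros((m + 1, m))
v0 = rng.standard_normal(n)
V[:, 0] = v0 / np.linalg.norm(v0)
for j in range(m):                        # Arnoldi on C = (A - sigma I)^{-1}
    w = np.linalg.solve(M, V[:, j])
    for i in range(j + 1):
        H[i, j] = V[:, i] @ w
        w = w - H[i, j] * V[:, i]
    H[j + 1, j] = np.linalg.norm(w)
    V[:, j + 1] = w / H[j + 1, j]
f_m = V[:, m] * H[m, m - 1]

evals, Y = np.linalg.eig(H[:m, :m])
k = np.argmax(np.abs(evals))
theta = evals[k].real
y = Y[:, k].real
y = y / np.linalg.norm(y)
x = V[:, :m] @ y
lam = sigma + 1.0 / theta

z = x + f_m * (y[-1] / theta)             # purification step
z = z / np.linalg.norm(z)                 # ||z|| is already close to 1
print(np.linalg.norm(A @ x - lam * x))    # residual of x, cf. (7.24)
print(np.linalg.norm(A @ z - lam * z))    # typically much smaller, cf. (7.25)
\end{verbatim}

On a well-separated problem like this one, the purified residual is typically several orders of magnitude smaller, in line with the $\Vert A-\sigma I\Vert/\vert\theta\vert$ versus $1/\theta^2$ factors in (7.24) and (7.25).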

A heuristic argument further supports the use of $z$ instead of $x$. From (7.23) it follows that

\begin{displaymath}
(A - \sigma I)^{-1}x = x\theta + f_m e^\ast_m y,
\end{displaymath} (7.26)

and hence $z = x + f_m ( e^\ast_m y/\theta) $ is, up to normalization, the vector that would result if one step of inverse iteration with $A-\sigma I$ were performed on $x$.

This vector $z$ has not yet been scaled to have unit norm. However, $\Vert z \Vert^2 = 1 + \left(\vert e^\ast_m y \vert\Vert f_m \Vert/\vert\theta\vert\right)^2 $, so the error bound will decrease after $z$ is normalized. Moreover, if ${\vert e^\ast_m y \vert\Vert f_m \Vert}/{\vert\theta\vert} < \sqrt{\epsilon_M} $, then the floating point computation of the norm will already result in $\Vert z\Vert = 1$ without any rescaling.
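
A two-line Python check of this rounding claim (IEEE double precision assumed, with the quantity ${\vert e^\ast_m y \vert\Vert f_m \Vert}/{\vert\theta\vert}$ played by a value just below $\sqrt{\epsilon_M}$):

\begin{verbatim}
import numpy as np

delta = 0.5 * np.sqrt(np.finfo(float).eps)  # below the sqrt(eps_M) threshold
print(np.sqrt(1.0 + delta ** 2) == 1.0)     # True: fl(||z||) is exactly 1
\end{verbatim}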

From (7.23), the residual $C x - x \theta$ is orthogonal to the Krylov space spanned by the columns of $V_m$. However, this Galerkin condition is lost upon transforming the computed eigenpair to the original system, regardless of whether we use $x$ or $z$. This is because $\sigma +1/\theta$ is not the Rayleigh quotient associated with $x$ or $z$. However, from (7.25)

\begin{displaymath}
\left\vert \frac{z^\ast A z}{ z^\ast z} - \left(\sigma +\frac{1}{\theta}\right)\right\vert
= \frac{1}{\vert\theta\vert}
\left(\frac{\Vert f_m \Vert \, \vert e_m^\ast y \vert }{\vert\theta\vert \, \Vert z \Vert}\right)^2 .
\end{displaymath} (7.27)

In other words, the approximate eigenvalue $\sigma +1/\theta$ is nearly a Rayleigh quotient for $A$ when the vector $z$ is used as the approximate eigenvector. In fact, $\sigma +1/\theta$ will be within roundoff level of being a Rayleigh quotient when $\frac{\vert e^\ast_m y \vert\Vert f_m \Vert}{\vert\theta\vert} < \sqrt{\epsilon_M} $.
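
To see where (7.27) comes from, note that $f_m$ is orthogonal to the columns of $V_m$, so that $z^\ast f_m = \overline{(e^\ast_m y/\theta)}\,\Vert f_m\Vert^2$. Multiplying (7.25) on the left by $z^\ast/(z^\ast z)$ then gives

\begin{displaymath}
\frac{z^\ast A z}{z^\ast z} - \left(\sigma + \frac{1}{\theta}\right)
= -\frac{z^\ast f_m \, e^\ast_m y}{\theta^2 \, z^\ast z} ,
\end{displaymath}

and taking absolute values yields (7.27).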

On the other hand, from (7.24) we deduce that

\begin{displaymath}
\left\vert x^\ast A x - \left(\sigma + \frac{1}{\theta}\right)\right\vert
\leq \Vert A - \sigma I \Vert
\left( \frac{\Vert f_m\Vert \, \vert e^\ast_m y \vert}{\vert\theta\vert} \right)
\end{displaymath} (7.28)

bounds the error incurred in using $\sigma +1/\theta$ as a Rayleigh quotient for $A$ when $x$ is used.

Equations (7.25) and (7.27) imply that the vector $z$ is a better approximation than $x$ to the eigenvector associated with the approximate eigenvalue $\sigma +1/\theta$, provided that $\vert\theta\vert$ is greater than 1. Moreover, when $\vert\theta\vert \gg 1$, only a moderately small Ritz estimate is needed to achieve an acceptably small direct residual and Rayleigh quotient error. If $\sigma$ is near the desired eigenvalues, then these eigenvalues are mapped to eigenvalues of $C$ of large magnitude, and typically $\vert\theta\vert \gg 1$.

The above analysis is based on that given by Ericsson and Ruhe [162] for the generalized symmetric definite eigenvalue problem.

