next up previous contents index
Next: Storage Requirements and Floating Up: Block Lanczos Methods   Previous: Basic Algorithm   Contents   Index

An Adaptively Blocked Lanczos Method

Algorithm 7.14 breaks down if $P^{\ast}_{j+1} Q_{j+1}$ is singular. Moreover, the computed Ritz triplets can converge slowly due to either the loss of biorthogonality among the computed Lanczos vectors $Q_{[j]}$ and $P_{[j]}$ or a block size that is smaller than the size of the cluster of the desired eigenvalues. ABLE attempts to avoid breakdowns and eliminates these causes of slow convergence by adaptively changing the block size and maintaining the full biorthogonality of the Lanczos vectors, i.e., $\Vert P^{\ast}_{[j]} Q_{[j]} - I \Vert \approx \epsilon_M$, or semi-biorthogonality, i.e., $\Vert P^{\ast}_{[j]} Q_{[j]} - I \Vert \approx \sqrt{\epsilon_M}$.

Algorithm 7.15 is a pseudocode for ABLE. The method can be executed in different increasingly complex levels indicated by the following logical parameters:

uses the two-sided Gram-Schmidt process to maintain full biorthogonality at each Lanczos step.

monitors the loss of biorthogonality and corrects the loss of biorthogonality at certain steps.

cures the breakdown by increasing the block size.

adapts the block size to the order of a cluster of converged Ritz values.
There are four levels of implementation of ABLE:
Level 1:
All logical flags are false (F).

This is the most simple block Lanczos algorithm as in Algorithm 7.14. Essentially, the three-term recurrences (7.50) and (7.51) are used and it does not store the computed Lanczos vectors. Only eigenvalues are computed. The user specifies the parameter tolbd for detecting breakdown. The default value for tolbd is $10n\epsilon_M$, where $\epsilon_M$ denotes the machine precision.

Level 2:
fullbo = true (T) and other logical flags are false (F).

This version maintains full biorthogonality. Each new pair of Lanczos vectors computed by the three-term recurrences is re-biorthogonalized against all previous Lanczos vectors by TSMGS. Eigentriplets are computed for Levels $\geq 2$.

Level 3:
semibo = true (T) and other logical flags are false (F).

This level attempts to maintain semi-biorthogonality. It is less expensive in both floating point operations and memory references than full biorthogonality.

Level 4:
fullbo = true (T) or semibo = true (T) and treatbd = true (T) and/or group = true (T).

This version attempts to cure breakdowns by increasing the block size and/or by adapting the block size to be at least the order of any detected cluster of converged Ritz values. The user specifies the parameters tolbd and tolcl for declaring breakdown and/or grouping a cluster of Ritz values. The default values for tolcl and tolbd are $\epsilon^{1/2}_M$.

The user may select an implementation of ABLE with either full biorthogonality (Level 2) or semi-biorthogonality (Level 3), but not both. Level 4 is implemented on top of Levels 2 or 3 (full or semi-biorthogonality). The default values for the logical flags are all true except fullbo = false.

At Level 1, after a Ritz value converges to an eigenvalue of $A$, copies of this Ritz value appear at later Lanczos steps. For example, a cluster of Ritz values of the reduced tridiagonal matrix, $T_{[j]}$, may approximate a single eigenvalue of the original matrix $A$. A heuristic device to detect the spurious Ritz values is proposed in [93] and discussed in §7.8. Such phenomena are not anticipated in the higher levels of ABLE.

\begin{algorithm}{ABLE for NHEP
...oximate eigenvectors $x^{(j)}_i$\ and $y^{(j)}_i$\end{tabbing}}

We now comment on some steps of Algorithm 7.15:

The comment on step (1) of Algorithm 7.14 applies here as well.

(2), (3), (35), (36)
The comment on steps (2), (3) and (20), (21) of Algorithm 7.14 applies here equally well.

(6), (7)
The following step for maintaining local biorthogonality with the block vectors $Q_{j}$ and $P_{j}$ may be inserted after step (7):



If $R$ or $S$ is rank deficient, then $Q_{j+1}$ and $P_{j+1}$ are biorthogonalized in place against the previous Lanczos vectors $Q_{[j]}$ and $P_{[j]}$ using TSMGS.

If the smallest singular value of $P^{\ast}_{j+1} Q_{j+1}$ is less than the prescribed tolerance value tolbd, the breakdown flag is switched on. $\delta_d$ denotes the number of the singular values less than tolbd. If a breakdown occurs and no treatment is specified by the user (i.e., treatbd = false), then the method fails.

The order, $\delta_c$, of the largest group of clustered values from converged Ritz values $\{ \theta^{(j)}_i \}$ is determined in this step. Two Ritz values $\theta^{(j)}_i$ and $\theta^{(j)}_k$ are regarded as clustered if $\vert\theta^{(j)}_i - \theta^{(j)}_k\vert \leq {\tt tolcl} \cdot \vert\theta^{(j)}_i\vert$, where tolcl is the user-prescribed tolerance.

If breakdown occurs or if the order of a cluster of converged Ritz values exceeds the current block size, $\delta_c > p_j$, $Q_{j+1}$ and $P_{j+1}$ can be augmented:

Q_{j+1} := \onebytwo{Q_{j+1}}{\widehat{Q}}, \quad
P_{j+1} := \onebytwo{P_{j+1}}{\widehat{P}},

such that $P^{\ast}_{j+1} Q_{j+1}$ is numerically nonsingular, where $\widehat{Q}$ and $\widehat{P}$ are $n \times \delta$, $\delta$ is determined by $\delta = \min ( \max( \delta_c, \delta_d ), p_{\max} - p_j )$, where $p_j$ is the block size before argument and $p_{\max}$ is the prescribed maximum block size. The new block size is $p_j+\delta$. TSMGS is then invoked to biorthogonalize $Q_{j+1}$ and $P_{j+1}$ against previous Lanczos vectors stored in $Q_{[j]}$ and $P_{[j]}$. This is the so-called adaptive blocking scheme.

No feasible method is known to select $\widehat{Q}$ and $\widehat{P}$ that are guaranteed to lift the smallest singular values of $P^{\ast}_{j+1} Q_{j+1}$ above tolbd, except in the case of tolbd = 0, i.e., the exact breakdown. One may choose $\widehat{Q}$ and $\widehat{P}$ as random vectors.

(28), (29)
If semi-biorthogonality is maintained, we solve the eigenproblem of $T_{[j]}$ after each re-biorthogonalization.

The formulas (7.56) and (7.57) can be used to detect convergence. Since we expect too many steps in the Lanczos iteration, at Level 1, we use the following convergence test:

\min\{ \Vert r^{(j)}_i\Vert, \Vert(s^{(j)}_i)^{\ast}\Vert \} \leq
\mbox{\tt tolconv} \cdot \Vert A\Vert _1,

At the higher levels we may use the semiquadratic convergence test:

\min\left\{ \Vert r^{(j)}_i\Vert, \Vert(s^{(j)}_i)^{\ast}\Ve...
...j)}_i \right\}
\leq \mbox{\tt tolconv} \cdot \Vert A\Vert _1,


\eta^{(j)}_i = \frac{ \Vert r^{(j)}_i\Vert\; \Vert(s^{(j)}_i)^{\ast}\Vert _2}
{{\rm gap}(\theta^{(j)}_i,T_{[j]})};

$\mbox{gap}(\theta^{(j)}_i, T_{[j]})$ defines the gap between $\theta^{(j)}_i$ and other Ritz values. Specifically, ${\rm gap}(\theta^{(j)}_i,T_{[j]})=
\min_{k \ne i} \vert \theta^{(j)}_i - \theta^{(j)}_k\vert$. In the blocked case, the gap is defined between the clusters.

In [29] it is shown that under mild conditions, for a Ritz value $\theta^{(j)}_i$, there is an eigenvalue of $A$, such that $\vert \lambda - \theta^{(j)}_i \vert$ is proportional to the product of the condition number and the quantity $\eta^{(j)}_i$. However, for ill-posed problems (i.e., large condition number), small residuals (backward errors) do not imply high eigenvalue accuracy (small forward error). In this case, the above semiquadratic convergence criterion is optimistic. In any case, the right and left eigenvectors can be used to approximate eigenvalue condition numbers. This detects ill-conditioning in an eigenvalue problem (see §7.13).

If full biorthogonality is required (Level 2), simply use the TSMGS procedure (see (10)) to biorthogonalize $Q_{j+1}$ and $P_{j+1}$ against all previous Lanczos vectors stored in $Q_{[j]}$ and $P_{[j]}$. If semi-biorthogonality is required (Level 3), the loss of biorthogonality is monitored and, if necessary, TSMGS is invoked to maintain semi-biorthogonality.

The biorthogonality of the $(j+1)$st Lanczos vectors $Q_{j+1}$ and $P_{j+1}$ to the previous Lanczos vectors is measured by the quantity

d_{j+1} = \max \left\{ \frac{\Vert P_{[j]}^{\ast}Q_{j+1} \Ve...
... _1}
{\Vert Q_{[j]}\Vert _1 \Vert P_{j+1}\Vert _1 } \right\}.

An efficient procedure was developed in [29] to efficiently simulate this quantity at $O(n)$ cost.

The comment to step (25) of Algorithm 7.14 also applies here.

next up previous contents index
Next: Storage Requirements and Floating Up: Block Lanczos Methods   Previous: Basic Algorithm   Contents   Index
Susan Blackford 2000-11-20