
Introduction

In this chapter we discuss the non-Hermitian eigenvalue problem (NHEP)
\begin{displaymath}
Ax = \lambda x, \qquad\qquad (117)
\end{displaymath}

where the square matrix $A \neq A^{\ast}$. A vector $x \neq 0$ satisfying (117) is called a right eigenvector. A vector $y \neq 0$ satisfying
\begin{displaymath}
y^{\ast} A = \lambda y^{\ast} \qquad\qquad (118)
\end{displaymath}

is called a left eigenvector of $A$.

For a presentation of relevant theory for NHEPs, as well as pointers to the literature, we refer to §2.5. In particular, we mention that the perturbation theory for these problems is delicate. One should keep in mind that a small norm of the residual $A\widehat{x}-\widehat{\lambda}\widehat{x}$ for a computed eigenpair $\widehat{\lambda}, \widehat{x}$ does not necessarily imply small errors in the computed values. For more information, see §2.5 and §7.13. Here we give a brief introduction to those aspects that play a role in the selection of the algorithms. This is followed by short characterizations of the different classes of methods covered in this chapter.
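
The following small illustration (not taken from the text; the $2\times 2$ matrix and the values are chosen only for demonstration) shows how a small residual can coexist with a much larger eigenvalue error for a non-normal matrix:

\begin{verbatim}
import numpy as np

# Illustrative 2x2 non-normal matrix: for A = [[0, 1], [eps, 0]] the candidate
# pair (lambda_hat, x_hat) = (0, e_1) has residual norm eps, while the true
# eigenvalues are +/- sqrt(eps); the eigenvalue error sqrt(eps) is then much
# larger than the residual for small eps.
eps = 1e-10
A = np.array([[0.0, 1.0],
              [eps, 0.0]])

x_hat = np.array([1.0, 0.0])   # candidate (approximate) eigenvector
lam_hat = 0.0                  # candidate (approximate) eigenvalue

residual = np.linalg.norm(A @ x_hat - lam_hat * x_hat)   # = 1e-10
error = np.min(np.abs(np.linalg.eigvals(A) - lam_hat))   # approx 1e-5

print(f"residual = {residual:.1e}, eigenvalue error = {error:.1e}")
\end{verbatim}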

In contrast to a Hermitian matrix, a non-Hermitian matrix does not in general have an orthogonal set of eigenvectors; in other words, a non-Hermitian matrix $A$ cannot in general be transformed by an orthogonal matrix $Q$ to diagonal form $D=Q^\ast A Q$. Most non-Hermitian matrices can be transformed by a nonorthogonal $X$ to diagonal form $D=X^{-1}AX$, but there exist matrices for which even this is not possible. Such matrices can be viewed as limiting cases in which $X$ converges to a singular operator, and these matrices do not have a complete set of eigenvectors; they are called defective. This reveals a source of numerical instability: if $X$ is in some sense close to a singular operator, then the transformation may be expected to be sensitive to perturbations in $A$. Such perturbations are introduced by the process used to compute $X$ and $D$. Therefore, it is of great importance to work with orthogonal, or nearly orthogonal, transformations as often as possible. It is well known that any non-Hermitian matrix $A$ can be transformed by a unitary matrix $U$ to upper triangular form

\begin{displaymath}
T=U^\ast A U. \qquad\qquad (119)
\end{displaymath}

The matrix $T$ is called the Schur form of $A$. The eigenvalues of $A$ appear along the diagonal of $T$. Furthermore, $U$ can be further transformed by a unitary matrix so that the eigenvalues of $A$ occur in any prescribed order along the diagonal of $T$. If $z$ is an eigenvector of $T$ corresponding to the eigenvalue $\lambda$, $Tz=\lambda z$, it follows that

\begin{displaymath}
A(Uz)=\lambda (Uz),
\end{displaymath}

so that $Uz$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$.

Reduction to Schur form can be done, but it has its price. Let $n$ be the order of a non-Hermitian matrix $A$. For $n>4$, there is in general no finite algorithm that reduces $A$ to Schur form (if we exclude trivial cases, such as where $A$ is already upper triangular). The usual approach for matrices of modest order, say $n\le 1000$, is to reduce $A$ first to upper Hessenberg form, which can be done in a finite number of steps. Subsequently, this upper Hessenberg matrix is reduced in an iterative manner (QR iterations) to Schur form. See §7.3 for more details and for pointers to software.
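
As a minimal sketch of this dense-matrix route (using SciPy's hessenberg and schur routines on a small random test matrix; the matrix and its size are only illustrative), one can compute the complex Schur form and recover an eigenvector of $A$ as $Uz$:

\begin{verbatim}
import numpy as np
from scipy.linalg import hessenberg, schur

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))      # small dense non-Hermitian test matrix

# Finite first step: reduction to upper Hessenberg form, A = Q H Q^*.
H, Q = hessenberg(A, calc_q=True)

# Iterative second step (QR iterations, performed inside LAPACK): complex
# Schur form A = U T U^*, with the eigenvalues of A on the diagonal of T.
T, U = schur(A, output='complex')

# If T z = lambda z, then A (U z) = lambda (U z).
w, Z = np.linalg.eig(T)              # eigenpairs of the triangular matrix T
lam, z = w[0], Z[:, 0]
x = U @ z                            # eigenvector of A for eigenvalue lam
print(np.linalg.norm(A @ x - lam * x))   # small residual
\end{verbatim}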

The main problem with this standard approach is that the computational cost is proportional to $n^3$ (hence the limitation to matrices of order a few thousand at most) and it requires storage proportional to $n^2$ elements. In general it is not possible to exploit sparsity of the matrix $A$ in order to reduce the storage requirements. For these reasons alternatives have been proposed. These alternatives are all completely iterative; that is, there is no finite direct reduction step. They are usually only suitable for the computation of a handful of interesting eigenpairs. We now briefly summarize the iterative methods presented in this chapter.

Single- and multiple-vector iterations,
§7.4. The largest eigenvalue in absolute value, and the corresponding eigenvector, of a large matrix can be computed iteratively by the power iteration. This is only advisable if this largest eigenvalue is significantly larger, in a relative sense, than the absolute values of the other eigenvalues. In order to create a matrix with a well-separated largest eigenvalue, one usually works formally with the operator $(A - \sigma I)^{-1}$ for a user-defined $\sigma$ close to the desired eigenvalue. This is referred to as inverse iteration. The method has been generalized to a block method for a set of eigenvalues that is well separated from the remaining part of the spectrum. These methods, members of the family of single- and multiple-vector iterations, work well when the desired eigenvalues are well separated from the unwanted ones. Even then, however, the methods can hardly compete with more modern subspace methods, and the only niche for their usage seems to be when one cannot afford storage for more than two iteration vectors.
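
A minimal sketch of both variants is given below (the function names are ours; the shifted matrix is factored once with a dense LU factorization, and a real, well-separated dominant eigenvalue is assumed so that the simple Rayleigh quotient estimate applies):

\begin{verbatim}
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def power_iteration(A, num_iters=200, seed=0):
    # Converges to the eigenvalue of largest absolute value, provided it is
    # well separated (in a relative sense) from the other eigenvalues and
    # real; only a couple of vectors of storage are needed.
    x = np.random.default_rng(seed).standard_normal(A.shape[0])
    for _ in range(num_iters):
        x = A @ x
        x /= np.linalg.norm(x)
    return x @ (A @ x), x          # Rayleigh quotient and eigenvector estimate

def inverse_iteration(A, sigma, num_iters=50, seed=0):
    # Power iteration applied to (A - sigma I)^{-1}: converges to the
    # eigenvalue of A closest to the shift sigma.  Factor once, solve often.
    n = A.shape[0]
    lu, piv = lu_factor(A - sigma * np.eye(n))
    x = np.random.default_rng(seed).standard_normal(n)
    for _ in range(num_iters):
        x = lu_solve((lu, piv), x)
        x /= np.linalg.norm(x)
    return x @ (A @ x), x          # estimate of the eigenvalue nearest sigma
\end{verbatim}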

Arnoldi methods,
§7.5, §7.6, and §7.7. Modern subspace iteration methods attempt to build a subspace that is rich in the eigenvectors corresponding to the wanted eigenvalues. In fact, they can be interpreted as methods that retain a number of the vectors generated in the power method. With respect to this basis for a subspace of restricted dimension, the matrix is usually reduced to a more manageable form. In the Arnoldi method one starts with an initial guess $v$ and generates an orthonormal basis for a Krylov subspace of dimension $m$:

\begin{displaymath}
{\cal K}^m(A;v) \equiv \mbox{span}\{v, Av, A^2v, \ldots , A^{m-1}v \}.
\end{displaymath}

(As for the power method, one may also work with the shifted and inverted operator $(A - \sigma I)^{-1}$.)

The orthonormalization process leads to an $m$ by $m$ upper Hessenberg matrix $H_m$ (the approach is only practical for $m\ll n$), and this matrix can be interpreted as a suitably reduced form of the matrix $A$ restricted to the Krylov subspace. The eigenvalues of $H_m$ are approximations to some of the eigenvalues of $A$, and they can be computed by reducing $H_m$ to Schur form with the same standard method as discussed above for small dense matrices. An introduction to the Arnoldi method is given in §7.5. Clever restart techniques have been designed to keep memory requirements and computational overhead as low as possible; these techniques are known as implicitly restarted Arnoldi methods (IRAMs) and are discussed in §7.6. The computational overhead in the Arnoldi method does not lend itself well to parallelism, in particular not on some distributed-memory computers. In order to create a more favorable ratio between computation and memory references, block variants have been suggested. These block Arnoldi variants are discussed in §7.7.
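
A bare-bones Arnoldi sketch (no restarting, no deflation; the function name, the breakdown tolerance, and the use of modified Gram-Schmidt without reorthogonalization are our simplifications) looks as follows:

\begin{verbatim}
import numpy as np

def arnoldi(A, v, m):
    # Build an orthonormal basis V of the Krylov subspace K^m(A; v) together
    # with the (m+1)-by-m upper Hessenberg matrix H satisfying A V[:, :m] = V H.
    n = A.shape[0]
    V = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                  # modified Gram-Schmidt
            H[i, j] = np.vdot(V[:, i], w)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if abs(H[j + 1, j]) < 1e-12:            # invariant subspace found
            return V[:, :j + 1], H[:j + 1, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

# The eigenvalues of the leading m-by-m block of H (the Ritz values)
# approximate some, typically exterior, eigenvalues of A:
#   V, H = arnoldi(A, v0, m)
#   ritz_values = np.linalg.eigvals(H[:m, :m])
\end{verbatim}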

Lanczos methods,
§7.8, §7.9, §7.10, and §7.11. The Arnoldi methods work with subspaces, and one has to store the vectors that represent a basis for the current subspace. Moreover, adding a new vector to the basis involves the orthogonalization of this new vector with respect to all the vectors of the basis. For symmetric $A$ there is a simple three-term recurrence that generates a basis for the subspace, and if one is interested in the eigenvalues only, then only the last three basis vectors have to be stored. A variant of this Lanczos process is also available for unsymmetric matrices, but then one has to give up the orthogonality of the basis vectors. Hence economy in memory and computational overhead comes at the risk of instability. The original unsymmetric Lanczos method, or simply two-sided Lanczos, also suffers from (near) breakdowns. Despite these drawbacks, the method has received much attention, and variants have been suggested that have proven to be efficient and useful for relevant problems. The method, and several enhancements to it, are described in §7.8, along with appropriate software.
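
For reference, one common formulation of the two coupled three-term recurrences of the two-sided Lanczos process is (scaling and normalization conventions differ between implementations; §7.8 gives the precise variant used there)

\begin{eqnarray*}
\gamma_{j+1}\, v_{j+1} & = & A v_j - \alpha_j v_j - \beta_j v_{j-1}, \\
\overline{\beta}_{j+1}\, w_{j+1} & = & A^\ast w_j - \overline{\alpha}_j w_j - \overline{\gamma}_j w_{j-1},
\end{eqnarray*}

with $\alpha_j = w_j^\ast A v_j$ and the two sequences kept biorthogonal, $w_i^\ast v_j = \delta_{ij}$. A (near) breakdown occurs when the unnormalized new vectors satisfy $\widetilde{w}_{j+1}^\ast \widetilde{v}_{j+1} \approx 0$ while neither vector is (nearly) zero, so that the biorthogonality normalization cannot be carried out.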

A block version of the Lanczos method, and an implementation of it called ABLE, are presented in §7.9. ABLE is an adaptively blocked method designed to cure breakdown, and it attempts to eliminate the causes of slow convergence (such as a loss of biorthogonality and clusters of nearby eigenvalues). Deflation within Krylov subspaces is nicely incorporated in yet another variant, the band Lanczos method, presented in §7.10. This variant is specially tailored for the reduced-order modeling of multiple-input, multiple-output linear dynamical systems of high dimension.

A variant of the two-sided Lanczos method for complex symmetric systems is discussed in §7.11. Compared with the general unsymmetric case, this variant reduces CPU time and computer storage by a factor of 2.

Jacobi-Davidson method,
§7.12. The Arnoldi and Lanczos methods are most effective when applied with shift-and-invert; that is, systems of the form $(A-\sigma I)x=b$ have to be solved efficiently and accurately. If this is not practical, but some reasonable approximation for $x=(A-\sigma I)^{-1}b$ is cheaply available, then the Jacobi-Davidson methods can prove useful. These methods work with orthogonal transformations, just as the Arnoldi methods do, but in the construction of an orthonormal basis for the subspace involved there is no attempt to create a special structure for the reduced matrix. Therefore, the computational overhead is larger than for Arnoldi. All the possible gain has to come from an effective and cheap approximation for the shift-and-invert step.
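
The central ingredient of Jacobi-Davidson, stated here only in its basic form (see §7.12 for the precise algorithms), is the correction equation for an approximate eigenpair $(\theta, u)$ with $\|u\|_2=1$ and residual $r = Au - \theta u$:

\begin{displaymath}
(I - u u^\ast)(A - \theta I)(I - u u^\ast)\, t = -r, \qquad t \perp u .
\end{displaymath}

The expansion vector $t$ need only be computed approximately, for instance with a few steps of a preconditioned iterative solver, and this is precisely where a cheap approximation for the shift-and-invert step enters.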

In Table 7.1 we present a summary of the various classes of methods. The table is not intended to replace further reading; it serves only as an indication of where to start and it may help the reader to follow a certain path in discovering which method serves which need best. The actual performance of each method depends, often critically, on many parameters, and one method that is best for one class of problems may encounter convergence problems for some particular matrices.



Table 7.1: Summary of algorithms for NHEPs

Algorithm    Mode                    FL-$\lambda$   $\lambda$-ext   $\lambda$-int   Memory/overhead
Power        $A$                     Slow           No              No              Low
             $(A - \sigma I)^{-1}$   Yes            No              No              Low
Subspace     $A$                     Yes            Yes             No              Modest
             $(A - \sigma I)^{-1}$   Yes            Yes             Yes             Modest
Arnoldi      $A$                     Yes            Yes             No              Modest
(IRAM)       $(A - \sigma I)^{-1}$   Yes            Yes             Yes             Modest
Lanczos      $A$                     Yes            Yes             No              Low
             $(A - \sigma I)^{-1}$   Yes            Yes             Yes             Low
Block Lan.   $A$                     Yes            Yes             No              Modest
             $(A - \sigma I)^{-1}$   Yes            Yes             Yes             Modest
Jac.-Dav.    $A$                     Yes            Yes             Slow            Modest
             $(A - \sigma I)^{-1}$   Yes            Yes             Yes             Modest
             Precond                 Yes            Yes             Yes             Modest

Explanation of Table 7.1

FL-$\lambda$:
A few well-isolated largest (in absolute value) eigenvalues.
$\lambda$-ext:
(A group of) eigenvalues at the exterior of the spectrum.
$\lambda$-int:
(A group of) eigenvalues in the interior of the spectrum, near a specified target.
$A$:
Operations with (shifted) $A$.
$(A - \sigma I)^{-1}$:
With shift-and-invert operations; that is, one needs an efficient solution mechanism for determining $x$ from $(A-\sigma I)x=b$ ($\sigma$ and $b$ depend on the method); see the sketch following this list.
Precond:
Operations with a preconditioner for $A-\sigma I$; that is, inexact solution of $(A-\sigma I)x=b$ is permitted, which gives possibilities for more efficiency (the success depends of course on the quality of the preconditioner).
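
As indicated in the explanation of the $(A - \sigma I)^{-1}$ mode above, one common way to provide shift-and-invert for a large sparse matrix is to compute a single sparse LU factorization of $A-\sigma I$ and reuse it for every solve. The sketch below uses SciPy's sparse machinery on a random test matrix (the matrix, its size, and the shift are only illustrative):

\begin{verbatim}
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, sigma = 1000, 2.5                              # illustrative size and shift
A = sp.random(n, n, density=5.0 / n, format='csc',
              random_state=0) + sp.eye(n, format='csc')

# Factor A - sigma*I once (sparse LU); every request for
# x = (A - sigma I)^{-1} b then reduces to two triangular solves.
lu = spla.splu((A - sigma * sp.eye(n, format='csc')).tocsc())

shift_invert = spla.LinearOperator((n, n), matvec=lu.solve)

# shift_invert can now be passed, instead of A, to the power method or to a
# Krylov eigensolver; eigenvalues mu of (A - sigma I)^{-1} map back to
# eigenvalues of A via lambda = sigma + 1/mu.
\end{verbatim}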


Table: Comparison of algorithms for non-Hermitian eigenvalue problems

Algorithm     Mode   WIL         E-ext       E-int       Memory
Power it.     Dir    $\square$   $--$        $--$        $++$
              SI     $\square$   $\square$   $\square$   $++$
Mult. vect.   Dir    $+$         $-$         $--$        $+$
              SI     $+$         $\square$   $\square$   $+$
Arnoldi       Dir    $+$         $+$         $\square$   $\square$
(IRAM)        SI     $++$        $++$        $++$        $\square$
Lanczos       Dir    $+$         $+$         $--$        $++$
              SI     $++$        $++$        $++$        $++$
Block Lan.    Dir    $+$         $+$         $--$        $+$
              SI     $+$         $+$         $+$         $+$
Jac.-Dav.     Dir    $+$         $+$         $\square$   $\square/-$
              SI     $++$        $++$        $++$        $\square/-$
              Prec   $++$        $++$        $+$         $\square/-$

