
Determining the Block Size for Block Algorithms

LAPACK routines that implement block algorithms need to determine what block size to use. The intention behind the design of LAPACK is that the choice of block size should be hidden from users as much as possible, but at the same time easily accessible to installers of the package when tuning LAPACK for a particular machine.

LAPACK routines call an auxiliary enquiry function ILAENV, which returns the optimal block size to be used, as well as other parameters. The version of ILAENV supplied with the package contains default values that gave good behavior on a reasonable number of our test machines, but to achieve optimal performance it may be beneficial to tune ILAENV for your particular machine environment. Ideally, a distinct implementation of ILAENV is needed for each machine environment (see also Chapter 6). The optimal block size may also depend on the routine, the combination of option arguments (if any), and the problem dimensions.

If ILAENV returns a block size of 1, then the routine performs the unblocked algorithm, calling Level 2 BLAS, and makes no calls to Level 3 BLAS.

Some LAPACK routines require a work array whose size is proportional to the block size (see subsection 5.1.7). The actual length of the work array is supplied as an argument LWORK. The description of the arguments WORK and LWORK typically goes as follows:

WORK    (workspace) REAL array, dimension (LWORK)
        On exit, if INFO = 0, then WORK(1) returns the optimal LWORK.

LWORK   (input) INTEGER
        The dimension of the array WORK. LWORK $\geq$ max(1,N).
        For optimal performance LWORK $\geq$ N*NB, where NB is the optimal block size returned by ILAENV.

The routine determines the block size to be used by the following steps:

1. the optimal block size is determined by calling ILAENV;

2. if the value of LWORK indicates that enough workspace has been supplied, the routine uses the optimal block size;

3. otherwise, the routine determines the largest block size that can be used with the supplied amount of workspace;

4. if this new block size does not fall below a threshold value (also returned by ILAENV), the routine uses the new value;

5. otherwise, the routine uses the unblocked algorithm.

The minimum value of LWORK that would be needed to use the optimal block size is returned in WORK(1).

Thus, the routine uses the largest block size allowed by the amount of workspace supplied, as long as this is likely to give better performance than the unblocked algorithm. WORK(1) is not always a simple formula in terms of N and NB.
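The selection procedure described above can be sketched in a few lines of Python. This is only an illustration, not LAPACK code: the function name choose_block_size and its parameters are hypothetical, the optimal block size and threshold are passed in rather than obtained from ILAENV, and the workspace requirement is assumed to follow the simple N*NB model from the argument description, which not every routine obeys.

```python
def choose_block_size(n, lwork, nb_opt, nb_min):
    """Return the block size to use, or 1 for the unblocked algorithm.

    n       problem dimension N
    lwork   length of the supplied work array
    nb_opt  optimal block size (ILAENV would supply this; assumed given here)
    nb_min  threshold below which blocking is not worthwhile (also assumed given)

    Assumes the workspace needed for block size NB is N*NB.
    """
    # Below max(1, N) there is not even room for the unblocked algorithm,
    # so LWORK is treated as an illegal argument.
    if lwork < max(1, n):
        raise ValueError("LWORK is smaller than the documented minimum")

    # A block size of 1 means the unblocked algorithm.
    if nb_opt <= 1:
        return 1

    # Enough workspace was supplied: use the optimal block size.
    if lwork >= n * nb_opt:
        return nb_opt

    # Otherwise find the largest block size that fits in the workspace.
    nb = lwork // n

    # Use it only if it does not fall below the threshold;
    # otherwise fall back to the unblocked algorithm.
    return nb if nb >= nb_min else 1
```

For example, with N = 1000, an optimal block size of 32, and a threshold of 2 (all illustrative values), a work array of length 16000 yields a block size of 16, while a work array of length 1500 leads to the unblocked algorithm.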

The specification of LWORK gives the minimum value for the routine to return correct results. If the supplied value is less than the minimum -- indicating that there is insufficient workspace to perform the unblocked algorithm -- the value of LWORK is regarded as an illegal value, and is treated like any other illegal argument value (see subsection 5.1.9).

If in doubt about how much workspace to supply, users should supply a generous amount (assume a block size of 64, say), and then examine the value of WORK(1) on exit.
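The arithmetic behind this advice can be sketched as follows. The helper names generous_lwork and compare_with_reported are hypothetical, and the sketch assumes the typical N*NB workspace model; the value passed as work1 stands for whatever an actual routine left in WORK(1) on exit.

```python
ASSUMED_NB = 64  # generous block-size guess, as suggested in the text


def generous_lwork(n, assumed_nb=ASSUMED_NB):
    """Workspace length to supply when the optimal block size is unknown.

    Assumes the typical requirement LWORK >= N*NB.
    """
    return max(1, n) * assumed_nb


def compare_with_reported(lwork_supplied, work1):
    """Compare the supplied LWORK with the value found in WORK(1) on exit.

    WORK is a REAL array, so WORK(1) arrives as a floating-point number
    and is converted to an integer here.
    """
    needed = int(work1)
    if lwork_supplied >= needed:
        # The supplied workspace was enough; report the surplus.
        return ("sufficient", lwork_supplied - needed)
    # The routine could have used more workspace; report the shortfall.
    return ("increase", needed - lwork_supplied)
```

With N = 500 the generous guess is a work array of length 32000. If the routine then reports 16000.0 in WORK(1), the array could be halved on later calls without affecting the block size used.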

Susan Blackford