
## The Linear Equation Timing Program

The timing program for the linear equation routines is driven by a data file from which the following parameters may be varied:

• M, the matrix row dimension
• N, the matrix column dimension, or the half-bandwidth for the band routines
• K, the number of right-hand sides for the linear solvers, or the third dimension for the Level 3 BLAS
• NB, the block size for the blocked routines, or INCX for the Level 2 BLAS
• NX, the crossover point, the point in a block algorithm at which we switch to an unblocked algorithm
• LDA, the leading dimension of the dense and banded matrices.

For banded matrices, the values of M are used for the matrix row and column dimensions, and for symmetric or Hermitian matrices that are not banded, the values of N are used for the matrix dimension.

The number and size of the input values are limited by certain program maximums which are defined in PARAMETER statements in the main timing program:

[Table of PARAMETER names and values: not recoverable from the source]

The parameter LDAMAX should be at least NMAX. For the xGB path, we must have [inequality not recoverable from the source], where [condition not recoverable from the source], which restricts the value of K. These limits allow K to be as large as 200 for M = 1000. For the xPB and xTB paths, the condition is [inequality not recoverable from the source].

The input file also specifies a set of LAPACK routine names or LAPACK path names to be timed. The path names are similar to those used for the test program, and include the following standard paths:


| Precisions | Path | Matrix type or operation |
|---|---|---|
| {S, C, D, Z} | GE | General matrices (LU factorization) |
| {S, C, D, Z} | GB | General banded matrices |
| {S, C, D, Z} | PO | Positive definite matrices (Cholesky factorization) |
| {S, C, D, Z} | PP | Positive definite packed |
| {S, C, D, Z} | PB | Positive definite banded |
| {S, C, D, Z} | SY | Symmetric indefinite matrices (Bunch-Kaufman factorization) |
| {S, C, D, Z} | SP | Symmetric indefinite packed |
| {C, Z} | HE | Hermitian indefinite matrices (Bunch-Kaufman factorization) |
| {C, Z} | HP | Hermitian indefinite packed |
| {S, C, D, Z} | TR | Triangular matrices |
| {S, C, D, Z} | TP | Triangular packed matrices |
| {S, C, D, Z} | TB | Triangular band |
| {S, C, D, Z} | QR | QR decomposition |
| {S, C, D, Z} | RQ | RQ decomposition |
| {S, C, D, Z} | LQ | LQ decomposition |
| {S, C, D, Z} | QL | QL decomposition |
| {S, C, D, Z} | QP | QR decomposition with column pivoting |
| {S, C, D, Z} | HR | Reduction to Hessenberg form |
| {S, C, D, Z} | TD | Reduction to real tridiagonal form |
| {S, C, D, Z} | BR | Reduction to bidiagonal form |
| {S, C, D, Z} | LS | Least squares |


For timing the Level 2 and 3 BLAS, two extra paths are provided:


| Precisions | Path | Routines timed |
|---|---|---|
| {S, C, D, Z} | B2 | Level 2 BLAS |
| {S, C, D, Z} | B3 | Level 3 BLAS |


The paths xGT, xPT, xHR and xTD include timing of the equivalent LINPACK solvers or EISPACK reductions for comparison.

The timing programs have their own matrix generator that supplies random Toeplitz matrices (constant along a diagonal) for many of the timing paths. Toeplitz matrices are used because they can be generated more quickly than dense matrices, and the call to the matrix generator is inside the timing loop. The LAPACK test matrix generator is used to generate matrices of known condition for the xQR, xRQ, xLQ, xQL, xQP, xHR, xTD, and xBR paths.
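
To illustrate why a Toeplitz matrix is cheap to generate, the following Python sketch builds an m-by-n matrix from only m + n - 1 random values. It is not the LAPACK generator: the function name `random_toeplitz`, the uniform distribution on [-1, 1], and the list-of-rows storage are assumptions made for illustration.

```python
import random


def random_toeplitz(m, n, rng=None):
    """Return an m-by-n Toeplitz matrix (a list of rows) with random entries.

    Entry (i, j) depends only on j - i, so only m + n - 1 random numbers
    are drawn instead of m * n.
    """
    rng = rng or random.Random()
    # diag[k] is the value on the diagonal whose offset j - i equals k - (m - 1).
    # The distribution is an assumption; the source does not specify one.
    diag = [rng.uniform(-1.0, 1.0) for _ in range(m + n - 1)]
    return [[diag[j - i + m - 1] for j in range(n)] for i in range(m)]


a = random_toeplitz(4, 5, random.Random(0))
```

Each entry equals its lower-right neighbour, which is the "constant along a diagonal" property mentioned above.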

The user specifies a minimum time for which each routine should run, and the computation is repeated if necessary until this time has elapsed. To prevent inflated performance due to a matrix remaining in the cache from one iteration to the next, the paths that use random Toeplitz matrices regenerate the matrix before each call to the LAPACK routine in the timing loop. The time for generating the matrix at each iteration is subtracted from the total time.
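
The timing strategy described above can be sketched as follows. This is a minimal Python illustration, not the Fortran code of the timing program: `time_routine`, its arguments, and the second pass that times the generator alone are assumptions about one way to subtract the generation cost.

```python
import time


def time_routine(routine, generate, timmin, clock=time.perf_counter):
    """Return the average seconds per call of routine, excluding generation."""
    # Pass 1: regenerate the matrix before every call and repeat until
    # at least timmin seconds have elapsed.
    calls = 0
    start = clock()
    while True:
        a = generate()
        routine(a)
        calls += 1
        total = clock() - start
        if total >= timmin:
            break
    # Pass 2: time the generator alone for the same number of iterations.
    start = clock()
    for _ in range(calls):
        generate()
    gen_time = clock() - start
    # Subtract the generation cost; clamp at zero to absorb timer noise.
    return max(total - gen_time, 0.0) / calls
```

The `clock` argument is injectable only so that the sketch can be checked with a deterministic clock; with the default it measures wall-clock time.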

An annotated example of an input file for timing the REAL linear equation routines that operate on dense square matrices is shown below. The first line of input is printed as the first line of output and can be used to identify different sets of results.

```
LAPACK timing, REAL square matrices
5                                Number of values of M
10 20 40 60 80                   Values of M (row dimension)
5                                Number of values of N
10 20 40 60 80                   Values of N (column dimension)
2                                Number of values of K
20 80                            Values of K
2                                Number of values of NB
1  8                             Values of NB (blocksize)
0  8                             Values of NX (crossover point)
1                                Number of values of LDA
81                               Values of LDA (leading dimension)
0.05                             Minimum time in seconds
SGE    T T T
SPO    T T T
SPP    T T T
SSY    T T T
SSP    T T T
STR    T T
STP    T T
SQR    T T T
SLQ    T T T
SQL    T T T
SRQ    T T T
SQP    T
SHR    T T T T
STD    T T T T
SBR    T T T
SLS    T T T T T T
```

The first 13 lines of the input file are read using list-directed input and are used to specify the values of M, N, K, NB, NX, LDA, and TIMMIN (the minimum time). By default, xGEMV and xGEMM are called to sample the BLAS performance on square matrices of order N, but this option can be controlled by entering one of the following on line 14:


• BAND: Time xGBMV (instead of xGEMV) using matrices of order M and bandwidth K, and time xGEMM using matrices of order K.
• NONE: Do not do the sample timing of xGEMV and xGEMM.

The timing paths or routine names which follow may be specified in any order.
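
The layout of the input file can be made concrete with a small reader. The Python sketch below is an illustration only, not the timing program's own input code: the function name `parse_timing_input` and the dictionary keys are hypothetical, it assumes each list of values fits on one line (as in the sample file), and it stores the T/F flags without interpreting them. Values are labelled M, N, K by position, so for a band input file the second and third lists hold the bandwidths and the numbers of right-hand sides.

```python
def parse_timing_input(text):
    """Parse a linear equation timing input file into a dictionary."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    pos = 0

    def values(count, convert=int):
        # Emulate list-directed input: take the first `count` items on the
        # line and ignore the trailing annotation.
        nonlocal pos
        items = lines[pos].split()[:count]
        pos += 1
        return [convert(x) for x in items]

    cfg = {"title": lines[pos].strip()}   # line 1 is echoed as a title
    pos += 1
    for key in ("M", "N", "K"):
        (count,) = values(1)
        cfg[key] = values(count)
    (count,) = values(1)                  # NB and NX share one count
    cfg["NB"] = values(count)
    cfg["NX"] = values(count)
    (count,) = values(1)
    cfg["LDA"] = values(count)
    (cfg["TIMMIN"],) = values(1, float)

    # Optional line 14: BAND or NONE controls the sample BLAS timing.
    cfg["blas_sample"] = "DEFAULT"
    if pos < len(lines) and lines[pos].split()[0].upper() in ("BAND", "NONE"):
        cfg["blas_sample"] = lines[pos].split()[0].upper()
        pos += 1

    # Remaining lines: a path or routine name followed by optional T/F flags.
    cfg["paths"] = {}
    for ln in lines[pos:]:
        name, *flags = ln.split()
        cfg["paths"][name] = [f.upper() == "T" for f in flags]
    return cfg
```

Calling `parse_timing_input(open(filename).read())` on a file in the format described here would return the parameter lists, the minimum time, and the requested paths in one dictionary.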

When timing the band routines, it is more informative to use one large value of the matrix size and vary the bandwidth. An annotated example of an input file for timing the REAL linear equation routines that operate on banded matrices is shown below.

```
LAPACK timing, REAL band matrices
1                                Number of values of M
200                              Values of M (row dimension)
5                                Number of values of K
10 20 30 40 50                   Values of K (bandwidth)
4                                Number of values of NRHS
1 2 16 100                       Values of NRHS (the number of right-hand sides)
2                                Number of values of NB
1  8                             Values of NB (blocksize)
0  8                             Values of NX (crossover point)
1                                Number of values of LDA
152                              Values of LDA (leading dimension)
0.05                             Minimum time in seconds
BAND                             Time sample banded BLAS
SGB
SPB
STB
```


Here M specifies the matrix size and K specifies the bandwidth for the test paths SGB, SPB, and STB. Note that we request timing of the sample BLAS for banded matrices by specifying "BAND" on line 14.

We also provide a separate input file for timing the orthogonal factorization and reduction routines that operate on rectangular matrices. For these routines, the values of M and N are specified in ordered pairs (M, N). An annotated example of an input file for timing the REAL linear equation routines that operate on dense rectangular matrices is shown below. The input file is read in the same way as the one for dense square matrices.

```
LAPACK timing, REAL rectangular matrices
7                                Number of values of M
20 40 20 40 80 40 80             Values of M (row dimension)
7                                Number of values of N
20 20 40 40 40 80 80             Values of N (column dimension)
4                                Number of values of K
1 2 16 100                       Values of K
2                                Number of values of NB
1  8                             Values of NB (blocksize)
0  8                             Values of NX (crossover point)
1                                Number of values of LDA
81                               Values of LDA (leading dimension)
0.05                             Minimum time in seconds
none
SQR    T T T
SLQ    T T T
SQL    T T T
SRQ    T T T
SQP    T
SBR    T T F
```
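
The pairing of M and N values in the rectangular input file can be pictured with a few lines of Python. This is only a sketch of the ordered-pair reading described above; the helper name `paired_dimensions` and the equal-length check are assumptions, not part of the timing program.

```python
def paired_dimensions(m_values, n_values):
    """Combine the i-th value of M with the i-th value of N."""
    # Assumption: ordered pairs need the same number of M and N values.
    if len(m_values) != len(n_values):
        raise ValueError("M and N lists must have the same length")
    return list(zip(m_values, n_values))


# Values taken from the rectangular sample input file.
m_values = [20, 40, 20, 40, 80, 40, 80]
n_values = [20, 20, 40, 40, 40, 80, 80]
pairs = paired_dimensions(m_values, n_values)
```

With the sample values this yields seven shapes, including square (20, 20), tall (40, 20), and wide (20, 40) matrices.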


Susan Blackford 2001-08-13