For banded matrices, the values of M are used for the matrix row and column dimensions, and for symmetric or Hermitian matrices that are not banded, the values of N are used for the matrix dimension.
The number and size of the input values are limited by certain program maximums which are defined in PARAMETER statements in the main timing program:
549#549
The parameter LDAMAX should be at least NMAX.
For the xGB path, we must have
550#550, where
551#551,
which restricts the value of K.
These limits allow K to be as big as 200 for M = 1000.
For the xPB and xTB paths, the condition is
552#552.
The input file also specifies a set of LAPACK routine names or LAPACK path names to be timed. The path names are similar to those used for the test program, and include the following standard paths:
{S, C, D, Z} GE General matrices (LU factorization)
{S, C, D, Z} GB General banded matrices
{S, C, D, Z} PO Positive definite matrices (Cholesky factorization)
{S, C, D, Z} PP Positive definite packed
{S, C, D, Z} PB Positive definite banded
{S, C, D, Z} SY Symmetric indefinite matrices (Bunch-Kaufman factorization)
{S, C, D, Z} SP Symmetric indefinite packed
{C, Z} HE Hermitian indefinite matrices (Bunch-Kaufman factorization)
{C, Z} HP Hermitian indefinite packed
{S, C, D, Z} TR Triangular matrices
{S, C, D, Z} TP Triangular packed matrices
{S, C, D, Z} TB Triangular band
{S, C, D, Z} QR QR decomposition
{S, C, D, Z} RQ RQ decomposition
{S, C, D, Z} LQ LQ decomposition
{S, C, D, Z} QL QL decomposition
{S, C, D, Z} QP QR decomposition with column pivoting
{S, C, D, Z} HR Reduction to Hessenberg form
{S, C, D, Z} TD Reduction to real tridiagonal form
{S, C, D, Z} BR Reduction to bidiagonal form
{S, C, D, Z} LS Least Squares
For timing the Level 2 and 3 BLAS, two extra paths are provided:
{S, C, D, Z} B2 Level 2 BLAS
{S, C, D, Z} B3 Level 3 BLAS
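The naming scheme in the lists above is mechanical: a one-letter precision prefix followed by a two-letter path code. The following is a minimal Python sketch that expands the lists into full path names. The names `STANDARD_CODES`, `COMPLEX_ONLY_CODES`, `BLAS_CODES`, and `path_names` are illustrative and not part of the timing program.

```python
# Two-letter path codes copied from the lists above.
STANDARD_CODES = [
    "GE", "GB", "PO", "PP", "PB", "SY", "SP", "TR", "TP", "TB",
    "QR", "RQ", "LQ", "QL", "QP", "HR", "TD", "BR", "LS",
]
COMPLEX_ONLY_CODES = ["HE", "HP"]   # listed for C and Z only
BLAS_CODES = ["B2", "B3"]           # extra paths for the Level 2 and 3 BLAS


def path_names():
    """Return every path name implied by the lists, e.g. 'SGE' or 'ZHP'."""
    names = [prefix + code
             for code in STANDARD_CODES + BLAS_CODES
             for prefix in "SCDZ"]
    names += [prefix + code
              for code in COMPLEX_ONLY_CODES
              for prefix in "CZ"]
    return names
```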
The paths xGT, xPT, xHR and xTD include timing of the equivalent LINPACK solvers or EISPACK reductions for comparison.
The timing programs have their own matrix generator that supplies random Toeplitz matrices (constant along a diagonal) for many of the timing paths. Toeplitz matrices are used because they can be generated more quickly than dense matrices, and the call to the matrix generator is inside the timing loop. The LAPACK test matrix generator is used to generate matrices of known condition for the xQR, xRQ, xLQ, xQL, xQP, xHR, xTD, and xBR paths.
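To illustrate why a Toeplitz matrix is cheap to generate, the sketch below builds an n-by-n Toeplitz matrix from only 2n - 1 random numbers, one per diagonal. It is a hypothetical Python stand-in, not the timing programs' own generator; the function name `random_toeplitz`, the uniform distribution on [-1, 1], and the `seed` argument are assumptions made for the example.

```python
import random


def random_toeplitz(n, seed=0):
    """Return an n-by-n Toeplitz matrix as a list of rows.

    Only 2*n - 1 random values are drawn (one per diagonal), compared
    with n*n values for a fully random dense matrix.
    """
    rng = random.Random(seed)
    # diag[k] is the value on the diagonal where i - j == k - (n - 1).
    diag = [rng.uniform(-1.0, 1.0) for _ in range(2 * n - 1)]
    return [[diag[i - j + n - 1] for j in range(n)] for i in range(n)]
```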
The user specifies a minimum time for which each routine should run, and the computation is repeated as necessary until that much time has been used. To prevent inflated performance due to a matrix remaining in the cache from one iteration to the next, the paths that use random Toeplitz matrices regenerate the matrix before each call to the LAPACK routine in the timing loop. The time for generating the matrix at each iteration is subtracted from the total time.
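The timing strategy just described can be sketched as follows. This is a hypothetical Python outline under stated assumptions, not the Fortran code of the timing programs: `time_routine`, `routine`, `generate`, and `timmin` are illustrative names, and measuring the generation overhead in a second loop is one plausible way to obtain the time that is subtracted.

```python
import time


def time_routine(routine, generate, timmin):
    """Return (seconds_per_call, iterations) for routine(generate())."""
    # Phase 1: regenerate the matrix before every call, and repeat
    # until at least timmin seconds have elapsed.
    iterations = 0
    start = time.perf_counter()
    while True:
        a = generate()
        routine(a)
        iterations += 1
        total = time.perf_counter() - start
        if total >= timmin:
            break

    # Phase 2: time the same number of matrix generations on their own.
    start = time.perf_counter()
    for _ in range(iterations):
        generate()
    overhead = time.perf_counter() - start

    # Subtract the generation time; clamp at zero to absorb timer noise.
    net = max(total - overhead, 0.0)
    return net / iterations, iterations
```

For example, `time_routine(sorted, lambda: list(range(1000, 0, -1)), 0.05)` repeats the sort on a freshly built list for at least 0.05 seconds and reports the average time per call with the list construction time removed.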
An annotated example of an input file for timing the REAL linear equation routines that operate on dense square matrices is shown below. The first line of input is printed as the first line of output and can be used to identify different sets of results.
LAPACK timing, REAL square matrices
5                                Number of values of M
10 20 40 60 80                   Values of M (row dimension)
5                                Number of values of N
10 20 40 60 80                   Values of N (column dimension)
2                                Number of values of K
20 80                            Values of K
2                                Number of values of NB
1 8                              Values of NB (blocksize)
0 8                              Values of NX (crossover point)
1                                Number of values of LDA
81                               Values of LDA (leading dimension)
0.05                             Minimum time in seconds
SGE    T T T
SPO    T T T
SPP    T T T
SSY    T T T
SSP    T T T
STR    T T
STP    T T
SQR    T T T
SLQ    T T T
SQL    T T T
SRQ    T T T
SQP    T
SHR    T T T T
STD    T T T T
SBR    T T T
SLS    T T T T T T

The first 13 lines of the input file are read using list-directed input and are used to specify the values of M, N, K, NB, NX, LDA, and TIMMIN (the minimum time). By default, xGEMV and xGEMM are called to sample the BLAS performance on square matrices of order N, but this option can be controlled by entering one of the following on line 14:
BAND Time xGBMV (instead of xGEMV) using matrices of order M and
bandwidth K, and time xGEMM using matrices of order K.
NONE Do not do the sample timing of xGEMV and xGEMM.

The timing paths or routine names that follow may be specified in any order.
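As an illustration of how the 13 header lines are laid out, the Python sketch below reads them the way list-directed input would: it takes the requested number of values from each line and ignores the trailing annotation. The function name `parse_header` and the dictionary keys are assumptions made for this example. The sample text mirrors the dense square input file, with NX sharing the count given for NB.

```python
SAMPLE = """\
LAPACK timing, REAL square matrices
5                 Number of values of M
10 20 40 60 80    Values of M (row dimension)
5                 Number of values of N
10 20 40 60 80    Values of N (column dimension)
2                 Number of values of K
20 80             Values of K
2                 Number of values of NB
1 8               Values of NB (blocksize)
0 8               Values of NX (crossover point)
1                 Number of values of LDA
81                Values of LDA (leading dimension)
0.05              Minimum time in seconds
"""


def parse_header(text):
    """Parse the 13 header lines into a dictionary of parameters."""
    lines = iter(text.splitlines())
    params = {"title": next(lines).strip()}

    def ints(count):
        # Take `count` leading values; the rest of the line is a comment.
        return [int(tok) for tok in next(lines).split()[:count]]

    for name in ("M", "N", "K"):
        (count,) = ints(1)
        params[name] = ints(count)
    (n_nb,) = ints(1)
    params["NB"] = ints(n_nb)
    params["NX"] = ints(n_nb)   # NX has one value per value of NB
    (n_lda,) = ints(1)
    params["LDA"] = ints(n_lda)
    params["TIMMIN"] = float(next(lines).split()[0])
    return params
```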
When timing the band routines it is more interesting to use one large value of the matrix size and vary the bandwidth. An annotated example of an input file for timing the REAL linear equation routines that operate on banded matrices is shown below.
LAPACK timing, REAL band matrices
1                                Number of values of M
200                              Values of M (row dimension)
5                                Number of values of K
10 20 30 40 50                   Values of K (bandwidth)
4                                Number of values of NRHS
1 2 16 100                       Values of NRHS (the number of right-hand sides)
2                                Number of values of NB
1 8                              Values of NB (blocksize)
0 8                              Values of NX (crossover point)
1                                Number of values of LDA
152                              Values of LDA (leading dimension)
0.05                             Minimum time in seconds
BAND                             Time sample banded BLAS
SGB
SPB
STB
Here M specifies the matrix size and K specifies the bandwidth for the test paths SGB, SPB, and STB. Note that we request timing of the sample BLAS for banded matrices by specifying "BAND" on line 14.
We also provide a separate input file for timing the orthogonal factorization and reduction routines that operate on rectangular matrices. For these routines, the values of M and N are specified in ordered pairs (M, N). An annotated example of an input file for timing the REAL linear equation routines that operate on dense rectangular matrices is shown below. The input file is read in the same way as the one for dense square matrices.
LAPACK timing, REAL rectangular matrices
7                                Number of values of M
20 40 20 40 80 40 80             Values of M (row dimension)
7                                Number of values of N
20 20 40 40 40 80 80             Values of N (column dimension)
4                                Number of values of K
1 2 16 100                       Values of K
2                                Number of values of NB
1 8                              Values of NB (blocksize)
0 8                              Values of NX (crossover point)
1                                Number of values of LDA
81                               Values of LDA (leading dimension)
0.05                             Minimum time in seconds
none
SQR    T T T
SLQ    T T T
SQL    T T T
SRQ    T T T
SQP    T
SBR    T T F
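To make the ordered-pair convention concrete, the short Python sketch below pairs the i-th value of M with the i-th value of N from the rectangular example, giving seven matrix shapes rather than every combination of the two lists. The helper name `dimension_pairs` is illustrative only.

```python
m_values = [20, 40, 20, 40, 80, 40, 80]   # Values of M (row dimension)
n_values = [20, 20, 40, 40, 40, 80, 80]   # Values of N (column dimension)


def dimension_pairs(m_values, n_values):
    """Pair the i-th value of M with the i-th value of N."""
    if len(m_values) != len(n_values):
        raise ValueError("M and N must have the same number of values")
    return list(zip(m_values, n_values))


pairs = dimension_pairs(m_values, n_values)
```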