Download the tar-gzipped file, then issue
"gunzip hpl-2.1.tar.gz; tar -xvf hpl-2.1.tar". This should create
an hpl-2.1 directory containing the distribution.
We call this directory the top-level directory.
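The two commands above can also be combined into one step. The following is a minimal sketch, assuming a tar that accepts the -z flag (GNU tar and BSD tar both do); unpack_hpl is a hypothetical helper name and not part of the distribution.

```shell
#!/bin/sh
# Unpack an HPL tarball in one step; tar's -z flag does the gunzip.
# unpack_hpl is a hypothetical helper, not shipped with HPL.
unpack_hpl() {
    tarball="$1"
    if [ ! -f "$tarball" ]; then
        echo "missing tarball: $tarball" >&2
        return 1
    fi
    tar -xzf "$tarball"
}

# Only attempt the extraction when the tarball is actually present.
if [ -f hpl-2.1.tar.gz ]; then
    unpack_hpl hpl-2.1.tar.gz && ls -d hpl-2.1
fi
```

The existence check only serves to give a clearer error message than tar would when the download is missing.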
Create a file Make.<arch> in the top-level directory.
For this purpose, you may want to re-use one contained in the
setup directory. This Make.<arch> file essentially contains
the compilers, libraries, and their paths to be used on your system.
Type "make arch=<arch>". This should create an executable
in the bin/<arch> directory called xhpl. For example, on our
Linux PII cluster, I create a file called Make.Linux_PII in the
top-level directory. Then, I type "make arch=Linux_PII". This
creates the executable file bin/Linux_PII/xhpl.
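The build step for the Linux_PII example might look as follows when run from the top-level directory. This is only a sketch: the template name setup/Make.Linux_PII_FBLAS is an assumption (pick whichever file in the setup directory is closest to your system), and prepare_makefile is a hypothetical helper.

```shell
#!/bin/sh
ARCH=Linux_PII                        # arch name used in the text
TEMPLATE=setup/Make.Linux_PII_FBLAS   # assumption: closest template in setup/

# prepare_makefile is a hypothetical helper: it copies a template to
# Make.<arch> unless that file already exists.
prepare_makefile() {
    template="$1"
    arch="$2"
    if [ ! -f "$template" ]; then
        echo "no such template: $template" >&2
        return 1
    fi
    # Never overwrite a Make.<arch> that may already have been edited.
    [ -f "Make.$arch" ] || cp "$template" "Make.$arch"
}

# Run only from a directory that looks like the HPL top-level directory.
if [ -d setup ]; then
    # Edit Make.$ARCH (compilers, libraries, paths) between these two steps.
    prepare_makefile "$TEMPLATE" "$ARCH" &&
        make arch="$ARCH" &&
        ls -l "bin/$ARCH/xhpl"
fi
```

Copying rather than editing the template in place keeps the setup directory untouched, so a fresh template is always available.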
Quick check: run a few tests (assuming you have 4 nodes for
interactive use) by issuing the following commands from the
top-level directory: "cd bin/<arch> ; mpirun -np 4 xhpl". This
should produce quite a bit of meaningful output on the screen.
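The quick check can be wrapped in a small guard so that it only runs once the build has produced its files. This is a sketch under assumptions: ready_to_run is a hypothetical helper, the launcher is assumed to be called mpirun on your system, and the output file name xhpl.out is arbitrary.

```shell
#!/bin/sh
ARCH=Linux_PII   # arch name used in the text
NP=4             # number of processes, as in the text

# ready_to_run is a hypothetical helper: it succeeds only if the
# directory holds an executable xhpl and an HPL.dat input file.
ready_to_run() {
    dir="$1"
    [ -x "$dir/xhpl" ] && [ -f "$dir/HPL.dat" ]
}

if ready_to_run "bin/$ARCH" && command -v mpirun >/dev/null 2>&1; then
    # ./xhpl rather than xhpl: some launchers do not search the
    # current directory for the executable.
    (cd "bin/$ARCH" && mpirun -np "$NP" ./xhpl | tee xhpl.out)
fi
```

Saving the output with tee makes it easier to compare runs after HPL.dat has been modified.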
Most of the performance parameters can be tuned by modifying
the input file bin/<arch>/HPL.dat. See the
tuning page or the TUNING file in the
Compile Time Options
At the end of the "model" Make.<arch>, the user is given
the opportunity to override some default compile options of this
software. The list of these options and their meaning is:
-DHPL_COPY_L           force the copy of the panel L before bcast
-DHPL_CALL_CBLAS       call the BLAS C interface
-DHPL_CALL_VSIPL       call the vsip library
-DHPL_DETAILED_TIMING  enable detailed timers
The user must choose among the BLAS Fortran 77 interface,
the BLAS C interface, and the VSIPL library, depending on which
computational kernels are available on the system. Only one of these
options should be selected. If you choose the BLAS Fortran 77
interface, it is necessary to fill out the machine-specific C to
Fortran 77 interface section of the Make.<arch> file. To do
this, please refer to the Make.<arch> examples contained in
the setup directory.
By default HPL will:
not copy L before broadcast,
call the BLAS Fortran 77 interface,
not display detailed timing information.
As an example, suppose one wants this software to copy the panel of
columns into a contiguous buffer before broadcasting. It should
be more efficient to let the software create the appropriate MPI
user-defined data type since this may avoid the data copy. So, it
is a strange idea, but one insists. To achieve this, one would add
-DHPL_COPY_L to the definition of HPL_OPTS at the end of the file
Make.<arch>. Then issue "make clean arch=<arch> ;
make build arch=<arch>" and the executable will be rebuilt
with that feature enabled.
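The edit and rebuild described above can be scripted. The sketch below assumes that HPL_OPTS is defined on a single line of the form "HPL_OPTS = ..." in Make.<arch>; add_hpl_opt is a hypothetical helper, and editing the file by hand works just as well.

```shell
#!/bin/sh
ARCH=Linux_PII   # arch name used in the text

# add_hpl_opt is a hypothetical helper: it appends a flag to the
# HPL_OPTS line of a Make.<arch> file unless it is already there.
add_hpl_opt() {
    makefile="$1"
    flag="$2"
    # Do nothing if the flag is already on the HPL_OPTS line.
    if grep -q "^HPL_OPTS[[:space:]]*=.*$flag" "$makefile"; then
        return 0
    fi
    # Write to a temporary file first; in-place sed is not portable.
    sed "s/^\(HPL_OPTS[[:space:]]*=.*\)\$/\1 $flag/" "$makefile" \
        > "$makefile.tmp" && mv "$makefile.tmp" "$makefile"
}

# Run only where a Make.<arch> file exists (the top-level directory).
if [ -f "Make.$ARCH" ]; then
    add_hpl_opt "Make.$ARCH" -DHPL_COPY_L &&
        make clean arch="$ARCH" &&
        make build arch="$ARCH"
fi
```

The "make clean" step matters: without it, object files compiled under the old options would be reused.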