Chapter 17. The Problem Description File

The problem description file (PDF) is the mechanism through which NetSolve enables services for the user. The NetSolve distribution contains the source code for MA28, ITPACK, qsort, and a subset of BLAS and LAPACK routines. This software is contained in the $NETSOLVE_ROOT/src/SampleNumericalSoftware/ directory. Therefore, the default NetSolve enablement (contained in $NETSOLVE_ROOT/server_config) only accesses the PDFs related to the included software packages. The user should refer to the section called Expanding the Server Capabilities in Chapter 13 for details on expanding the capabilities of a server, and refer to the section called Contents of a Problem Description File for details on the structure of a problem description file.

Contents of a Problem Description File

In what follows we describe the contents of a problem description file (PDF). We offer all of the details because it may be necessary or desirable to be aware of them, but we strongly recommend the use of the GUI application described in the section called PDF Generator to create new PDFs.

The rationale for the syntax of the description files is explained in [ima]. Each description file is composed of several problem descriptions. Before explaining how to create a problem description, we reiterate the concept of objects in NetSolve, and then define the concept of mnemonics.

NetSolve Objects

As detailed in the section called NetSolve Problem Specification in Chapter 4, the syntax of a NetSolve problem specification is a function evaluation:
<output> = <name>(<input>)
where

  • <name> is a character string containing the name of the problem,

  • <input> is a list of input objects,

  • <output> is a list of output objects.

An object is itself described by an object type and a data type. The types available in the current version of NetSolve are shown in Table 17-1 and Table 17-2.

Table 17-1. Available data types

Data Type            Description                 Note
NETSOLVE_I           Integer
NETSOLVE_CHAR        Character
NETSOLVE_BYTE        Byte                        never XDR encoded
NETSOLVE_FLOAT       Single precision real
NETSOLVE_DOUBLE      Double precision real
NETSOLVE_SCOMPLEX    Single precision complex
NETSOLVE_DCOMPLEX    Double precision complex

Table 17-2. Available object types

Object Type              Description              Note
NETSOLVE_SCALAR          scalar
NETSOLVE_VECTOR          vector
NETSOLVE_MATRIX          matrix
NETSOLVE_SPARSEMATRIX    sparse matrix            Compressed Row Storage (CRS) format
NETSOLVE_FILE            file                     only of data type NETSOLVE_CHAR
NETSOLVE_PACKEDFILES     packed files             only of data type NETSOLVE_CHAR
NETSOLVE_UPF             User Provided Function   only of data type NETSOLVE_CHAR
NETSOLVE_STRING          Character string         only of data type NETSOLVE_CHAR
NETSOLVE_STRINGLIST      Character string list    only of data type NETSOLVE_CHAR

A problem description file (PDF) uses these objects to define a problem specification for a given service. The section called Mnemonics describes the requirements for each NetSolve object type as it relates to the problem description file.

Sparse Matrix Representation in NetSolve

NetSolve uses the Compressed Row Storage (CRS) format for storing sparse matrices. The CRS format puts the subsequent nonzeros of the matrix rows in contiguous memory locations. Assuming we have a nonsymmetric sparse matrix, we create three vectors: one for floating-point numbers (val), and the other two for integers (col_ind, row_ptr). The val vector stores the values of the nonzero elements of the matrix, as they are traversed in a row-wise fashion. The col_ind vector stores the column indexes of the elements in the val vector. The row_ptr vector stores the locations in the val vector that start a row, followed by one final entry equal to the total number of nonzeros. All indexes are zero-based.

For example, if
         1 0 3 1
  A =    0 0 5 2
         6 1 0 8
         4 0 0 0

  then,

val:     1 3 1 5 2 6 1 8 4
col_ind: 0 2 3 2 3 0 1 3 0
row_ptr: 0 3 5 8 9

Thus, if a problem in NetSolve has the following specifications:
-- sm_prob --
* 1 object in INPUT
 - input 0: Sparse Matrix Double Precision Real.
 the sparse matrix
* Calling sequence from C or Fortran
5 arguments
 - Argument #0:
   - number of rows of input object #0 (sm)
   - number of columns of input object #0 (sm)
 - Argument #1:
   - number of non-zero values of input object #0 (sm)
 - Argument #2:
   - pointer to input object #0 (sm)
 - Argument #3:
   - column indices of non-zeros of input object #0 (sm)
 - Argument #4:
   - row pointers of the sparse matrix #0 (sm)
a Matlab user would call this program as:
  >> netsolve('sm_prob', SM);
where SM is a Matlab constructed sparse matrix object.

and a C user would invoke this problem as:
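  double *val;        /* nonzero values, stored row by row        */
  int *col_index;     /* column index of each entry of val        */
  int *row_ptr;       /* position in val where each row starts    */

  int rows, num_nzeros;
  int status;

  /* initialize the arrays and variables */
   ...
   ...
   ...

  status = netsl("sm_prob()", rows, num_nzeros, val, col_index, row_ptr);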
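To make the CRS layout described in this section concrete, the following is a minimal sketch of how the three vectors could be filled from a dense, row-major matrix. The function name dense_to_crs and its signature are hypothetical and are not part of NetSolve; the sketch only assumes the zero-based convention used in the example above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a NetSolve routine): convert a dense row-major
 * matrix of size rows x cols to CRS with zero-based indexes.
 * val and col_ind need room for every nonzero, row_ptr for rows + 1 ints.
 * Returns the number of nonzeros stored. */
int dense_to_crs(const double *a, int rows, int cols,
                 double *val, int *col_ind, int *row_ptr)
{
    int nnz = 0;

    assert(a != NULL && val != NULL && col_ind != NULL && row_ptr != NULL);

    for (int i = 0; i < rows; i++) {
        row_ptr[i] = nnz;                      /* where row i starts in val */
        for (int j = 0; j < cols; j++) {
            double x = a[(size_t)i * cols + j];
            if (x != 0.0) {
                val[nnz] = x;                  /* value of the nonzero      */
                col_ind[nnz] = j;              /* its column                */
                nnz++;
            }
        }
    }
    row_ptr[rows] = nnz;                       /* final entry: total count  */
    return nnz;
}
```

Applied to the 4x4 matrix A above, this fills val, col_ind and row_ptr with the values listed in the example, and the returned count is what would be passed as num_nzeros.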

Mnemonics

As described in the section called NetSolve Objects, the NetSolve system defines data structures that we call NetSolve objects. These are high-level objects composed of integers and of arrays of characters and floats. To relate the high-level and low-level descriptions of the input and output objects of a given problem, we need a syntax, which we call mnemonics. A mnemonic is a character string (typically 2 or 3 characters long) that is used to access low-level details of the different input and output objects. We index the list of objects starting at 0. For instance, the first object in input to a problem is input object number 0, and the third object in output is output object number 2. We use an I or an O to specify whether an object is in input or output. Here are the eight types of mnemonics for an object indexed x:

  • Pointer to the data : [I|O]x,

  • Number of rows : m[I|O]x (only for matrices, vectors, packed files and string lists),

  • Number of columns : n[I|O]x (only for matrices),

  • Leading dimensions : l[I|O]x (only for matrices).

  • Special descriptor : d[I|O]x (only for distributed memory objects).

  • Nonzero values of the sparse matrix: f[I|O]x

  • Row pointers for the sparse matrix: i[I|O]x

  • Column indices for the sparse matrix: p[I|O]x

For example, mI4 designates the number of rows of the input object number 4, whereas O1 designates the pointer to the data of output object number 1. In the next section, we describe the different sections that are necessary to build a problem description and will see how the mnemonics are used.
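The mnemonic syntax above is regular enough to decode mechanically. The following sketch shows one possible way to split a mnemonic into its optional prefix letter, its I/O direction and its object index. The struct and function names are hypothetical illustrations and do not reflect how NetSolve itself parses problem description files.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical decoded form of a mnemonic such as "mI4" or "O1". */
struct mnemonic {
    char kind;       /* '\0' for the data pointer, else one of m n l d f i p */
    char direction;  /* 'I' for input, 'O' for output                        */
    int  index;      /* zero-based object number                             */
};

/* Returns 0 on success, -1 if s is not a well-formed mnemonic.
 * out is only written on success. */
int parse_mnemonic(const char *s, struct mnemonic *out)
{
    char kind = '\0';
    char direction;
    char *end;
    long idx;

    assert(out != NULL);
    if (s == NULL || *s == '\0')
        return -1;

    /* Optional one-letter prefix selecting which detail is accessed. */
    if (strchr("mnldfip", *s) != NULL)
        kind = *s++;

    /* Mandatory direction letter. */
    if (*s != 'I' && *s != 'O')
        return -1;
    direction = *s++;

    /* Mandatory decimal object index, with nothing after it. */
    if (!isdigit((unsigned char)*s))
        return -1;
    idx = strtol(s, &end, 10);
    if (*end != '\0' || idx > INT_MAX)
        return -1;

    out->kind = kind;
    out->direction = direction;
    out->index = (int)idx;
    return 0;
}
```

With this sketch, "mI4" decodes to kind 'm', direction 'I', index 4, and "O1" decodes to the data pointer of output object 1.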

Sections of a Problem Description

The structure of a problem description file is very similar to that of a server configuration file. Lines starting with a '#' are considered comments. Keywords are prefixed by a '@' and mark the beginning of sub-sections. In what follows, we describe each section separately, as well as each keyword and sub-section within it. It helps to keep an existing problem description file at hand as a template when reading this section.

Problem ID and General Information

The following keywords are required and must occur in the order in which they are presented.

  • '@PROBLEM <nickname>' specifies the name of a problem as it will be visible to the NetSolve users (clients).

  • '@INCLUDE <name>' specifies a C header file to include (See the example in the section called A Simple Example). There can be several such lines as a problem can call several functions.

  • '@DASHI <path>' specifies a default directory in which header files are to be looked for, in the same way as the -I option of most C compilers. There can be several such lines, one per directory.

  • '@LIB <name>' specifies a library or an object file to link to, or a -L option for the linker (See the example in the section called A Simple Example). If multiple libraries are required, a separate @LIB line must be specified for each library, and the libraries will be linked in the order in which they are specified. The @LIB line(s) can contain variable name substitutions such as $(NETSOLVE_ROOT).

  • '@FUNCTION <name>' specifies the name of a function from the underlying numerical software library that is being called to solve the problem. There can be several such lines as a problem can call several functions.

  • '@LANGUAGE [C|FORTRAN]' specifies whether the underlying numerical library is written in C or in Fortran. This is used in conjunction with the function names specified with '@FUNCTION' to handle multi-language interoperability.

  • '@MAJOR [COL|ROW]' specifies the storage order (column-major or row-major) in which the input matrices should be stored before calling the underlying numerical software. For instance, if the numerical library is LAPACK [lapack], the major must be 'COL'.

  • '@PATH <path>' specifies a path-like name for the problems. This path is only a naming convention and is used for presentation purposes.

  • '@DESCRIPTION' marks the beginning of the textual description of the problem. This sub-section is mandatory as it is used by the NetSolve management tools to provide information to the NetSolve users (clients) about a specific problem.

Input Specification

  • '@INPUT <number>' specifies the number of objects in input to the problem. This line is followed by that corresponding <number> of object descriptions (see below).

  • '@OBJECT <object type> <data type> <name>' specifies an object type, data type, and name. The name is only used for presentation purposes. This line is followed by a mandatory textual description of the object. The data types are abbreviated by replacing NETSOLVE_I by I, NETSOLVE_CHAR by CHAR, NETSOLVE_BYTE by B, NETSOLVE_FLOAT by S, NETSOLVE_DOUBLE by D, NETSOLVE_SCOMPLEX by C, and NETSOLVE_DCOMPLEX by Z (see Table 17-1). Similarly, the object types are abbreviated by replacing NETSOLVE_SCALAR by SCALAR, NETSOLVE_VECTOR by VECTOR, NETSOLVE_MATRIX by MATRIX, NETSOLVE_SPARSEMATRIX by SPARSEMATRIX, NETSOLVE_FILE by FILE, NETSOLVE_PACKEDFILES by PACKEDFILES, NETSOLVE_UPF by UPF, NETSOLVE_STRING by STRING, and NETSOLVE_STRINGLIST by STRINGLIST (see Table 17-2). For objects of object type FILE, STRING, UPF, and PACKEDFILES, no data type is written on the @OBJECT line. Here are a few examples:
    @OBJECT VECTOR I X
    An integer vector named 'X'
    
    @OBJECT MATRIX D A
    A double precision real matrix named 'A'
    
    @OBJECT FILE foo
    A file named 'foo'
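The data type abbreviations listed above amount to a small lookup table. As an illustration only, the sketch below encodes that table in C; the table contents come from the text, while the function name data_type_name is a hypothetical helper and not part of the NetSolve API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Abbreviations as given in the text for the @OBJECT line. */
static const struct {
    const char *abbrev;
    const char *full;
} data_types[] = {
    { "I",    "NETSOLVE_I"        },
    { "CHAR", "NETSOLVE_CHAR"     },
    { "B",    "NETSOLVE_BYTE"     },
    { "S",    "NETSOLVE_FLOAT"    },
    { "D",    "NETSOLVE_DOUBLE"   },
    { "C",    "NETSOLVE_SCOMPLEX" },
    { "Z",    "NETSOLVE_DCOMPLEX" },
};

/* Hypothetical helper: return the full data type name for an abbreviation,
 * or NULL if the abbreviation is unknown. */
const char *data_type_name(const char *abbrev)
{
    assert(abbrev != NULL);
    for (size_t k = 0; k < sizeof data_types / sizeof data_types[0]; k++) {
        if (strcmp(abbrev, data_types[k].abbrev) == 0)
            return data_types[k].full;
    }
    return NULL;
}
```

For the line '@OBJECT MATRIX D A', looking up "D" yields NETSOLVE_DOUBLE, which matches the description "A double precision real matrix".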

Output Specification

  • '@OUTPUT <number>' specifies the number of objects in output from the problem. This line is followed by that corresponding <number> of object descriptions (see below).

  • '@OBJECT <object type> <data type> <name>' specifies an object type, a data type and a name. This line is followed by a mandatory textual description of the object. The abbreviations for data types and object types are as defined previously in the section called Input Specification.

Additional Information

The following tags are optional.

  • '@MATLAB_MERGE <number1>,<number2>' specifies that the output objects number <number1> and <number2> can be merged as a complex object upon receipt of the numerical results from the Matlab client interface (see Chapter 6).

  • '@COMPLEXITY <number1>,<number2>' specifies that given the size of the problem, say n, the asymptotic complexity, say C, of the problem in number of floating point operations is
     C = number1 * n^(number2)

  • '@CUSTOMIZED <name>' is an internal customization used by the code developers. It means that the NetSolve server code will do something different (or custom) before invoking a routine. For example, this option is used for the enablement of ScaLAPACK and the sparse solvers. The functionality of this keyword will be expanded in the future. Novice users are advised to avoid using this keyword.

  • '@PARALLEL MPI' specifies that the software enabled in the problem description file is parallel and uses MPI. Thus, MPI must be installed on the server to which you are enabling this service.
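As a worked instance of the @COMPLEXITY formula, take the line '@COMPLEXITY 3,3' that appears in the section called A Simple Example and assume, purely for illustration, a problem size of n = 1000:

```latex
C = \text{number1} \cdot n^{\text{number2}} = 3 \cdot 1000^{3} = 3 \times 10^{9}
```

that is, about three billion floating point operations. The value n = 1000 is an assumption made for this example only.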

Calling Sequence

The calling sequence to the problem must be defined so that the NetSolve client using the C or Fortran interfaces can call the problem. The material described in this section is ignored by NetSolve when the client is Matlab, Mathematica or Java. To clarify, let us take an example. Let us say that the problem 'toto' takes a matrix in input and returns a matrix in output. The call from the Matlab interface looks like:
	>> [b] = netsolve('toto',a)
for instance. However, there can be several possible calling sequences from C or Fortran. Assuming the following declarations in Fortran:
        DOUBLE PRECISION A(M,N)
        DOUBLE PRECISION B(K,L)
the following calling sequences are all possible:
        CALL FNETSL('toto()',A,B,M,N,K,L)
        CALL FNETSL('toto()',A,M,N,B,K,L)
        CALL FNETSL('toto()',M,N,A,K,L,B)
        etc.....
The Calling Sequence sub-section in the problem description specifies the order of the arguments (represented with mnemonics) in the C and Fortran interface calling sequence. Indeed, still with the same example, the integer N can be represented by the mnemonic nI0, and the pointer B can be represented by the mnemonic O0.

It is very important to note that the number of rows or columns or the leading dimension of input and output arguments must be specified in the @CALLINGSEQUENCE sub-section. If a dimension is not passed as an input argument, or equivalenced with an existing input argument (via @ARG), it must be set/computed using @COMP.

  • '@CALLINGSEQUENCE' marks the beginning of a calling sequence description. This description consists of a list of argument specifications (see below).

  • '@ARG <comma-separated list of mnemonics>' specifies an argument of the calling sequence. For instance the line
              @ARG I0
    specifies that the current argument in the calling sequence is the pointer to the data of the first object in input. The line
              @ARG mI0,lI0
    specifies that the current argument in the calling sequence is the number of rows and the leading dimension of the first object in input (which in this case is a matrix). The line
              @ARG ?
    specifies that the current argument in the calling sequence should be ignored by NetSolve (useful in some cases). Note that no argument description contains mnemonics of the form [m|n]O*.

  • '@CONST <mnemonic>=<number>' specifies that the number of rows or columns or the leading dimension of an input object is constant and cannot be found in the calling sequence. For instance, the line
              @CONST mI4=12
    means that the number of rows of the fifth object in input is always 12 and is not passed in by the NetSolve user.

  • '@COMP <mnemonic>=<expression>' specifies that the number of rows or columns or the leading dimension of an input object has not been supplied as an argument in the calling sequence, but can be computed using arguments in the calling sequence.

    Here are some examples:
    @COMP mI1=mI0
    @COMP mI0=op(+,mI3,1)   // performs an addition
    @COMP mI3=array(I2,0)   // performs an indirection
    @COMP mI1=op(-,array(I0,op(-,mI0,1)),1)
    @COMP mI2=op(+,op(+,array(I1,0),1),op(*,array(I0,0),2))
    @COMP mI2=if(array(I0,0)='N',mI1,if(array(I0,0)='T',nI1,op(-,0,1)))
                               // conditionals
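    where the op notation applies an arithmetic operator (+, - or * in the examples above) to its two operands, the array notation accesses the value of a specific element of an array, and the if notation selects between two values depending on a condition. For example, in the third line mI3 is equal to the value of the zero-th element of the array I2.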

    This feature of NetSolve is rarely used, and is only necessary in routines when the user's array storage differs from the array storage passed to the computational routine. A good example of such an occurrence is in the interfaces to the LAPACK routines for band and tridiagonal matrices.
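To clarify what the @COMP expressions above compute, the following sketch writes four of them as ordinary C functions. It assumes that op(-,a,b) means a - b (operands in the order written) and that array(X,k) is zero-based, as the text suggests. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* @COMP mI0=op(+,mI3,1) */
int comp_add_one(int mI3)
{
    return mI3 + 1;
}

/* @COMP mI1=op(-,array(I0,op(-,mI0,1)),1)
 * i.e. the last element of I0, minus one. */
int comp_last_minus_one(const int *I0, int mI0)
{
    assert(I0 != NULL && mI0 >= 1);
    return I0[mI0 - 1] - 1;
}

/* @COMP mI2=op(+,op(+,array(I1,0),1),op(*,array(I0,0),2)) */
int comp_sum(const int *I0, const int *I1)
{
    assert(I0 != NULL && I1 != NULL);
    return (I1[0] + 1) + (I0[0] * 2);
}

/* @COMP mI2=if(array(I0,0)='N',mI1,if(array(I0,0)='T',nI1,op(-,0,1))) */
int comp_trans(const char *I0, int mI1, int nI1)
{
    assert(I0 != NULL);
    if (I0[0] == 'N')
        return mI1;
    if (I0[0] == 'T')
        return nI1;
    return 0 - 1;   /* neither 'N' nor 'T' */
}
```

The last function mirrors a common pattern in linear algebra interfaces, where a character flag decides whether the row count or the column count of another object gives the dimension.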

Pseudo-Code

  • '@CODE' marks the beginning of the pseudo-code section.

  • '@END_CODE' marks the end of the pseudo-code section.

The pseudo-code is C code that uses the mnemonics described in the section called Mnemonics. This code contains call(s) to the numerical library function(s) that the problem is supposed to use as part of its algorithm. The arguments in the calling sequences of these library routines will be primarily the different mnemonics. In the pseudo-code, each mnemonic is both preceded and followed by a '@' to facilitate the parsing. Let us review again the meaning of some possible mnemonics in the pseudo-code:

  • '@I0@': pointer to the elements of the first object in input.

  • '@mI0@': pointer to an integer that is the number of rows of the first object in input.

  • '@nO1@': pointer to an integer that is the number of columns of the second object in output.

Usually, the pseudo-code is organized in three parts. First, the preparation of the input (if necessary). Second, the call to the numerical library function(s). Third, the update of the output (pointer and sizes). At this point, it is best to give an example. Let us assume that we have access to a hypothetical numerical C library that possesses a function matvec() that performs a matrix-vector multiply for square matrices. The prototype of the function is
void matvec(float *a, float *b, int n, int l);
where a is a pointer to the matrix, b is a pointer to the vector, n is the dimension of the matrix, l is the leading dimension of the matrix and the result is stored in b (overwriting the input). We may define the problem such that the matrix is the first object in the input, the vector the second object in the input, and the result the only object in output. Possible preparations could be for instance the creation of workspace, test of input values to detect mistakes, test of matching dimensions. In this case, we may want to check that the dimension of vector b agrees with the number of columns of matrix a. This can be done as follows:
@CODE
if (*@mI1@ != *@nI0@)
  return NS_PROT_DIM_MISMATCH;
The macro NS_PROT_DIM_MISMATCH is defined by NetSolve. Other macros available are NS_PROT_BAD_VALUES (for invalid input parameters), NS_PROT_INTERNAL_FAILURE (for a malfunction of the numerical software) or NS_PROT_NO_SOLUTION (sometimes useful if no numerical solution has been found and the client is interactive). Notice the use of '*' for accessing the integers at addresses @mI1@ and @nI0@.

The second part of the pseudo-code consists of calling the function matvec and is:
matvec(@I0@,@I1@,*@mI0@,*@mI0@);
A few things can be said about this call. First, we use the '*' to access integers via the pointers. Note that if matvec() were a Fortran subroutine, we would pass the addresses themselves (see the section called A Simple Example). Second, the leading dimension is taken to be equal to the dimension. This code is executed at the server level, where the matrix (or sub-matrix) has been received from the client over the network. As such, it has been stored contiguously in memory and has a leading dimension equal to its number of rows. As a general rule, the mnemonics @l[I|O]*@ never appear in the pseudo-code. The last thing to do at this point is to update the output:
@O0@ = @I1@;
*@mO0@ = *@mI1@;
@END_CODE
The first line expresses the fact that the input has been overwritten by the output. The second line sets the number of rows of the output. The following section gives a complete example, with all of the sections of the problem description.
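The three parts of the pseudo-code can be seen more plainly once each @...@ mnemonic is replaced by an ordinary C variable. The sketch below does this under stated assumptions: matvec() is a stand-in implementation that assumes column-major storage, the NSX_* constants are placeholders for the NS_PROT_* macros, and run_matvec() is a hypothetical wrapper, not the code NetSolve actually generates.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholders for the NetSolve NS_PROT_* macros (values are assumptions). */
enum { NSX_OK = 0, NSX_DIM_MISMATCH = -1 };

/* Stand-in for the hypothetical library routine: b <- A*b for an n x n
 * matrix stored column-major with leading dimension l (l >= n). */
void matvec(float *a, float *b, int n, int l)
{
    float *tmp;

    if (n <= 0)
        return;
    tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        abort();                       /* no way to report failure here */
    for (int i = 0; i < n; i++) {
        tmp[i] = 0.0f;
        for (int j = 0; j < n; j++)
            tmp[i] += a[i + (size_t)j * l] * b[j];
    }
    for (int i = 0; i < n; i++)
        b[i] = tmp[i];                 /* overwrite the input vector */
    free(tmp);
}

/* The pseudo-code with mnemonics spelled as plain variables. */
int run_matvec(float *I0, int *mI0, int *nI0,   /* input 0: the matrix  */
               float *I1, int *mI1,             /* input 1: the vector  */
               float **O0, int *mO0)            /* output 0: the result */
{
    assert(I0 && mI0 && nI0 && I1 && mI1 && O0 && mO0);

    if (*mI1 != *nI0)                  /* 1. check the input           */
        return NSX_DIM_MISMATCH;

    matvec(I0, I1, *mI0, *mI0);        /* 2. call the library routine  */

    *O0 = I1;                          /* 3. output aliases the input  */
    *mO0 = *mI1;                       /*    and has as many rows      */
    return NSX_OK;
}
```

Note that the output pointer is simply set to the overwritten input vector, so no copy of the result is made.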

A Simple Example

Let us imagine that we have access to a Fortran numerical library that contains a function, say LINSOL, to solve a linear system according to the following prototype:
SUBROUTINE LINSOL( A, B, N, NRHS, LDA, LDB )

DOUBLE PRECISION A( LDA, * )  ! Left-hand side (NxN)
DOUBLE PRECISION B( LDB, * )  ! Right-hand side (NxNRHS),
                              ! overwritten with the solution
INTEGER N
INTEGER NRHS
INTEGER LDA                   ! Leading Dimension of A
INTEGER LDB                   ! Leading Dimension of B
Then, an appropriate description for a problem that solves a linear system using LINSOL and that expects from the client the same calling sequence as the one for LINSOL is:
@PROBLEM linsol
@INCLUDE <math.h>
@INCLUDE "/home/me/my_header.h"
@LIB -L/home/lib/
@LIB -lstuff
@LIB /home/me/lib_$(NETSOLVE_ARCH).a
@LIB /home/stuff/add.o
@FUNCTION linsol
@LANGUAGE FORTRAN
@MAJOR COL
@PATH    LinearAlgebra/LinearSystems/
@DESCRIPTION
Solves the square linear system A*X = B. Where:
 A is a double-precision matrix of dimension NxN
 B is a double-precision matrix of dimension NxNRHS
 X is the solution
@INPUT 2
@OBJECT MATRIX D A 
Matrix A (NxN)
@OBJECT MATRIX D B
Matrix B (NxNRHS)
@OUTPUT 1
@OBJECT MATRIX D X
Solution X (NxNRHS)
@COMPLEXITY 3,3
@CALLINGSEQUENCE 
@ARG I0
@ARG I1,O0
@ARG nI0,mI0,mI1
@ARG nI1
@ARG lI0
@ARG lI1,lO0
@CODE

linsol(@I0@,@I1@,@mI0@,@nI1@,@mI0@,@mI1@);

@O0@ =@I1@;       /* Pointing to the overwritten input */
*@mO0@ = *@mI1@;  /* Setting the number of rows        */
*@nO0@ = *@nI1@;  /* Setting the number of columns     */

@END_CODE
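The pseudo-code above passes the mnemonics to linsol without a '*', because a Fortran subroutine receives the address of every argument. The following sketch illustrates that convention with a C stand-in that has the same argument list as LINSOL. It is an assumption-laden illustration: the name linsol_ (lowercase with a trailing underscore) reflects a common but compiler-dependent Fortran naming scheme, and the solver itself is a plain Gaussian elimination with partial pivoting written for this example, not the routine of any real library. It does not check for a singular matrix.

```c
#include <assert.h>
#include <stddef.h>

static double absd(double x) { return x < 0.0 ? -x : x; }

/* Stand-in with the Fortran-style interface of LINSOL: every argument is an
 * address and matrices are column-major. Solves A*X = B, overwriting B with
 * X (A is overwritten as well). */
void linsol_(double *a, double *b, int *n, int *nrhs, int *lda, int *ldb)
{
    int N = *n, R = *nrhs, LA = *lda, LB = *ldb;

    assert(a != NULL && b != NULL && LA >= N && LB >= N);

#define A(i, j) a[(i) + (size_t)(j) * LA]
#define B(i, j) b[(i) + (size_t)(j) * LB]
    for (int k = 0; k < N; k++) {
        int p = k;                              /* pivot row search */
        for (int i = k + 1; i < N; i++)
            if (absd(A(i, k)) > absd(A(p, k)))
                p = i;
        if (p != k) {                           /* swap rows k and p */
            for (int j = 0; j < N; j++) { double t = A(k, j); A(k, j) = A(p, j); A(p, j) = t; }
            for (int j = 0; j < R; j++) { double t = B(k, j); B(k, j) = B(p, j); B(p, j) = t; }
        }
        for (int i = k + 1; i < N; i++) {       /* eliminate below the pivot */
            double f = A(i, k) / A(k, k);
            for (int j = k; j < N; j++) A(i, j) -= f * A(k, j);
            for (int j = 0; j < R; j++) B(i, j) -= f * B(k, j);
        }
    }
    for (int j = 0; j < R; j++) {               /* back substitution */
        for (int i = N - 1; i >= 0; i--) {
            double s = B(i, j);
            for (int c = i + 1; c < N; c++) s -= A(i, c) * B(c, j);
            B(i, j) = s / A(i, i);
        }
    }
#undef A
#undef B
}
```

A caller therefore writes linsol_(a, b, &n, &nrhs, &lda, &ldb), passing addresses for the integer arguments, which is exactly what the mnemonics already are in the pseudo-code.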

PDF Generator

The process of creating new problem descriptions can be difficult, especially for a first time user. It is true that after writing a few files, it becomes rather routine and several NetSolve users have already generated a good number of working PDFs for a variety of purposes (including linear algebra, optimization, image processing, etc.). However, we have designed a graphical Java GUI application that helps users in creating PDFs. To compile this GUI, type
UNIX> make pdgui
from the $NETSOLVE_ROOT directory. This creates the set of Java class files needed to run the GUI application and places them in the $NETSOLVE_ROOT/bin/$NETSOLVE_ARCH directory. The compilation also produces a shell script named NS_pdgui, which can be run from any directory and locates the above-mentioned class files so that the GUI application starts properly. The GUI can be used to create PDFs and load them into NetSolve. Apart from being easy to use, the GUI also has a help menu (not implemented yet), and we defer other details about running the GUI to those help files. The user has the option of storing PDFs in nspdf format only, or in both nspdf and xmlpdf formats. A PDF can only be loaded if it has been stored in xmlpdf format. Because a PDF stored in xmlpdf format can be reloaded, there is no need to keep the GUI open until the PDF is correct; the user only needs to make sure that the PDF has been stored in xmlpdf format before closing the GUI.