October 8-9, 1998

NIST, Washington, D.C.

October 8, 1998

Jack opened the meeting at 9am and we began by addressing fundamental questions about the structure of the document, as well as stressing the importance of reference implementations for all proposed routines. These fundamental questions were:

We began with a cursory discussion of points #3 and #4. We need a consistent representation of matrices for all chapters, which is not currently the case. In addition, the user-level Sparse BLAS proposal is matrix-type neutral; dense and banded matrices are special cases of it. Should the BLAS support these generic interfaces across the board?

Concerning point #1, the majority of the BLAST attendees felt that the document is too long. It was then asked whether we want to reduce the functionality. The answer was yes; however, we want the Legacy BLAS to be a subset of this new proposal. The options to support were:

Several of the vendors felt that users would prefer the first option (a). However, the Forum attendees preferred the second option (b).

It was then suggested that maybe we should only do a limited set of bindings? Only F90 and C? Only C and Fortran77?

Roldan Pozo of NIST took the floor to discuss an idea for a condensed LIS/LDS proposal. An example of this proposal was to be drawn up during lunch break.

A straw vote was then proposed on whether to remove Appendix B2 (Extensions) and move these routines into Chapter 2. If we don't have F77 language support in the new routines, then we would need to keep this Extensions section.

A straw vote was then taken on which languages to support in the document.

Considering time constraints for the Forum and the possible changes to the Java language, a straw vote was then taken to not include C++ and Java language bindings in this document -- passed 12/0/0. These bindings could be discussed at a future Forum.

It was then suggested that we should take binding formal votes on these issues. Ten eligible voters were present -- UT, NAG, HP/Convex, Tera, NEC, Berkeley, AT&T, Florida, NIST, and Intel. One additional eligible voter appeared later in the meeting -- University of Houston.

Formal votes were then taken on:

What do we do about LIS? This issue was postponed until after lunch.

Next came the question of a vote on data representation. Should it be set in the standard or be different for each binding, that is, a fixed data structure or separate possible data structures for each language? This issue was postponed until after lunch.

At 12 noon, we broke for lunch.

After lunch (1pm-2pm), work began on deriving this new condensed LIS/LDS combined representation.

Roldan then presented the new structure at 2pm. Since the F95 interface differs significantly, the question arose whether it should be listed with the C and Fortran77 interfaces.

The pros and cons of allowing both row and column storage for Fortran77 were then discussed. Should an ORDER parameter be mandated? A formal vote on this failed 1/9/0.
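For readers unfamiliar with the row/column storage question, the following is a minimal sketch of what an ordering parameter changes when addressing a dense matrix. The names (`enum order`, `elem_offset`, `get_elem`) are hypothetical and chosen for illustration only; they are not taken from the proposal under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ordering flag, for illustration only. */
enum order { ROW_MAJOR, COL_MAJOR };

/* Offset of element (i, j) in an array with leading dimension ld.
 * Row-major: rows are contiguous. Column-major: columns are contiguous. */
size_t elem_offset(enum order ord, size_t i, size_t j, size_t ld)
{
    return (ord == ROW_MAJOR) ? i * ld + j : j * ld + i;
}

/* Read element (i, j) of an m-by-n matrix stored in a. */
double get_elem(enum order ord, const double *a,
                size_t m, size_t n, size_t ld, size_t i, size_t j)
{
    assert(i < m && j < n);
    /* The leading dimension must cover one full row or column. */
    assert(ld >= (ord == ROW_MAJOR ? n : m));
    return a[elem_offset(ord, i, j, ld)];
}
```

The same six numbers in memory therefore describe two different 2-by-3 matrices depending on the flag, which is why the choice affects every routine that takes a matrix argument.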

It was then proposed to put all 3 bindings together and then have separate notes for each with cross references. We should allow the op(A) and op(B) notation, and the ordering of bindings shall be F95, F77, C. Formal votes were taken:

The motivation for all of these options was to make the document shorter.

We then returned to the discussion of the functionality in the tables, beginning with Table 2.1 (Reduction Operations). Votes were taken on the following:

A brief break was taken at 3:50pm.

Discussion now began on Table 2.2 (Vector Operations).

Discussion now began on Table 2.3:

This process was then postponed until the next day, and the issue of data representation was again addressed. Should we support two interfaces -- high level and generic interface? A handle would be passed as an argument to every routine. This object-based approach would greatly reduce the number of routines.

Clint Whaley of UT then took the floor about data representation of matrices in C. The pros and cons of

were then discussed briefly. It was stressed that this issue must be voted on before this meeting is concluded.

The meeting adjourned for the day at 6:00pm.

October 9, 1998

Jack opened the meeting at 9:00am and began discussing the planning for the next meeting (3 days in duration) for either December 7-9 or December 14-16.

Linda Kaufman of Bell Labs then started with Level 2 BLAS routine deletions and Table 2.4 (Matrix-Vector Operations):

Discussion then turned to the triangular solve functionality, which differs between the Dense chapter and the Sparse chapter. Let's keep x <- alpha*T^(-1)*x, and y <- alpha*T^(-1)*x + beta*y (for sparse).
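As an illustration of the first operation, x <- alpha*T^(-1)*x, here is a minimal sketch using back substitution. It assumes an upper-triangular, non-unit-diagonal T stored row-major with leading dimension n; the function name `tri_solve_scaled` and this storage choice are assumptions made for the example, not the interface voted on.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrite x with alpha * T^(-1) * x, i.e. solve T*z = alpha*x in place.
 * t holds an n-by-n upper-triangular matrix, row-major, leading dimension n.
 * Entries below the diagonal are never read. */
void tri_solve_scaled(size_t n, double alpha, const double *t, double *x)
{
    /* Back substitution: the last unknown first, then upward. */
    for (size_t k = n; k-- > 0; ) {
        double s = alpha * x[k];
        /* x[j] for j > k already holds the solved component z[j]. */
        for (size_t j = k + 1; j < n; ++j)
            s -= t[k * n + j] * x[j];
        assert(t[k * n + k] != 0.0);   /* T must be nonsingular */
        x[k] = s / t[k * n + k];
    }
}
```

The sparse form, y <- alpha*T^(-1)*x + beta*y, leaves x untouched and accumulates into a separate vector, so it cannot be done purely in place on x.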

Discussion then began on the rank1 updates...

Gary Howell then stood up and addressed the SY_MVER and TR_MV2 routines, and their need in the functionality tables. We need an easier description that encapsulates w=Ax and z=A^T*y.
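The pair of products mentioned here, w=Ax and z=A^T*y, can be formed in a single sweep over A, which is a minimal sketch of why a combined routine might be attractive. The name `ge_mv_pair` and the row-major storage are assumptions for this example; the sketch does not reproduce the definitions of SY_MVER or TR_MV2.

```c
#include <assert.h>
#include <stddef.h>

/* Compute w = A*x (length m) and z = A^T*y (length n) while reading
 * each entry of the m-by-n row-major matrix a exactly once. */
void ge_mv_pair(size_t m, size_t n, const double *a,
                const double *x, const double *y,
                double *w, double *z)
{
    assert(a && x && y && w && z);
    for (size_t j = 0; j < n; ++j)
        z[j] = 0.0;
    for (size_t i = 0; i < m; ++i) {
        double wi = 0.0;
        for (size_t j = 0; j < n; ++j) {
            double aij = a[i * n + j];
            wi   += aij * x[j];   /* row i contributes to w[i]      */
            z[j] += aij * y[i];   /* and to every component of z    */
        }
        w[i] = wi;
    }
}
```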

Linda then continued with Table 2.4:

Break at 10:30am. After the break, we would then address the Sparse chapter and the generic interface, Tables 2.5-2.8, Interval BLAS, Extended Precision BLAS, storage format, and the date for the next meeting.

At 10:50am we addressed the Sparse chapter. There are five Level 1 operations, nine storage formats, and two Level 2/3 operations, for a total of 171 functions per precision. Roldan Pozo of NIST proposed a modified layout for the Sparse BLAS with the following properties: only 13 routines (down from 171); direct initialization from the 9 formats; the same interface for C and Fortran; no DESCRA; no get/set matrix-property functions; no requirement for C/F90 structures; a "Lite" interface can be added later with no modifications; and the User and Toolkit Levels are integrated. He presented an example of matrix-vector multiply with coordinate storage.
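The example presented at the meeting is not reproduced in these minutes. As a rough sketch of the general idea (create a handle from coordinate-format arrays, then call a format-independent multiply), something like the following could be written. All names (`sp_matrix`, `sp_handle`, `sp_create_coo`, `sp_mv`, `sp_destroy`) are hypothetical and should not be read as the proposed interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical opaque matrix object; only coordinate storage is shown. */
typedef struct sp_matrix {
    size_t m, n, nnz;
    const size_t *row, *col;   /* borrowed, not copied */
    const double *val;         /* borrowed, not copied */
} sp_matrix;
typedef sp_matrix *sp_handle;

/* Create a handle from coordinate (row, col, value) triples.
 * Returns NULL if allocation fails. */
sp_handle sp_create_coo(size_t m, size_t n, size_t nnz,
                        const size_t *row, const size_t *col,
                        const double *val)
{
    sp_handle h = malloc(sizeof *h);
    if (!h)
        return NULL;
    h->m = m; h->n = n; h->nnz = nnz;
    h->row = row; h->col = col; h->val = val;
    return h;
}

/* y <- alpha*A*x + beta*y; the caller never sees the storage format. */
void sp_mv(double alpha, const sp_matrix *a, const double *x,
           double beta, double *y)
{
    assert(a && x && y);
    for (size_t i = 0; i < a->m; ++i)
        y[i] *= beta;
    for (size_t k = 0; k < a->nnz; ++k) {
        assert(a->row[k] < a->m && a->col[k] < a->n);
        y[a->row[k]] += alpha * a->val[k] * x[a->col[k]];
    }
}

/* Release the handle; the borrowed arrays remain owned by the caller. */
void sp_destroy(sp_handle h)
{
    free(h);
}
```

In a design of this kind, supporting another storage format means adding one creation routine rather than a new copy of every computational routine, which is how the routine count can drop so sharply.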

Linda Kaufman of Bell Labs proposed that we need to add a new storage format (symmetric structure) for finite element methods. Jim Demmel of UC Berkeley seconded that proposal.

Roldan then addressed the issue of generic interfaces. It is more difficult for dense matrices, since you would need to create a new handle for each submatrix access. Generic versions of MV and MM would apply to dense matrices too. The subject of "hinting" functions was also discussed. Do we want this functionality in low-level computational kernels? Do we want two levels of interface, a low-level and a generic interface?

Clint Whaley of UT proposed that, for performance considerations, if we have a high-level generic interface, we would need to specify the order of magnitude (1 time, 1000 times) of how often an operation will happen. You can optimize better if given that amount of information. The question then arose whether this hint should be a number or a yes/no flag.

A formal vote was suggested on a generic interface to only the sparse creation routine. Does it apply to dense? Some attendees thought it only applied to sparse because we only do it on creation. There is no creation routine for dense.

A formal vote was then suggested on whether to have handles for sparse matrices. Do we allow both interfaces to be visible to the user, or just the top-level generic interface?

At 12 noon we took a short break for lunch.

At 12:30pm, Chenyi Hu of Univ of Houston talked about the comments he received on the Interval BLAS chapter. He addressed the addition of more references and functionality/performance versus simplicity.

In the current document, we do not address the storage of interval endpoints. The reference implementation stores all left endpoints and then stores all right endpoints. We need more information. It complicates the interface and complicates Levels 2 and 3 even more. Currently there is a C++ reference implementation, but since C++ was voted out of this document, he will produce a C interface. A Fortran95 interface is also in progress.
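The layout described for the reference implementation (all left endpoints, then all right endpoints) can be pictured with a short sketch. It assumes a single array of length 2n and uses hypothetical names (`iv_lo`, `iv_hi`, `iv_add`). The addition shown uses ordinary rounding, so it is not a rigorous enclosure; a real interval routine would round the left endpoint down and the right endpoint up.

```c
#include <assert.h>
#include <stddef.h>

/* An interval vector of length n is stored in 2n doubles:
 * v[0..n-1] are the left endpoints, v[n..2n-1] the right endpoints. */
double iv_lo(const double *v, size_t n, size_t i)
{
    assert(i < n);
    return v[i];
}

double iv_hi(const double *v, size_t n, size_t i)
{
    assert(i < n);
    return v[n + i];
}

/* z = x + y componentwise: [a,b] + [c,d] = [a+c, b+d].
 * Directed rounding is deliberately omitted in this sketch. */
void iv_add(size_t n, const double *x, const double *y, double *z)
{
    for (size_t i = 0; i < n; ++i) {
        z[i]     = x[i] + y[i];
        z[n + i] = x[n + i] + y[n + i];
    }
}
```

With this layout the two endpoint arrays can each be passed to an ordinary dense routine, whereas an interleaved layout (left, right, left, right, ...) would keep each interval contiguous instead.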

Jack briefly spoke at 12:55pm to settle the date and place of the next meeting. UT or Berkeley? The date was set for December 14-16, and the meeting will be hosted by the Univ of Tennessee in Knoxville. The discussion then turned to updating the document and the reference implementations. The tables in Chapter 1 will have a column to specify applicability to dense, sparse, extended, or interval BLAS. UT, NAG, and Bell Labs are responsible for updating Chapters 1 and 2 of the document. NIST and Sandia will update the sparse chapter. UC Berkeley will update the Extended Precision chapter, and the Univ of Houston will update the Interval BLAS chapter. NAG is also responsible for updating the "Fortran 95 interface to the Legacy BLAS" section of the appendix.

We will require a reference implementation and test suite for all proposed routines. Should we set a deliverable date for the reference implementations? We should have something by the March SIAM meeting in San Antonio, if at all possible. There will be a minisymposium for the 25th anniversary of the BLAS at that meeting.

Jim Demmel gave a brief (2 min.) presentation of recently developed test cases for the Extended Precision routines, which involve using Hadamard matrices to generate an operation exhibiting perfect cancellation.
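The minutes do not record how these test cases were constructed. The sketch below shows only the underlying property that makes Hadamard matrices a natural source of cancellation: any two distinct rows are orthogonal, so their dot product consists of +1 and -1 terms that cancel exactly. It uses the Sylvester construction for orders that are powers of two; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Entry (i, j) of the Sylvester Hadamard matrix of order 2^k:
 * +1 if i & j has an even number of set bits, -1 otherwise. */
int hadamard_entry(unsigned i, unsigned j)
{
    unsigned b = i & j;
    int sign = 1;
    while (b) {
        sign = -sign;
        b &= b - 1;   /* clear the lowest set bit */
    }
    return sign;
}

/* Dot product of rows r and s of the order-by-order matrix.
 * Exactly 0 for r != s, exactly order for r == s. */
double hadamard_row_dot(unsigned order, unsigned r, unsigned s)
{
    /* order must be a power of two for this construction */
    assert(order != 0 && (order & (order - 1)) == 0);
    assert(r < order && s < order);
    double sum = 0.0;
    for (unsigned j = 0; j < order; ++j)
        sum += (double)(hadamard_entry(r, j) * hadamard_entry(s, j));
    return sum;
}
```

A dot product whose exact value is known to be zero gives a test an exact reference answer, so any nonzero result in a perturbed version of the problem can be attributed to the arithmetic being tested.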

A formal vote was called on the storage format question (for dense matrices):

Formal votes on functionality from Table 2.5:

Formal votes on functionality from Table 2.7:

Voting was deferred on Multiple Instance functionality (Table 2.8).

Formal votes on functionality from Table 2.6:

Outline of some outstanding "homework":

The meeting adjourned at 3:00pm.

List of attendees:

Attendees list for the October 8-9, 1998 BLAST Forum Meeting

Susan Blackford      UT, Knoxville
Jim Demmel           UC Berkeley
Jack Dongarra        UT / ORNL  
Bruce Greer          Intel      
Sven Hammarling      NAG, UK    
Gary Howell          Florida Tech
Chenyi Hu            Univ of Houston
Hsin-Ying Lin        HP Convex Tech. Ctr.
Linda Kaufman        Bell Labs  
Kristi Maschhoff     Tera Computer
Antoine Petitet      UT, Knoxville
Roldan Pozo          NIST       
Karin Remington      NIST       
Clint Whaley         UT, Knoxville
Chao Yang            NEC        

Susan Blackford and Karin Remington agreed to take minutes for the meeting.