Index for University of Tennessee Technical Reports

<title>University of Tennessee Technical Reports</title>
<h1>Index for University of Tennessee Technical Reports</h1>

Click <A HREF="http://www.netlib.org/master_counts2.html#tennessee">here</A> to see the number of accesses to this library.
<p>
<hr>
<form action="http://netlib2.cs.utk.edu/cgi-bin/wais_tennessee.pl">
This is a searchable index.  Enter search keywords: 
<input name="isindex" size=40>
</form>
<hr>

<pre>
file	<a href="ut-cs-91-131.ps">ut-cs-91-131.ps</a>
by	Ed Anderson, Z. Bai &amp; Jack Dongarra,
title	LAPACK Working Note 31:  Generalized QR Factorization 
,	and Its Applications,
ref	University of Tennessee Technical Report CS-91-131, 
,	April 1991.
for	The purpose of this paper is to reintroduce the 
,	generalized QR factorization with or without pivoting 
,	of two matrices A and B having the same number of
,	rows.  When B is square and nonsingular, the 
,	factorization implicitly gives the orthogonal 
,	factorization of B^{-1}A.  Continuing the work of 
,	Paige [20] and Hammarling [12], we discuss the 
,	different forms of the factorization from the point of 
,	view of general-purpose software development.  In 
,	addition, we demonstrate the applications of the GQR 
,	factorization in solving the linear equality 
,	constrained least squares problem and the generalized 
,	linear regression problem, and in estimating the 
,	conditioning of these problems.

file	<a href="sc91.ps">sc91.ps</a>
by	Adam Beguelin, Jack J. Dongarra, G.A. Geist, Robert 
,	Manchek, &amp; V.S. Sunderam,
title	Graphical Development Tools for Network-Based 
,	Concurrent Supercomputing,
ref	Proceedings of Supercomputing '91, pp. 435-444, 
,	Albuquerque, New Mexico, November 1991.
for	This paper describes an X-window based software 
,	environment called HeNCE (Heterogeneous Network 
,	Computing Environment) designed to assist scientists 
,	in developing parallel programs that run on a network 
,	of computers.  HeNCE is built on top of a software 
,	package called PVM which supports process management
,	and communication between a network of heterogeneous 
,	computers.  HeNCE is based on a parallel programming 
,	paradigm where an application program can be described 
,	by a graph.  Nodes of the graph represent subroutines 
,	and the arcs represent data dependencies.  HeNCE is 
,	composed of integrated graphical tools for creating, 
,	compiling, executing, and analyzing HeNCE programs.

file	<a href="ut-cs-91-136.ps">ut-cs-91-136.ps</a>
by	Adam Beguelin, Jack Dongarra, Al Geist, Robert 
,	Manchek, &amp; Vaidy Sunderam,
title	A Users' Guide to PVM Parallel Virtual Machine,
ref	University of Tennessee Technical Report CS-91-136, 
,	July 1991.
for	This report is the PVM version 2.3 users' guide.  It 
,	contains an overview of PVM and how it is installed 
,	and used.  Example programs in C and Fortran are 
,	included.
,
,	PVM stands for Parallel Virtual Machine.  It is a 
,	software package that allows the utilization of a 
,	heterogeneous network of parallel and serial computers 
,	as a single computational resource.  PVM consists of 
,	two parts:  a daemon process that any user can install 
,	on a machine, and a user library that contains 
,	routines for initiating processes on other machines, 
,	for communicating between processes, and synchronizing 
,	processes.

file	<a href="ornl-tm-11850.ps">ornl-tm-11850.ps</a>
by	Jean R.S. Blair &amp; Barry W. Peyton,
title	On Finding Minimum-Diameter Clique Trees,
ref	Oak Ridge National Laboratory Technical Report 
,	ORNL/TM-11850, Oak Ridge National Laboratory, Oak 
,	Ridge, Tennessee, August 1991. 
for	It is well-known that any chordal graph can be 
,	represented as a clique tree (acyclic hypergraph, 
,	join tree).  Since some chordal graphs have many 
,	distinct clique tree representations, it is 
,	interesting to consider which one is most desirable 
,	under various circumstances.  A clique tree of minimum 
,	diameter (or height) is sometimes a natural candidate 
,	when choosing clique trees to be processed in a 
,	parallel computing environment.  
,
,	This paper introduces a linear time algorithm for 
,	computing a minimum-diameter clique tree.  The new 
,	algorithm is an analogue of the natural greedy 
,	algorithm for rooting an ordinary tree in order to 
,	minimize its height.  It has potential application in 
,	the development of parallel algorithms for both 
,	knowledge-based systems and the solution of sparse 
,	linear systems of equations.  

file	<a href="ornl-tm-12318.ps">ornl-tm-12318.ps</a>
by	Jack J. Dongarra, Thomas H. Rowan, and Reed C. Wade
title	Software Distribution Using XNETLIB
ref	Oak Ridge National Laboratory Technical Report ORNL/TM-12318
,	June, 1993
for	Xnetlib is a new tool for software distribution.  Whereas its
,	predecessor netlib uses e-mail as the user interface
,	to its large collection of public-domain mathematical software,
,	Xnetlib uses an X Window interface and socket-based communication.
,	Xnetlib makes it easy to search through a large distributed 
,	collection of software and to retrieve requested software in seconds.

file	<a href="ut-cs-91-141.ps">ut-cs-91-141.ps</a>
by	James Demmel, Jack Dongarra, &amp; W. Kahan,
title	LAPACK Working Note 39:  On Designing Portable High 
,	Performance Numerical Libraries,
ref	University of Tennessee Technical Report CS-91-141, 
,	July 1991.
for	High quality portable numerical libraries have existed 
,	for many years.  These libraries, such as LINPACK and 
,	EISPACK, were designed to be accurate, robust, 
,	efficient and portable in a Fortran environment of 
,	conventional uniprocessors, diverse floating point 
,	arithmetics, and limited input data structures.  
,	These libraries are no longer adequate on modern high 
,	performance computer architectures.  We describe their 
,	inadequacies and how we are addressing them in the 
,	LAPACK project, a library of numerical linear algebra 
,	routines designed to supplant LINPACK and EISPACK.  We 
,	shall now show how the new architectures lead to 
,	important changes in the goals as well as the methods 
,	of library design.

file	<a href="ut-cs-89-85.ps">ut-cs-89-85.ps</a>
by	Jack J. Dongarra, 
title	Performance of Various Computers Using Standard Linear 
,	Equations Software,
ref	University of Tennessee Technical Report CS-89-85, 
,	December 1990.
for	This report compares the performance of different 
,	computer systems in solving dense systems of linear 
,	equations.  The comparison involves approximately a
,	hundred computers, ranging from a CRAY Y-MP to 
,	scientific workstations such as the Apollo and Sun to 
,	IBM PCs.

file	<a href="ut-cs-91-134.ps">ut-cs-91-134.ps</a>
by	Jack Dongarra,
title	LAPACK Working Note 34:  Workshop on the BLACS,
ref	University of Tennessee Technical Report CS-91-134, 
,	May 1991.
for	Forty-three people met on March 28, 1991, to discuss a 
,	set of Basic Linear Algebra Communication Subprograms 
,	(BLACS).  This set of routines is motivated by the 
,	needs of distributed memory computers.

file	<a href="pc.v17.10.ps">pc.v17.10.ps</a>
by	Jack Dongarra, Mark Furtney, Steve Reinhardt, &amp; 
,	Jerry Russell,
title	Parallel Loops -- A Test Suite for Parallelizing 
,	Compilers:  Description and Example Results,
ref	Parallel Computing 17 (1991), pp. 1247-1255.
for	Several multiprocessor systems are now commercially 
,	available, and advances in compiler technology provide 
,	automatic conversion of programs to run on such 
,	systems.  However, no accepted measure of this 
,	parallel compiler ability exists. This paper presents 
,	a test suite of subroutines and loops, called Parallel 
,	Loops, designed to (1) measure the ability of 
,	parallelizing compilers to convert code to run in 
,	parallel and (2) determine how effectively parallel 
,	hardware and software work together to achieve high 
,	performance across a range of problem sizes.  In 
,	addition, we present the results of compiling this 
,	suite using two commercially available parallelizing 
,	Fortran compilers, Cray and Convex.

file	<a href="ut-cs-91-146.ps">ut-cs-91-146.ps</a>
by	Jack Dongarra &amp; Bill Rosener,
title	NA-NET: Numerical Analysis NET,
ref	University of Tennessee Technical Report CS-91-146, 
,	September 1991.
for	The NA-NET is a mail facility created to allow 
,	numerical analysts (na) an easy method of 
,	communicating with one another.  The main advantage of 
,	the NA-NET is uniformity of addressing.  All mail is 
,	addressed to the Internet host ``na-net.ornl.gov'' at 
,	Oak Ridge National Laboratory.  Hence, members of the 
,	NA-NET do not need to remember complicated addresses 
,	or even where a member is currently located.  This 
,	paper describes the software.

file	<a href="ut-cs-91-137.ps">ut-cs-91-137.ps</a>
by	Jack J. Dongarra &amp; Majed Sidani,
title	A Parallel Algorithm for the Non-Symmetric Eigenvalue 
,	Problem,
ref	University of Tennessee Technical Report CS-91-137, 
,	July 30, 1991.
for	This paper describes a parallel algorithm for 
,	computing the eigenvalues and eigenvectors of a 
,	non-symmetric matrix.  The algorithm is based on a 
,	divide-and-conquer procedure and uses an iterative 
,	refinement technique.

file	<a href="ut-cs-91-138.ps">ut-cs-91-138.ps</a>
by	Jack Dongarra &amp; Robert A. van de Geijn,
title	LAPACK Working Note 37:  Two Dimensional Basic Linear 
,	Algebra Communication Subprograms,
ref	University of Tennessee Technical Report CS-91-138, 
,	October 28, 1991.
for	In this paper, we describe extensions to a proposed 
,	set of linear algebra communication routines for 
,	communicating and manipulating data structures that
,	are distributed among the memories of a distributed 
,	memory MIMD computer.  In particular, recent 
,	experience shows that higher performance can be 
,	attained on such architectures when parallel dense 
,	matrix algorithms utilize a data distribution that 
,	views the computational nodes as a logical two 
,	dimensional mesh.  The motivation for the BLACS 
,	continues to be to increase portability, efficiency 
,	and modularity at a high level.  The BLACS are 
,	intended for mathematical software experts and people 
,	with large scale scientific computation to perform.  
,	A systematic effort must be made to achieve a de facto 
,	standard for the BLACS.

file	<a href="ut-cs-91-130.ps">ut-cs-91-130.ps</a>
by	Jack Dongarra &amp; Robert A. van de Geijn,
title	Reduction to Condensed Form for the Eigenvalue Problem 
,	on Distributed Memory Architectures,
ref	University of Tennessee Technical Report CS-91-130, 
,	April 30, 1991.
for	In this paper, we describe a parallel implementation 
,	for the reduction of general and symmetric matrices to 
,	Hessenberg and tridiagonal form, respectively.  The 
,	methods are based on LAPACK sequential codes and use a 
,	panel-wrapped mapping of matrices to nodes.  Results 
,	from experiments on the Intel Touchstone Delta are 
,	given.

file	<a href="icci91.ps">icci91.ps</a>
by	Eric S. Kirsch &amp; Jean R.S. Blair,
title	Practical Parallel Algorithms for Chordal Graphs,
ref	pp. 372-382 in Proceedings of the International 
,	Conference on Computing and Information (ICCI '91)--
,	Advances in Computing and Information, Ottawa, Canada, 
,	May 1991. 
for	Until recently, a large majority of theoretical work 
,	in parallel algorithms has ignored communication costs 
,	and other realities of parallel computing.  This paper 
,	attempts to address this issue by developing parallel 
,	algorithms that not only are efficient using standard 
,	theoretical analysis techniques, but also require a 
,	minimal amount of communication.  The specific 
,	parallel algorithms developed here include one to find 
,	the set of maximal cliques and one to find a perfect 
,	elimination ordering of a chordal graph.

file	<a href="vector.ps">vector.ps</a>
by	David Levine, David Callahan, &amp; Jack Dongarra,
title	A Comparative Study of Automatic Vectorizing Compilers,
ref	Parallel Computing 17 (1991), pp. 1223-1244.
for	We compare the capabilities of several commercially 
,	available, vectorizing Fortran compilers using a test 
,	suite of Fortran loops.  We present the results of 
,	compiling and executing these loops on a variety of 
,	supercomputers, mini-supercomputers, and mainframes.

file	<a href="ut-cs-91-147.ps">ut-cs-91-147.ps</a>
by	Bruce MacLennan,
title	Characteristics of Connectionist Knowledge 
,	Representation,
ref	University of Tennessee Technical Report CS-91-147, 
,	November 1991.
for	Connectionism--the use of neural networks for 
,	knowledge representation and inference--has profound 
,	implications for the representation and processing of
,	information because it provides a fundamentally new 
,	view of knowledge.  However, its progress is impeded 
,	by the lack of a unifying theoretical construct 
,	corresponding to the idea of a calculus (or formal 
,	system) in traditional approaches to knowledge 
,	representation.  Such a construct, called a simulacrum,
,	is proposed here, and its basic properties are 
,	explored.  We find that although exact classification 
,	is impossible, several other useful, robust kinds of 
,	classification are permitted.  The representation of 
,	structured information and constituent structure are 
,	considered, and we find a basis for more flexible 
,	rule-like processing than that permitted by 
,	conventional methods.  We discuss briefly logical 
,	issues such as decidability and computability and show 
,	that they require reformulation in this new context.  
,	Throughout we discuss the implications for artificial 
,	intelligence and cognitive science of this new 
,	theoretical framework.

file	<a href="ut-cs-91-145.ps">ut-cs-91-145.ps</a>
by	Bruce MacLennan,
title	Continuous Symbol Systems:  The Logic of Connectionism,
ref	University of Tennessee Technical Report CS-91-145, 
,	September 1991. 
for	It has long been assumed that knowledge and thought 
,	are most naturally represented as discrete symbol 
,	systems (calculi).  Thus a major contribution of 
,	connectionism is that it provides an alternative model 
,	of knowledge and cognition that avoids many of the 
,	limitations of the traditional approach.  But what 
,	idea serves for connectionism the same unifying role 
,	that the idea of a calculus served for the traditional 
,	theories?  We claim it is the idea of a continuous 
,	symbol system.
,
,	This paper presents a preliminary formulation of 
,	continuous symbol systems and indicates how they may 
,	aid the understanding and development of connectionist 
,	theories.  It begins with a brief phenomenological 
,	analysis of the discrete and continuous; the aim of 
,	this analysis is to directly contrast the two kinds of 
,	symbol systems and identify their distinguishing 
,	characteristics.  Next, based on the phenomenological 
,	analysis and on other observations of existing 
,	continuous symbol systems and connectionist models, I 
,	sketch a mathematical characterization of these 
,	systems.  Finally the paper turns to some applications 
,	of the theory and to its implications for knowledge 
,	representation and the theory of computation in a 
,	connectionist context.  Specific problems addressed 
,	include decomposition of connectionist spaces, 
,	representation of recursive structures, properties of 
,	connectionist categories, and decidability in 
,	continuous formal systems.

file	<a href="nipt91-panel.ps">nipt91-panel.ps</a>
by	Bruce MacLennan,
title	The Emergence of Symbolic Processes From the 
,	Subsymbolic Substrate,
ref	text of invited panel presentation, International 
,	Symposium on New Information Processing Technologies 
,	'91, Tokyo, Japan, March 13-14, 1991.
for	A central question for the success of neural network 
,	technology is the relation of symbolic processes 
,	(e.g., language and logic) to the underlying 
,	subsymbolic processes (e.g., parallel distributed 
,	implementations of pattern recognition, analogical 
,	reasoning and learning).  This is not simply an issue 
,	of integrating neural networks with conventional 
,	expert system technology.  Human symbolic cognition is 
,	flexible because it is not purely formal, and because 
,	it retains some of the ``softness'' of the subsymbolic 
,	processes.  If we want our computers to be as flexible 
,	as people, then we need to understand the emergence
,	of the discrete and symbolic from the continuous and 
,	subsymbolic.

file	<a href="ut-cs-91-144.ps">ut-cs-91-144.ps</a>
by	Bruce MacLennan,
title	Gabor Representations of Spatiotemporal Visual Images,
ref	University of Tennessee Technical Report CS-91-144, 
,	September 1991.  
for	We review Gabor's Uncertainty Principle and the limits 
,	it places on the representation of any signal.  
,	Representations in terms of Gabor elementary functions 
,	(Gaussian-modulated sinusoids), which are optimal in 
,	terms of this uncertainty principle, are compared with 
,	Fourier and wavelet representations.  We also review 
,	Daugman's evidence for representations based on 
,	two-dimensional Gabor functions in mammalian visual 
,	cortex.  We suggest three-dimensional Gabor elementary 
,	functions as a model for motion selectivity in complex 
,	and hypercomplex cells in visual cortex.  This model 
,	also suggests a computational role for low frequency 
,	oscillations (such as the alpha rhythm) in visual 
,	cortex.

file	<a href="uist91.ps">uist91.ps</a>
by	Brad Vander Zanden, Brad A. Myers, Dario Giuse, &amp; 
,	Pedro Szekely,
title	The Importance of Pointer Variables in Constraint 
,	Models,
ref	pp. 155-164 in Proceedings of UIST '91, ``ACM SIGGRAPH 
,	Symposium on User Interface Software and Technology,'' 
,	Hilton Head, South Carolina, November 11-13, 1991.
for	Graphical tools are increasingly using constraints to 
,	specify the graphical layout and behavior of many 
,	parts of an application.  However, conventional
,	constraints directly encode the objects they reference,
,	and thus cannot provide support for the dynamic 
,	runtime creation and manipulation of application 
,	objects.  This paper discusses an extension to current 
,	constraint models that allows constraints to 
,	indirectly reference objects through pointer variables.
,	Pointer variables permit programmers to create the 
,	constraint equivalent of procedures in traditional 
,	programming languages.  This procedural abstraction 
,	allows constraints to model a wide array of dynamic 
,	application behavior, simplifies the implementation of 
,	structured object and demonstrational systems, and 
,	improves the storage and efficiency of highly 
,	interactive, graphical applications.  It also promotes 
,	a simpler, more effective style of programming than 
,	conventional constraints.  Constraints that use 
,	pointer variables are powerful enough to allow a 
,	comprehensive user interface toolkit to be built for 
,	the first time on top of a constraint system. 

file	<a href="hence.ieee">hence.ieee</a>
title	HeNCE: Graphical Development Tools for Network-Based 
,	Concurrent Computing
by	Adam Beguelin, Jack J. Dongarra, G.A. Geist, Robert Manchek, 
,	Keith Moore, V. S. Sunderam, and Reed Wade.
for	Wide area computer networks have become a basic part of 
,	today's computing infrastructure.
,	These networks connect a variety of machines, presenting an
,	enormous computing resource.  
,	In this project we focus on developing methods
,	and tools which allow a programmer to tap into this resource.  
,	In this talk we describe HeNCE,
,	a tool and methodology under development that assists a
,	programmer in developing programs to execute on a networked group of
,	heterogeneous machines.
, 
,	HeNCE is implemented on top of a system called PVM 
,	(Parallel Virtual Machine).
,	PVM is a software package that allows
,	the utilization of a heterogeneous network of parallel and serial
,	computers as a single computational resource.  PVM provides
,	facilities for spawning, communication, and 
,	synchronization of processes over a
,	network of heterogeneous machines.  While PVM provides the low
,	level tools for implementing parallel programs, 
,	HeNCE provides the programmer
,	with a higher level abstraction for specifying parallelism.

file	<a href="siampvm.ps">siampvm.ps</a>
by	A. Beguelin, J. Dongarra, A. Geist, R. Manchek &amp; V. Sunderam,
title	Solving Computational Grand Challenges Using a Network of Heterogeneous 
,	Supercomputers 
ref	Proceedings of the Fifth SIAM Conference on Parallel 
,	Processing for Scientific Computing, pp. 596-601, March 25-27, 1991.
for	This paper describes simple experiments connecting a Cray XMP, an 
,	Intel iPSC/860, and a Thinking Machines CM2 together over a high 
,	speed network to form a much larger virtual computer.  It also 
,	describes our experience with running a Computational Grand Challenge 
,	on a Cray XMP and an iPSC/860 combination.  The purpose of the 
,	experiments is to demonstrate the power and flexibility of the PVM 
,	(Parallel Virtual Machine) system to allow programmers to exploit a 
,	diverse collection of the most powerful computers available to solve Grand 
,	Challenge problems.

file	<a href="ut-cs-92-168.ps">ut-cs-92-168.ps</a>
by	Jack J. Dongarra &amp; H.A. Van der Vorst,
title	Performance of Various Computers Using Standard Sparse Linear Equations 
,	Solving Techniques 
ref	University of Tennessee Technical Report CS-92-168, 
,	February 1992.  
for	The LINPACK benchmark has become popular in the past few years 
,	as a means of measuring floating-point performance on computers.  
,	The benchmark shows in a simple and direct way what performance is to 
,	be expected for a range of machines when doing dense matrix computations.  
,	We present performance results for sparse matrix computations that 
,	use an iterative approach.

file	<a href="ut-cs-92-154.ps">ut-cs-92-154.ps</a>
by	Bruce MacLennan 
title	$L_p$-Circular Functions 
ref	University of Tennessee Technical Report CS-92-154, May 1992.
for	In this report we develop the basic properties of a set of 
,	functions analogous to the circular and hyperbolic functions, 
,	but based on $L_p$ circles.  The resulting identities may simplify 
,	analysis in $L_p$ spaces in much the way that the circular functions 
,	do in Euclidean space.  In any case, they are a pleasing example of 
,	mathematical generalization.

file	<a href="ut-cs-92-172.ps">ut-cs-92-172.ps</a>
by	Bruce J. MacLennan
title	Research Issues in Flexible Computing:  Two Presentations in Japan
ref	University of Tennessee Technical Report CS-92-172, September 1992.
for	This report contains the text of two presentations made in 
,	Japan in 1991, both of which deal with the Japanese ``Real World 
,	Computing Project'' (previously known as the ``New Information 
,	Processing Technology,'' and informally as the ``Sixth Generation Project'').

file	<a href="ut-cs-92-174.ps">ut-cs-92-174.ps</a>
by	Bruce MacLennan
title	Field Computation in the Brain 
ref	University of Tennessee Technical Report 
,	CS-92-174, October 1992.
for	We begin with a brief consideration of the topology of knowledge.  
,	It has traditionally been assumed that true knowledge must be represented 
,	by discrete symbol structures, but recent research in psychology, 
,	philosophy and computer science has shown the fundamental importance of 
,	subsymbolic information processing, in which knowledge is represented 
,	in terms of very large numbers--or even continua--of microfeatures.  
,	We believe that this sets the stage for a fundamentally new 
,	theory of knowledge, and we sketch a theory of continuous information 
,	representation and processing.  Next we consider field computation, 
,	a kind of continuous information processing that emphasizes spatially 
,	continuous fields of information.  This is a reasonable 
,	approximation for macroscopic areas of cortex and provides a convenient 
,	mathematical framework for studying information processing at this level.  
,	We apply it also to a linear-systems model of dendritic information 
,	processing.  We consider examples from the visual cortex, 
,	including Gabor and 
,	wavelet representations, and outline field-based theories of sensorimotor 
,	intentions and of model-based deduction.

file	<a href="ut-cs-92-180.ps">ut-cs-92-180.ps</a>
by	Bruce MacLennan 
title	Information Processing in the Dendritic Net
ref	University of Tennessee Technical Report CS-92-180, October 1992.
for	The goal of this paper is a model of the dendritic net that:  
,	(1) is mathematically tractable, (2) is reasonably true to the 
,	biology, and (3) illuminates information processing in the neuropil.  
,	First I discuss some general principles of mathematical modeling in a 
,	biological context that are relevant to the use of linearity and 
,	orthogonality in our models.  Next I discuss the hypothesis that 
,	the dendritic net can be viewed as a linear field computer.  Then I 
,	discuss the approximations involved in analyzing it as a dynamic, 
,	lumped-parameter, linear system.  Within this basically linear framework 
,	I then present:  (1) the self-organization of matched filters and 
,	of associative memories; (2) the dendritic computation of Gabor and other 
,	nonorthogonal representations; and (3) the possible effects of 
,	reverse current flow in neurons.

file	<a href="oopsla.ps">oopsla.ps</a>
by	Brad A. Myers, Dario A. Giuse, &amp; Brad Vander Zanden
title	Declarative Programming in a Prototype-Instance System:  Object-Oriented
,	Programming Without Writing Methods 
ref	Sigplan Notices, Vol. 27, 
,	No. 10, October 1992, pp. 184-200.
for	Most programming in the Garnet system uses a declarative style that 
,	eliminates the need to write new methods.  One implication is that 
,	the interface to objects is typically through their data values.  
,	This contrasts significantly with other object systems where writing 
,	methods is the central mechanism of programming.  Four features are 
,	combined in a unique way in Garnet to make this possible:  the use 
,	of a prototype-instance object system with structural inheritance, a 
,	retained-object model where most objects persist, the use of constraints 
,	to tie the objects together, and a new input model that makes writing 
,	event handlers unnecessary.  The result is that code is easier to 
,	write for programmers, and also easier for tools, such as interactive, 
,	direct manipulation interface builders, to generate.

file	<a href="ut-cs-92-152.ps">ut-cs-92-152.ps</a>
by	Marc D. VanHeyningen &amp; Bruce J. MacLennan,
title	A Constraint Satisfaction Model for Perception of Ambiguous Stimuli
ref	University of Tennessee Technical Report CS-92-152, April 1992.
for	Constraint satisfaction networks are natural models of the interpretation 
,	of ambiguous stimuli, such as Necker cubes.  Previous constraint 
,	satisfaction models have simulated the initial interpretation of a 
,	stimulus, but have not simulated the dynamics of perception, which 
,	includes the alternation of interpretations and the phenomena known 
,	as bias, adaptation and hysteresis.  In this paper we show that these 
,	phenomena can be modeled by a constraint satisfaction network with 
,	fatigue, that is, a network in which unit activities decay in time.  
,	Although our model is quite simple, it nevertheless exhibits some 
,	key characteristics of the dynamics of perception.

file	<a href="ut-cs-93-194.ps">ut-cs-93-194.ps</a>
by	Michael Berry, Theresa Do, Gavin O'Brien, 
,	Vijay Krishna, &amp; Sowmini Varadhan,
title	SVDPACKC (Version 1.0) User's Guide, 
ref	University of Tennessee Technical Report CS-93-194, 
,	April 1993.
for	SVDPACKC comprises four numerical (iterative) methods 
,	for computing the singular value decomposition (SVD) 
,	of large sparse matrices using ANSI C.  This software 
,	package implements Lanczos and subspace 
,	iteration-based methods for determining several of the 
,	largest singular triplets (singular values and 
,	corresponding left- and right-singular vectors) for 
,	large sparse matrices.  The package has been ported to 
,	a variety of machines ranging from supercomputers to
,	workstations:  CRAY Y-MP, IBM RS/6000-550, 
,	DEC 5000-100, HP 9000-750, SPARCstation 2, and 
,	Macintosh II/fx.  This document (i) explains each
,	algorithm in some detail, (ii) explains the 
,	input parameters for each program, (iii) 
,	explains how to compile/execute each program, and
,	(iv) illustrates the performance of each method 
,	when we compute lower rank approximations to sparse 
,	term-document matrices from information 
,	retrieval applications.  A user-friendly software 
,	interface to the package for UNIX-based systems and 
,	the Macintosh II/fx is also described.

file	<a href="ut-cs-93-195.ps">ut-cs-93-195.ps</a>
by	Brian Howard LaRose
title	The Development and Implementation of a Performance
,	Database Server 
ref	University of Tennessee Technical Report CS-93-195, 
,	August 1993.
for	The process of gathering, archiving, and distributing 
,	computer benchmark data is a cumbersome task usually 
,	performed by computer users and vendors with little 
,	coordination.  Most importantly, there is no 
,	publicly-available central depository of performance 
,	data for all ranges of machines:  supercomputers to
,	personal computers.  We present an Internet-accessible 
,	performance database server (PDS) which can be used to 
,	extract current benchmark data and literature.  As an 
,	extension to the X-Windows-based user interface 
,	(Xnetlib) to the Netlib archival system, PDS provides 
,	an on-line catalog of public-domain computer 
,	benchmarks such as the Linpack Benchmark, Perfect 
,	Benchmarks, and the Genesis benchmarks.  PDS does not 
,	reformat or present the benchmark data in any way 
,	which conflicts with the original methodology of any 
,	particular benchmark, and is thereby devoid of any 
,	subjective interpretations of machine performance.
,	We feel that all branches (academic and industrial) of 
,	the general computing community can use this facility 
,	to archive performance metrics and make them readily 
,	available to the public.  PDS can provide a more 
,	manageable approach to the development and support of 
,	a large dynamic database of published performance 
,	metrics.

file	<a href="ut-cs-93-196.ps">ut-cs-93-196.ps</a>
by	Douglas J. Sept
title	The Design, Implementation and Performance of a 
,	Queue Manager for PVM 
ref	University of Tennessee Technical Report CS-93-196, 
,	August 1993.
for	The PVM Queue Manager (QM) application addresses
,	some of the load balancing problems associated with 
,	the heterogeneous, multi-user, computing environments 
,	for which PVM was designed.  In such environments, PVM 
,	is not only confronted with the difficulties of 
,	distributing tasks among machines of variable loads, 
,	it must also contend with machines of varying 
,	performance levels in the same virtual machine.  The 
,	QM addresses both of these problems using two 
,	different load balancing techniques, one static, the 
,	other dynamic.  In its simplest (static) mode, the QM 
,	will initiate PVM processes for the user on demand, 
,	taking into account information such as the peak 
,	megaflops/sec and actual load of each machine.  In 
,	addition to the initiation of processes, the QM will
,	also accept tasks to be completed by a specified PVM 
,	process type. These tasks are shipped to the QM where 
,	they are kept in a FIFO queue.  Worker processes in 
,	the virtual machine send idle messages to the QM when 
,	they are ready for a task, and the QM ships a task to 
,	the process if there is one (of a type matching the 
,	process) in the queue.  The QM also maintains a list 
,	of idle processes and chooses the best one for 
,	the task, should one arrive when several processes 
,	are idle.  Since faster machines typically send more 
,	idle messages (and receive more tasks) than slower 
,	ones, this provides a level of dynamic load balancing 
,	for the system.  Three applications have already been 
,	implemented using the QM within PVM: a Mandelbrot 
,	image generator, a conjugate-gradient algorithm, and 
,	a map analysis program used in landscape ecology 
,	applications.  Benchmarks of elapsed wall-clock time
,	comparing standard PVM versions with the QM-based 
,	versions demonstrate substantial performance gains for 
,	both methods of load balancing.  When processing a 
,	1000 x 1000 image, for example, the QM-based 
,	Mandelbrot application averaged 63.92 seconds, 
,	compared to 139.62 seconds for the standard PVM 
,	version in a heterogeneous network of five 
,	workstations (comprising Sun4s and an 
,	IBM RS/6000).

file	<a href="ut-cs-93-197.ps">ut-cs-93-197.ps</a>
by	Karen Stoner Minser 
title	Parallel Map Analysis on the CM-5 for Landscape 
,	Ecology Models 
ref	University of Tennessee Technical Report CS-93-197, 
,	August 1993.
for	In landscape ecology, computer modeling is used to 
,	assess habitat fragmentation and its ecological 
,	implications.  Specifically, maps (2-D grids) of 
,	habitat clusters are analyzed to determine numbers, 
,	sizes, and geometry of clusters.  Previous ecological 
,	models have relied upon sequential Fortran-77 programs 
,	which have limited the size and density of maps that 
,	can be analyzed. To efficiently analyze relatively 
,	large maps, we present parallel map analysis software 
,	implemented on the CM-5.  For algorithm development, 
,	random maps of different sizes and densities were 
,	generated and analyzed.  Initially, the Fortran-77 
,	program was rewritten in C, and the sequential cluster 
,	identification algorithm was improved and implemented 
,	as a recursive or nonrecursive algorithm.  The major 
,	focus of parallelization was on cluster geometry using 
,	C with CMMD message passing routines.  Several 
,	different parallel models were implemented: host/node, 
,	hostless, and host/node with vector units (VUs).  All 
,	models obtained some speed improvements when compared 
,	against several RISC-based workstations.  The 
,	host/node model with VUs proved to be the most 
,	efficient and flexible with speed improvements for a 
,	512 x 512 map of 187, 95, and 20 over the Sun 
,	Sparc 2, HP 9000-750, and IBM RS/6000-350, 
,	respectively.  When tested on an actual map produced 
,	through remote imagery and used in ecological studies 
,	this same model obtained a speed improvement of 119 
,	over the Sun Sparc 2.

file	<a href="ut-cs-92-157.ps">ut-cs-92-157.ps</a>
title	HeNCE: A Users' Guide Version 1.2
by	Adam Beguelin, Jack Dongarra, G. A. Geist, Robert Manchek,
,	Keith Moore, Reed Wade, Jim Plank, and Vaidy Sunderam
ref	University of Tennessee Technical Report CS-92-157
for	HeNCE, Heterogeneous Network Computing Environment,  is a graphical
,	parallel programming environment.  HeNCE provides an easy to use
,	interface for creating, compiling, executing, and debugging parallel
,	programs.  HeNCE programs can be run on a single Unix workstation or
,	over a network of heterogeneous machines, possibly including
,	supercomputers.  This report describes the installation and use of the
,	HeNCE software.

file	<a href="ut-cs-93-191.ps">ut-cs-93-191.ps</a>
title	Software Distribution Using XNETLIB
by	Jack Dongarra, Tom Rowan and Reed Wade
ref	University of Tennessee Technical Report CS-93-191
for	Xnetlib is a new tool for software distribution.  Whereas its
,	predecessor netlib uses e-mail as the user interface
,	to its large collection of public-domain mathematical software,
,	Xnetlib uses an X-Window interface and socket-based communication.
,	Xnetlib makes it easy to search through a large distributed collection 
,	of software and to retrieve requested software in seconds.

file	<a href="ut-cs-93-207.ps">ut-cs-93-207.ps</a>
title	Data-parallel Implementations of Map Analysis and Animal Movement
,	for Landscape Ecology Models
by	Ethel Jane Comiskey

file	<a href="ut-cs-93-213.ps">ut-cs-93-213.ps</a>
title	Public International Benchmarks for Parallel Computers
by	assembled by Roger Hockney (chairman) and Michael Berry (secretary)
ref	PARKBENCH Committee: Report-1, November 17, 1993

file	<a href="ornl-tm-11669.ps">ornl-tm-11669.ps</a>
title	Fortran Subroutines for Computing the Eigenvalues and Eigenvectors of
,	a General Matrix by Reduction to General Tridiagonal Form,
by	J. Dongarra, A. Geist, and C. Romine  
ref	ORNL/TM-11669, 1990.
,	(Also appeared in ACM TOMS Vol. 18, No. 4, Dec 1992, pp. 392-400.)
for	This paper describes programs to reduce a nonsymmetric matrix to
,	tridiagonal form, compute the eigenvalues of the tridiagonal matrix,
,	improve the accuracy of an eigenvalue, and compute the corresponding
,	eigenvector. The intended purpose of the software is to find a few
,	eigenpairs of a dense nonsymmetric matrix faster and more accurately
,	than previous methods. The performance and accuracy of the new
,	routines are compared to two EISPACK paths: RG and
,	HQR-INVIT. The results show that the new routines are always more accurate
,	and also faster if fewer than 20% of the eigenpairs are needed.

file	<a href="ut-cs-89-90.ps">ut-cs-89-90.ps</a>
title	Advanced Architecture Computers,
by	Jack Dongarra and  Iain S. Duff,
ref	University of Tennessee, CS-89-90, November 1989.
for	We describe the characteristics of several recent computers that
,	employ vectorization or parallelism to achieve high performance
,	in floating-point calculations.
,	We consider both top-of-the-range supercomputers and computers
,	based on readily available and inexpensive basic units.
,	In each case we discuss the architectural base, novel features,
,	performance, and cost.  We intend to update this report 
,	regularly, and to this end we welcome comments.

file	<a href="ornl-tm-12404.ps">ornl-tm-12404.ps</a>
title	Software Libraries for Linear Algebra Computation on High-Performance 
,	Computers 
by	Jack J. Dongarra and David W. Walker 
ref	Oak Ridge National Laboratory, ORNL TM-12404, August, 1993.
for	This paper discusses the design of linear algebra libraries for high
,	performance computers. Particular emphasis is placed on the development
,	of scalable algorithms for MIMD distributed memory concurrent 
,	computers. A brief description of the EISPACK, LINPACK, and LAPACK 
,	libraries is given, followed by an outline of ScaLAPACK, which is a 
,	distributed memory version of LAPACK currently under development. The 
,	importance of block-partitioned algorithms
,	in reducing the frequency of data movement between different levels
,	of hierarchical memory is stressed. The use of such algorithms
,	helps reduce the message startup costs on distributed memory concurrent
,	computers. Other key ideas in our approach are the use of distributed 
,	versions of the Level 3 Basic Linear Algebra Subprograms (BLAS) as 
,	computational building blocks, and the use of Basic
,	Linear Algebra Communication Subprograms (BLACS) as communication
,	building blocks. Together the distributed BLAS and the BLACS can be 
,	used to construct higher-level algorithms, and hide many details of 
,	the parallelism from the application developer. 
,	 
,	The block-cyclic data distribution is described, and adopted as a good 
,	way of distributing block-partitioned matrices. Block-partitioned 
,	versions of the Cholesky and LU factorizations are presented, and 
,	optimization issues associated with the implementation of the LU 
,	factorization algorithm on distributed memory concurrent computers
,	are discussed, together with its performance on the Intel Delta system.
,	Finally, approaches to the design of library interfaces are reviewed.

file	<a href="ut-cs-93-205.ps">ut-cs-93-205.ps</a>
title	HeNCE: A Heterogeneous Network Computing Environment,
by	Adam Beguelin, Jack Dongarra, Al Geist, Robert Manchek, and Keith Moore
ref	University of Tennessee Technical Report CS-93-205
for	Network computing seeks to utilize the aggregate resources
,	of many networked computers to solve a single problem.
,	In so doing it is often possible to obtain supercomputer performance
,	from an inexpensive local area network.
,	The drawback is that network computing is complicated
,	and error prone when done by hand, especially if the computers
,	have different operating systems and data formats and are thus 
,	heterogeneous.
,	 
,	HeNCE (Heterogeneous Network Computing Environment)
,	is an integrated graphical environment for creating and running
,	parallel programs over a heterogeneous collection of computers.
,	It is built on a lower level package called PVM.
,	The HeNCE philosophy of parallel programming is to have the programmer
,	graphically specify the parallelism of a computation and to automate,
,	as much as possible, the tasks of writing, compiling,
,	executing, debugging, and tracing the network computation.
,	Key to HeNCE is a graphical language based on directed graphs
,	that describe the parallelism and data dependencies of an application.
,	Nodes in the graphs represent conventional Fortran or C subroutines
,	and the arcs represent data and control flow.
,	 
,	This paper describes the present state of HeNCE,
,	its capabilities, limitations, and areas of future research.

file	<a href="ut-cs-93-186.ps">ut-cs-93-186.ps</a>
title	A Proposal for a User-Level, Message-Passing Interface
,	in a Distributed Memory Environment
by	Jack J. Dongarra, Rolf Hempel, Anthony J. G. Hey, and David W. Walker
ref	University of Tennessee Technical Report CS-93-186
for	This paper describes Message Passing Interface 1 (MPI1), a
,	proposed library interface standard for supporting point-to-point
,	message passing. The intended standard will be provided with
,	Fortran 77 and C interfaces, and will form the basis of a standard high
,	level communication environment featuring collective communication and
,	data distribution transformations. The standard proposed here provides
,	blocking and nonblocking message passing between pairs of processes, 
,	with message selectivity by source process and message type. Provision
,	is made for noncontiguous messages. Context control provides a 
,	convenient means of avoiding message selectivity conflicts between
,	different phases of an application. The ability to form and manipulate
,	process groups permits task parallelism to be exploited, and is a useful
,	abstraction in controlling certain types of collective communication.

file	<a href="ut-cs-93-214.ps">ut-cs-93-214.ps</a>
by	Message Passing Interface Forum,
title	DRAFT:  Document for a Standard Message-Passing
,	Interface,
ref	University of Tennessee Technical Report CS-93-214,
,	October 1993.
for	The Message Passing Interface Forum (MPIF), with
,	participation from over 40 organizations, has been meeting
,	since January 1993 to discuss and define a set of library
,	interface standards for message passing.  MPIF is not
,	sanctioned or supported by any official standards
,	organization.
,
,	This is a draft of what will become the Final Report,
,	Version 1.0, of the Message Passing Interface Forum.  This
,	document contains all the technical features proposed for
,	the interface.  This copy of the draft was processed by
,	LATEX on October 27, 1993.
,
,	MPIF invites comments on the technical content of MPI, as
,	well as on the editorial presentation in the document.
,	Comments received before January 15, 1994 will be
,	considered in producing the final draft of Version 1.0 of
,	the Message Passing Interface Specification.
,
,	The goal of the Message Passing Interface, simply stated, is
,	to develop a widely used standard for writing
,	message-passing programs.  As such the interface should
,	establish a practical, portable, efficient, and flexible
,	standard for message passing.

file	<a href="ut-cs-93-209.ps">ut-cs-93-209.ps</a>
title	Efficient Communication Operations in Reconfigurable Parallel Computers
by	F. Desprez, A. Ferreira, and B. Tourancheau,
ref	University of Tennessee Technical Report CS-93-209
for	Reconfiguration is largely an unexplored property in the context 
,	of parallel models of computation. However, it is a powerful concept 
,	as far as massively parallel architectures are concerned, because it 
,	overcomes the constraints due to the bisection width arising in 
,	most distributed memory machines. In this paper, we show how
,	to use reconfiguration in order to improve communication operations 
,	that are widely used in parallel applications. We propose quasi-optimal 
,	algorithms for broadcasting, scattering, gossiping and multi-scattering.

file	<a href="ut-cs-93-208.ps">ut-cs-93-208.ps</a>
title	Trace2au Audio Monitoring Tools for Parallel Programs,
by	Jean-Yves Peterschmitt and Bernard Tourancheau
ref	University of Tennessee Technical Report CS-93-208, August 1993.
for	It is not easy to reach the best performance one can expect from 
,	a parallel computer.  We therefore have to use monitoring programs 
,	to study the performance of parallel programs.  We introduce here 
,	a way to generate sound in real-time on a workstation, with no 
,	additional hardware, and we apply it to such monitoring programs.

file	<a href="ut-cs-93-204.ps">ut-cs-93-204.ps</a>
title	A General Approach to the Monitoring of Distributed Memory MIMD
,	Multicomputers
by	Maurice van Riek, Bernard Tourancheau, Xavier-Francois Vigouroux,
ref	University of Tennessee Technical Report CS-93-204
for	Programs for distributed memory parallel machines are generally 
,	considered to be much more complex than sequential programs.  
,	Monitoring systems that collect runtime information about a program 
,	execution often prove a valuable help in gaining insight into the 
,	behavior of a parallel program and thus can increase its performance.  
,	This report describes in a systematic and comprehensive way the
,	issues involved in the monitoring of parallel programs for distributed 
,	memory systems.  It aims to provide a structured general approach 
,	to the field of monitoring and a guide for further documentation.  
,	First the different approaches to parallel monitoring are presented 
,	and the problems encountered are discussed and classified.  
,	In the second part, the main existing systems are described to provide 
,	the user with a feeling for the possibilities and limitations of 
,	real tools.

file	<a href="ut-cs-93-210.ps">ut-cs-93-210.ps</a>
by	Frederic Desprez, Pierre Fraigniaud, and Bernard Tourancheau
title	Successive Broadcasts on Hypercube,
ref	University of Tennessee Technical Report CS-93-210,
,	August 1993.
for	Broadcasting is an information dissemination problem in
,	which information originating at one node of a communication
,	network must be transmitted to all the other nodes as
,	quickly as possible.  In this paper, we consider the
,	problem in which all the nodes of a network must, by turns,
,	broadcast a distinct message.  We call this problem the
,	successive broadcasts problem.  Successive broadcasts is a
,	communication pattern that appears in several parallel
,	implementations of linear algebra algorithms on distributed
,	memory multicomputers.  Note that the successive broadcasts
,	problem is different from the gossip problem in which all
,	the nodes must perform a broadcast in any order, even
,	simultaneously.  We present an algorithm solving the
,	successive broadcasts problem on hypercubes.  We derive a
,	lower bound on the time of any successive broadcasts
,	algorithm, which shows that our algorithm is within a factor
,	of 2 of optimal.

file	<a href="ut-cs-94-222.ps">ut-cs-94-222.ps</a>
title	Netlib Services and Resources, (Rev. 1)
by	S. Browne, J. Dongarra, S. Green, E. Grosse, K. Moore, T. Rowan, 
,	and R. Wade
ref	University of Tennessee Technical Report CS-94-222,
,	December, 1994.
for	The Netlib repository, maintained by the University of Tennessee and
,	Oak Ridge National Laboratory, contains freely available software,
,	documents, and databases of interest to the numerical, scientific
,	computing, and other communities.  This report includes both the
,	Netlib User's Guide and the Netlib System Manager's Guide, and
,	contains information about Netlib's databases, interfaces, and system
,	implementation. The Netlib repository's databases include
,	the Performance Database, the Conferences Database, and
,	the NA-NET mail forwarding and Whitepages Databases.  A variety of
,	user interfaces enable users to access the Netlib repository in the
,	manner most convenient and compatible with their networking
,	capabilities.  These interfaces include the Netlib email interface,
,	the Xnetlib X Windows client, the netlibget command-line TCP/IP
,	client, anonymous FTP, anonymous RCP, and gopher.

file	<a href="ut-cs-94-226.ps">ut-cs-94-226.ps</a>
by	Makan Pourzandi and Bernard Tourancheau
title	A Parallel Performance Study of Jacobi-like
,	Eigenvalue Solution
ref	University of Tennessee Technical Report CS-94-226,
,	March 1994.
for	In this report we focus on Jacobi-like resolution of
,	the eigen-problem for a real symmetric matrix from a
,	parallel performance point of view:  we try to optimize the
,	algorithm working on the communication intensive part of the
,	code.  We discuss several parallel implementations and
,	propose an implementation which overlaps communication
,	with computation to reach better efficiency.  We show
,	that the overlapping implementation can lead to significant
,	improvements.  We conclude by presenting our future work.

file	<a href="ut-cs-94-229.ps">ut-cs-94-229.ps</a>
by	James C. Browne, Jack Dongarra, Syed I. Hyder, Keith
,	Moore, and Peter Newton,
title	Visual Programming and Parallel Computing
ref	University of Tennessee Technical Report CS-94-229,
,	April 1994.
for	Visual programming arguably provides greater benefit
,	in explicit parallel programming,  particularly coarse grain
,	MIMD programming, than in sequential programming.
,	Explicitly parallel programs are multi-dimensional objects;
,	the natural representations of a parallel program are
,	annotated directed graphs:  data flow graphs, control flow
,	graphs, etc. where the nodes of the graphs are sequential
,	computations.  The execution of parallel programs is a
,	directed graph of instances of sequential computations.  A
,	visually based (directed graph) representation of parallel
,	programs is thus more natural than a pure text string
,	language where multi-dimensional structures must be
,	implicitly defined.  The naturalness of the annotated
,	directed graph representation of parallel programs enables
,	methods for programming and debugging which are
,	qualitatively different and arguably superior to the
,	conventional practice based on pure text string languages.
,	Annotation of the graphs is a critical element of a
,	practical visual programming system; text is still the best
,	way to represent many aspects of programs.
,
,	This paper presents a model of parallel programming and a
,	model of execution for parallel programs which are the
,	conceptual framework for a complete visual programming
,	environment including capture of parallel structure,
,	compilation and behavior analysis (performance and
,	debugging).  Two visually-oriented parallel programming
,	systems, CODE 2.0 and HeNCE, each based on a variant of the
,	model of programming, will be used to illustrate the
,	concepts.  The benefits of visually-oriented realizations of
,	these models for program structure capture, software
,	component reuse, performance analysis and debugging will be
,	explored and hopefully demonstrated by examples in these
,	representations.  It is only by actually implementing and
,	using visual parallel programming languages that we have
,	been able to fully evaluate their merits.

file	<a href="ut-cs-94-230.ps">ut-cs-94-230.ps</a>
by	Message Passing Interface Forum,
title	MPI: A Message-Passing Interface Standard,
ref	University of Tennessee Technical Report CS-94-230,
,	April 1994.
for	The Message Passing Interface Forum (MPIF), with
,	participation from over 40 organizations, has been meeting
,	since November 1992 to discuss and define a set of library
,	standards for message passing.  MPIF is not sanctioned or
,	supported by any official standards organization.
,	
,	The goal of the Message Passing Interface, simply
,	stated, is to develop a widely used standard for writing
,	message-passing programs.  As such the interface should
,	establish a practical, portable, efficient and flexible
,	standard for message passing. 
,	
,	This is the final report, Version 1.0, of the
,	Message Passing Interface Forum.  This document contains all
,	the technical features proposed for the interface.  This
,	copy of the draft was processed by LATEX on April 21, 1994.
,	
,	Please send comments on MPI to mpi-comments@cs.utk.edu.
,	Your comment will be forwarded to MPIF committee members who
,	will attempt to respond.

file	<a href="ut-cs-94-232.ps">ut-cs-94-232.ps</a>
by	Robert J. Manchek,
title	Design and Implementation of PVM Version 3,
ref	University of Tennessee Technical Report CS-94-232,
,	May 1994.
for	There is a growing trend toward distributed
,	computing - writing programs that run across multiple
,	networked computers - to speed up computation, solve larger
,	problems or withstand machine failures.  A programming model
,	commonly used to write distributed applications is
,	message-passing, in which a program is decomposed into
,	distinct subprograms that communicate and synchronize with
,	one another by explicitly sending and receiving blocks of
,	data.
,	
,	PVM (Parallel Virtual Machine) is a generic message-passing
,	system composed of a programming library and manager
,	processes.  It ties together separate physical machines
,	(possibly of different types), providing communication and
,	control between the subprograms and detection of machine
,	failures.  The resulting virtual machine appears as a
,	single, manageable resource.  PVM is portable to a wide
,	variety of machine architectures and operating systems,
,	including workstations, supercomputers, PCs and
,	multiprocessors.
,	
,	This paper describes the design, implementation and
,	testing of version 3.3 of PVM and surveys related works.

file	<a href="ut-cs-94-261.ps">ut-cs-94-261.ps</a>
by	Peter Newton &amp; Jack Dongarra,
title	Overview of VPE: A Visual Environment for Message-Passing 
,	Parallel Programming,
ref	University of Tennessee Technical Report CS-94-261, 
,	November 1994.
for	This document introduces the VPE parallel programming
,	environment as it was first conceived.
,
,	VPE is a visual parallel programming environment for
,	message-passing parallel computing and is intended to
,	provide a simple human interface to the process of
,	creating message-passing programs. Programmers describe
,	the process structure of a program by drawing a graph in
,	which nodes represent processes and messages flow on arcs
,	between nodes. They then annotate these computation nodes
,	with program text expressed in C or Fortran which
,	contains simple message-passing calls. The VPE
,	environment can then automatically compile, execute, and
,	animate the program. VPE is designed to be implemented on
,	top of standard message-passing libraries such as PVM and
,	MPI.

file	<a href="vp.ps">vp.ps</a>
by	Bruce J. MacLennan,
title	Visualizing the Possibilities,
ref	Commentary, Behavioral and Brain Sciences (1993) 16:2.
for	I am in general agreement with Johnson-Laird &amp; Byrne's (J-L &amp; B's)
,	approach and find their experiments convincing:  therefore my commentary
,	will be limited to several suggestions for extending and refining their
,	theory.

file	<a href="ipdn.ps">ipdn.ps</a>
by	Bruce MacLennan,
title	Information Processing in the Dendritic Net 
ref	Ch. 6 of 
,	Rethinking Neural Networks:  Quantum Fields &amp; Biological Data, 
,	Karl H. Pribram, ed., Lawrence Erlbaum Associates, Publishers, 1993,
,	pp. 161-197.
for	The goal of this paper is a model of the dendritic net that:  (1) is 
,	mathematically tractable, (2) is reasonably true to the biology, and 
,	(3) illuminates information processing in the neuropil.  First I 
,	discuss some general principles of mathematical modeling in a 
,	biological context that are relevant to the use of linearity 
,	and orthogonality in our models.  Next I discuss the hypothesis that 
,	the dendritic net can be viewed as a linear field computer.  Then 
,	I discuss the approximations involved in analyzing it as a dynamic, 
,	lumped-parameter, linear system.  Within this basically linear 
,	framework I then present:  (1) the self-organization of matched 
,	filters and of associative memories; (2) the dendritic computation 
,	of Gabor and other nonorthogonal representations; and (3) the 
,	possible effects of reverse current flow in neurons.


file	<a href="fcb.ps">fcb.ps</a>
by	Bruce MacLennan
title	Field Computation in the Brain, 
ref	Ch. 7 of Rethinking Neural 
,	Networks:  Quantum Fields &amp; Biological Data, Karl H. Pribram, 
,	ed., Lawrence Erlbaum Associates, Publishers, 1993, pp. 199-232.
for	We begin with a brief consideration of the topology of knowledge.  
,	It has traditionally been assumed that true knowledge must be 
,	represented by discrete symbol structures, but recent research in 
,	psychology, philosophy and computer science has shown the 
,	fundamental importance of subsymbolic information processing, in 
,	which knowledge is represented in terms of very large numbers---or 
,	even continua---of microfeatures.  We believe that this sets the 
,	stage for a fundamentally new theory of knowledge, and we sketch 
,	a theory of continuous information representation and processing.  
,	Next we consider field computation, a kind of continuous information 
,	processing that emphasizes spatially continuous fields of information.  
,	This is a reasonable approximation for macroscopic areas of cortex 
,	and provides a convenient mathematical framework for studying 
,	information processing at this level.  We apply it also to a 
,	linear-systems model of dendritic information processing.  We consider 
,	examples from the visual cortex, including Gabor and wavelet 
,	representations, and outline field-based theories of sensorimotor 
,	intentions and of model-based deduction.

file	<a href="cckr.ps">cckr.ps</a>
by	Bruce J. MacLennan
title	Characteristics of Connectionist Knowledge Representation, 
ref	Information Sciences 70, pp. 119-143, 1993.
for	Connectionism---the use of neural networks for knowledge 
,	representation and inference---has profound implications for the 
,	representation and processing of information because it provides 
,	a fundamentally new view of knowledge.  However, its progress is 
,	impeded by the lack of a unifying theoretical construct corresponding 
,	to the idea of a calculus (or formal system) in traditional 
,	approaches to knowledge representation.  Such a construct, called a 
,	simulacrum, is proposed here, and its basic properties are explored.  
,	We find that although exact classification is impossible, several 
,	other useful, robust kinds of classification are permitted.  The 
,	representation of structured information and constituent structure are 
,	considered, and we find a basis for more flexible rule-like processing 
,	than that permitted by conventional methods.  We discuss briefly 
,	logical issues such as decidability and computability and show that 
,	they require reformulation in this new context.  Throughout we 
,	discuss the implications of this new theoretical framework for 
,	artificial intelligence and cognitive science.

file	<a href="kohl-contact">kohl-contact</a>
for	Information on where to contact author, James Arthur Kohl.


file	<a href="kohl-93-mascots.ps">kohl-93-mascots.ps</a>
by	T. L. Casavant, J. A. Kohl,
title	"The IMPROV Meta-Tool Design Methodology for Visualization
,	of Parallel Programs,"
ref	Invited Paper, International Workshop on Modeling, Analysis,
,	and Simulation of Computer and Telecommunication Systems (MASCOTS),
,	January 1993.
for	A design methodology is presented that simplifies the creation
,	of program visualization tools while maintaining a high degree
,	of flexibility and expressive power.  The approach is based on a
,	"circulation architecture" model that organizes the details of the
,	user specification, and provides a formal means for indicating
,	relationships.  The overall user specification is divided into
,	independent modules containing distinct, well-defined entities,
,	and the relationships among these module entities are identified
,	using a powerful "mapping language".  This language maps conditions
,	on entities to manipulations that modify entities, resulting in
,	dynamic animations of program behavior.  The mapping language
,	supports arbitrary levels of abstraction providing a full range
,	of detail, and allowing efficient view development.  To demonstrate
,	the feasibility and usefulness of this approach, a specific program
,	visualization meta-tool design, IMPROV, is described.

file	<a href="kohl-92-compsac.tgz">kohl-92-compsac.tgz</a>
by	J. A. Kohl, T. L. Casavant,
title	"A Software Engineering, Visualization Methodology
,	for Parallel Processing Systems,"
ref	Proceedings of the Sixteenth Annual International
,	Computer Software &amp; Applications Conference (COMPSAC),
,	Chicago, Illinois, September 1992, pp. 51-56.
for	This paper focuses on techniques for enhancing the feasibility
,	of using graphic visualization in analyzing the complexities of
,	parallel software.  The central drawback to applying such visual
,	techniques is the overhead in developing analysis tools with flexible,
,	customized views.  The "PARADISE" (PARallel Animated DebuggIng and
,	Simulation Environment) system, which has been in operation since 1989,
,	alleviates some of this design overhead by providing an abstract,
,	object-oriented, visual modeling environment which expedites custom
,	visual tool development.  PARADISE is a visual tool which is used to
,	develop other visual tools, or a "meta-tool".  This paper complements
,	previous work on PARADISE by describing the philosophy behind its
,	design, and how that philosophy leads to a methodology for constructing
,	visual models which characterize parallel systems in general.  Emphasis
,	will be on the crucial issues in utilizing visualization for parallel
,	software development, and how PARADISE deals with these issues.

file	<a href="kohl-92-prop.ps">kohl-92-prop.ps</a>
by	J. A. Kohl,
title	"The Construction of Meta-Tools for Program Visualization
,	of Parallel Software,"
ref	Ph.D. Thesis Proposal,
,	Written Paper Accompanying Oral Comprehensive Examination,
,	Technical Report Number TR-ECE-920204, Department of ECE,
,	University of Iowa, Iowa City, IA, 52242, February 1992.
for	This proposal provides a design methodology for program visualization
,	meta-tools for parallel software that simplifies the use of such tools
,	while maintaining a high degree of flexibility and expressive power.
,	The approach is based on a "meta-tool circulation architecture"
,	model that organizes the details of the user specification, and
,	provides a circulation of information which supports a formal means
,	for indicating relationships among that information.  The overall user
,	specification is divided into independent modules containing distinct
,	entities, and the relationships among these module entities are
,	identified using a powerful "relationship mapping language".  This
,	language maps conditions on selected entities to manipulations that
,	modify the entities, allowing the state of an entity to be controlled
,	in terms of the state of any other entity or itself.  The mapping
,	language supports arbitrary levels of abstraction in manipulating
,	entities, allowing a full range of possible detail.  As a result,
,	visual analyses can be specified efficiently, utilizing only the
,	minimum level of detail necessary.  To demonstrate the feasibility
,	and usefulness of this approach, a specific program visualization
,	meta-tool design is proposed based on the methodology.

file	<a href="kohl-92-ewpc-conf.tgz">kohl-92-ewpc-conf.tgz</a>
by	T. L. Casavant, J. A. Kohl, Y. E. Papelis,
title	"Practical Use of Visualization for Parallel Systems,"
ref	Invited Keynote Address Text for
,	1992 European Workshop on Parallel Computers (EWPC),
,	Barcelona, Spain, March 23-24, 1992.
for	This paper overviews the major contributions to the field of
,	visualization as applied to parallel computing to date.
,	Advances have come mostly from academics, but the influence on
,	industrial and commercial settings for the future will be dramatic.
,	The paper emphasizes how to improve the software development
,	process for high-performance parallel computers through the use of
,	visualization techniques both for program creation, as well as for
,	debugging, verification, performance tuning, and maintenance.
,	A concrete discussion of actual tool behavior is also presented.

file	<a href="kohl-92-ewpc-full.tgz">kohl-92-ewpc-full.tgz</a>
by	T. L. Casavant, J. A. Kohl, Y. E. Papelis,
title	"Practical Use of Visualization for Parallel Systems,"
ref	Technical Report Number TR-ECE-920102, Department of ECE,
,	University of Iowa, Iowa City, IA, 52242,
,	January 1992 (full version of EWPC 92 paper).
for	This paper overviews the major contributions to the field of
,	visualization as applied to parallel computing to date.
,	Advances have come mostly from academics, but the influence on
,	industrial and commercial settings for the future will be dramatic.
,	The paper emphasizes how to improve the software development
,	process for high-performance parallel computers through the use of
,	visualization techniques both for program creation, as well as for
,	debugging, verification, performance tuning, and maintenance.
,	A concrete discussion of actual tool behavior is also presented.

file	<a href="kohl-91-ipps.tgz">kohl-91-ipps.tgz</a>
by	J. A. Kohl, T. L. Casavant,
title	"Use of PARADISE: A Meta-Tool for Visualizing Parallel Systems,"
ref	Proceedings of the Fifth International Parallel Processing
,	Symposium (IPPS),
,	Anaheim, California, May 1991, pp. 561-567.
for	This paper addresses the problem of creating software tools for
,	visualizing the dynamic behavior of parallel applications and systems.
,	"PARADISE" (PARallel Animated DebuggIng and Simulation Environment)
,	approaches this problem by providing a "meta-tool" environment
,	for generating custom visual analysis tools.  PARADISE is a meta-tool
,	because it is a tool which is utilized to create other tools.  This
,	paper focuses on the user's view of the use of PARADISE for
,	constructing tools which analyze the interaction between parallel
,	systems and parallel applications.  An example of its use, involving
,	the PASM Parallel Processing System, is given.

file	<a href="kohl-91-santafe.ps">kohl-91-santafe.ps</a>
by	J. A. Kohl, T. L. Casavant,
title	"Methodologies for Rapid Prototyping of Tools for
,	Visualizing the Performance of Parallel Systems,"
ref	Presentation at Workshop on Parallel Computer Systems: Software Tools,
,	Santa Fe, New Mexico, October 1991.
for	This presentation focuses on the issues encountered in developing
,	visualization tools for performance tuning of parallel software.  This
,	task will be analyzed from the perspective of the user and the
,	"meta-tool" designer.  The talk will emphasize these two perspectives on
,	performance tuning,  as well as another approach which utilizes a
,	limited tool kit.  Then, the current state of the PARADISE tool, a
,	meta-tool for analyzing parallel software, will be examined, along
,	with other visual tools, to determine the extent to which each tool
,	satisfies the goals and guidelines of the previous discussion.
,	Finally, directions for future work will be explored.
,	( Note:  Presentation slides only. )

file	<a href="kohl-91-comp.ps">kohl-91-comp.ps</a>
by	J. A. Kohl,
title	"Visual Techniques for Parallel Processing,"
ref	Written Comprehensive Examination,
,	University of Iowa, Department of Electrical and Computer Engineering,
,	ECETR-910726, July 1991.
for	This Comprehensive Examination consists of an accumulation and
,	analysis of research on the use of visualization in computing systems
,	over the past decade, as well as recent efforts specifically in the
,	area of software development for parallel processing.  The goal of
,	the examination is to determine the relationships among the references
,	located, and their cumulative effect in directing the course of future
,	research in the field of visualization.  The examination includes a
,	creative portion in which the various uses and approaches for
,	visualization are to be classified via a taxonomical system.  This
,	classification will identify the central issues which differentiate
,	the visualization environments for developing parallel software.
,	In addition, a quantitative assessment of these environments will
,	be constructed which presents a more concrete evaluation
,	and categorization technique.

file	<a href="kohl-91-901011.tgz">kohl-91-901011.tgz</a>
by	J. A. Kohl, T. L. Casavant,
title	"PARADISE: A Meta-Tool for Program Visualization in
,	Parallel Computing Systems,"
ref	Technical Report Number TR-ECE-901011, Department of ECE,
,	University of Iowa, Iowa City, IA, 52242,
,	Revised December 1991.
for	This paper addresses the problem of creation of software tools for
,	visualizing the dynamic behavior of parallel applications and systems.
,	"PARADISE" (PARallel Animated DebuggIng and Simulation Environment)
,	approaches this problem by providing a "meta-tool" environment for
,	generating custom visual analysis tools.  PARADISE is a meta-tool
,	because it is a tool which is utilized to create other tools.
,	The fundamental concept is the use of abstract visual models to
,	simulate complex, concurrent behavior.  This paper focuses on the
,	goals of PARADISE, and reflects on the extent to which the prototype
,	system, which has been in operation since 1989, meets these goals.
,	The prototype system is described, along with a methodology for
,	using visual modeling to analyze parallel software and systems.
,	Examples of its use are also given.

file	<a href="ut-cs-94-263.ps">ut-cs-94-263.ps</a>
by	Shirley Browne, Jack Dongarra, Stan Green, Keith Moore, 
,	Tom Rowan, Reed Wade, Geoffrey Fox and Ken Hawick
title	Prototype of the National High-Performance Software Exchange
ref	University of Tennessee Technical Report CS-94-263,
,	December, 1994
for	This report describes a short-term effort to construct a prototype
,	for the National High-Performance Software Exchange (NHSE).
,	The prototype demonstrates how
,	the evolving National Information Infrastructure (NII) can be used
,	to facilitate sharing of software and information among members of the
,	High Performance Computing and Communications (HPCC) community.
,	Shortcomings of current information searching and retrieval tools
,	are pointed out, and recommendations are given for areas in need
,	of further development.
,	The hypertext home page for the NHSE is accessible at
,	http://www.netlib.org/nse/home.html.

file	<a href="ut-cs-95-272.ps">ut-cs-95-272.ps</a>
by	Shirley Browne, Jack Dongarra, Stan Green, Keith
,	Moore, Tom Rowan, Reed Wade, Geoffrey Fox, Ken Hawick, Ken
,	Kennedy, Jim Pool, and Rick Stevens,
title	National HPCC Software Exchange,
ref	University of Tennessee Technical Report CS-95-272,
,	January 1995.
for	This report describes an effort to construct a
,	National HPCC Software Exchange (NHSE).  This system shows
,	how the evolving National Information Infrastructure (NII)
,	can be used to facilitate sharing of software and
,	information among members of the High Performance Computing
,	and Communications (HPCC) community.  To access the system
,	use the URL:  http://www.netlib.org/nse/.

file    <a href="ut-cs-95-274.ps">ut-cs-95-274.ps</a>
by      Jack J. Dongarra, Steve W. Otto, Marc Snir, and
,       David Walker,
title   An Introduction to the MPI Standard,
ref     University of Tennessee Technical Report CS-95-274,
,       January 1995.
for     The Message Passing Interface (MPI) is a portable
,       message-passing standard that facilitates the development of
,       parallel applications and libraries.  The standard defines
,       the syntax and semantics of a core of library routines
,       useful to a wide range of users writing portable
,       message-passing programs in Fortran 77 or C.  MPI also forms
,       a possible target for compilers of languages such as High
,       Performance Fortran.  Commercial and free, public-domain
,       implementations of MPI already exist.  These run on both
,       tightly-coupled, massively-parallel machines (MPPs), and on
,       networks of workstations (NOWs).
,
,       The MPI standard was developed over a year of intensive
,       meetings and involved over 80 people from approximately 40
,       organizations, mainly from the United States and Europe.
,       Many vendors of concurrent computers were involved, along
,       with researchers from universities, government laboratories,
,       and industry.  This effort culminated in the publication of
,       the MPI specification.  Other sources of information on MPI
,       are available or are under development.
,
,       Researchers incorporated into MPI the most useful features
,       of several systems, rather than choosing one system to adopt
,       as the standard.  MPI has roots in PVM, Express, P4,
,       Zipcode, and Parmacs, and in systems sold by IBM, Intel,
,       Meiko, Cray Research, and Ncube.


file	<a href="ut-cs-95-276.ps">ut-cs-95-276.ps</a>
by	Henri Casanova, Jack Dongarra, Phil Mucci
title	A Test Suite for PVM,
ref	University of Tennessee Technical Report CS-95-276,
,	June 1995.
for	Although PVM is well established in the field of distributed
,	computing, the need has been shown for a standard set of
,	tests to give its users further confidence in the
,	correctness of their installation.  This report introduces
,	pvm_test and its X interface pvm_test_gui.  pvm_test was
,	designed to exercise some of PVM's more important functions
,	and to provide some primitive measures of its performance.

file	<a href="ut-cs-95-277.ps">ut-cs-95-277.ps</a>
by	Philip J. Mucci and Jack Dongarra,
title	Possibilities for Active Messaging in PVM,
ref	University of Tennessee Technical Report CS-95-277,
,	February 1995.
for	Active messaging is a communications model designed
,	around the interaction of a network interface and its
,	driving software in an operating system.  By utilizing this
,	model, the user can design applications that make better
,	use of the available computing and communication resources.
,	Currently, successful implementations exist only for a
,	certain subset of workstations and network adapters.  This
,	paper is an exploration into a portable implementation of
,	active messaging for possible inclusion in the PVM suite, a
,	generalized framework for distributed computing.

file	<a href="ut-cs-95-278.ps">ut-cs-95-278.ps</a>
title   Location-Independent Naming for Virtual Distributed Software
,       Repositories
by      Shirley Browne, Jack Dongarra, Stan Green,
,       Keith Moore, Theresa Pepin, Tom Rowan, Reed Wade, Eric Grosse
ref	University of Tennessee Technical Report CS-95-278,
,	February 1995.
for     A location-independent naming system for network resources
,       has been designed to facilitate organization and description
,       of software components accessible through a virtual distributed
,       repository.
,       This naming system enables easy and efficient searching and retrieval,
,       and it addresses many of the
,       consistency, authenticity, and integrity issues involved with
,       distributed software repositories by providing mechanisms for
,       grouping resources and for authenticity and integrity checking.
,       This paper details the design of the naming system, describes
,       a prototype implementation of some of the capabilities, and
,       describes how the system fits into the development of the National
,       HPCC Software Exchange, a virtual software repository that has the goal
,       of providing access to reusable software components for
,       high-performance computing.

file	<a href="ut-cs-95-279.ps">ut-cs-95-279.ps</a>
title   Digital Software and Data Repositories for Support of
,       Scientific Computing
by      Ronald Boisvert, Shirley Browne, Jack Dongarra, and Eric Grosse
ref	University of Tennessee Technical Report CS-95-279,
,	February 1995.
for     This paper discusses the special characteristics and needs of
,       software repositories and describes how these needs have been
,       met by some existing repositories.  These repositories include
,       Netlib, the National HPCC Software Exchange,
,       and the GAMS Virtual Repository.
,       We also describe some systems that provide on-line access
,       to various types of scientific data.
,       Finally, we outline a proposal for integrating software and data
,       repositories into the world of digital document libraries, in
,       particular CNRI's ARPA-sponsored Digital Library project.

file    <a href="ut-cs-95-288.html">ut-cs-95-288.html</a>
title   Distributed Information Management in the National HPCC
,       Software Exchange
by      Shirley Browne, Jack Dongarra, Geoffrey C. Fox, Ken Hawick,
,       Ken Kennedy, Rick Stevens, Robert Olson, Tom Rowan
ref	University of Tennessee Technical Report CS-95-288,
,	April 1995.
for     The National HPCC Software Exchange
,       is a collaborative effort by member institutions of the
,       Center for Research on Parallel Computation
,       to provide network access to HPCC-related software, documents,
,       and data.
,       Challenges for the NHSE include identifying, organizing, filtering,
,       and indexing the rapidly growing wealth of relevant information
,       available on the Web.
,       The large quantity of information necessitates performing these
,       tasks using automatic techniques, many of which make use of parallel
,       and distributed computation, but human intervention is needed for
,       intelligent abstracting,
,       analysis, and critical review tasks.  Thus, major goals of
,       NHSE research are to find the right mix of
,       manual and automated techniques, and to leverage the results of
,       manual efforts to the maximum extent possible.  This paper describes
,       our current information gathering and
,       processing techniques, as well as our future plans for integrating
,       the manual and automated approaches.
,       The NHSE home page is accessible at http://www.netlib.org/nse/.

file    <a href="ut-cs-95-287.ps">ut-cs-95-287.ps</a>
title   Management of the NHSE - A Virtual Distributed Digital Library
by      Shirley Browne, Jack Dongarra, Ken Kennedy, Tom Rowan
ref	University of Tennessee Technical Report CS-95-287,
,	April 1995.
for     The National HPCC Software Exchange (NHSE) is a distributed collection
,       of software, documents, and data of interest to the high performance
,       computing community.  Our experiences with the design and initial
,       implementation of the NHSE are relevant to a number of general digital
,       library issues, including the publication process, quality control,
,       authentication and integrity, and information retrieval.
,       This paper describes an authenticated submission process that is
,       coupled with a multilevel review process.
,       Browsing and searching tools for aiding with
,       information retrieval are also described.

file    <a href="ut-cs-95-294.ps">ut-cs-95-294.ps</a>
by      J. J. Dongarra, B. Straughan, and D. W. Walker,
title   Chebyshev tau - QZ Algorithm Methods for Calculating
,       Spectra of Hydrodynamic Stability Problems,
ref     University of Tennessee Technical Report CS-95-294,
,       June 1995.
for     The Chebyshev tau method is examined in detail for a
,       variety of eigenvalue problems arising in hydrodynamic
,       stability studies, particularly those of Orr-Sommerfeld
,       type.  We concentrate on determining the whole of the top
,       end of the spectrum in parameter ranges beyond those often
,       explored.  The method employing a Chebyshev representation
,       of the fourth derivative operator, D^4, is compared with
,       those involving the second and first derivative operators,
,       D^2, D, respectively; the latter two representations require
,       use of the QZ algorithm in the resolution of the singular
,       generalised matrix eigenvalue problem which arises.  The D^2
,       method is shown to be different from the stream function -
,       vorticity scheme in certain (important and practical) cases.
,       Physical problems explored are those of Poiseuille, Couette,
,       and pressure gradient driven circular pipe flow.  Also
,       investigated are the three-dimensional problem of Poiseuille
,       flow arising from a normal velocity - normal vorticity
,       interaction, and finally Couette and Poiseuille problems for
,       two viscous, immiscible fluids, one overlying the other.

file    <a href="ut-cs-95-297.ps">ut-cs-95-297.ps</a>
by      Jack J. Dongarra, Hans W. Meuer, and Erich
,       Strohmaier,
title   TOP500 Supercomputer Sites,
ref     University of Tennessee Technical Report CS-95-297,
,       July 1995.
for     To provide a better basis for statistics on
,       high-performance computers, we list the sites that have the
,       500 most powerful computer systems installed.  The best
,       LINPACK benchmark performance achieved is used as a
,       performance measure in ranking the computers.

file    <a href="ut-cs-95-299.ps">ut-cs-95-299.ps</a>
by      Jack J. Dongarra and Tom Dunigan,
title   Message-Passing Performance of Various Computers,
ref     University of Tennessee Technical Report CS-95-299,
,       July 1995.
for     This report compares the performance of different
,       computer systems for basic message passing.  Latency and
,       bandwidth are measured on Convex, Cray, IBM, Intel, KSR,
,       Meiko, nCUBE, NEC, SGI, and TMC multiprocessors.
,       Communication performance is contrasted with the
,       computational power of each system.  The comparison includes
,       both shared and distributed memory computers as well as
,       networked workstation clusters.

file    <a href="ut-cs-95-301.ps">ut-cs-95-301.ps</a>
by      Henri Casanova, Jack Dongarra, and Weicheng Jiang,
title   The Performance of PVM on MPP Systems,
ref     University of Tennessee Technical Report CS-95-301,
,       August 1995.
for     PVM (Parallel Virtual Machine) is a popular standard
,       for writing parallel programs so that they may execute over
,       a network of heterogeneous machines.  This paper presents
,       some performance results of PVM on three massively parallel
,       processing systems:  the Thinking Machines CM-5, the Intel
,       Paragon, and the IBM SP-2.  We describe the basics of the
,       communication model of PVM and its communication routines.
,       We then compare its performance with native message-passing
,       systems on the MPPs.

file    <a href="ut-cs-95-310.ps">ut-cs-95-310.ps</a>
by      Jack Dongarra, Loic Prylli, Cyril Randriamaro, and
,       Bernard Tourancheau,
title   Array Redistribution in ScaLAPACK using PVM,
ref     University of Tennessee Technical Report CS-95-310,
,       October 1995.
for     Linear algebra on distributed-memory parallel
,       computers raises the problem of data distribution of
,       matrices and vectors among the processes.  Block-cyclic
,       distribution works well for most algorithms.  The block size
,       must be chosen carefully, however, in order to achieve good
,       efficiency and good load balancing.  This choice depends
,       heavily on each operation; hence, it is essential to be able
,       to go from one distribution to another very quickly.  We
,       present here the algorithms implemented in the ScaLAPACK
,       library, and we discuss timing results on a network of
,       workstations and on a Cray T3D using PVM.

file    <a href="ut-cs-95-312.ps">ut-cs-95-312.ps</a>
by      Shirley Browne and Tom Rowan,
title   Assessment of the NHSE Software Submission and
,       Review Process,
ref     University of Tennessee Technical Report CS-95-312,
,       November 1995.
for     An NHSE Software submission trial run was conducted
,       to facilitate evaluation of the submission and review
,       process.  This document describes the experiment and
,       assesses the current state of the NHSE software submission
,       and review process.

file  <a href="ut-cs-95-313.ps">ut-cs-95-313.ps</a>
by Henri Casanova and Jack Dongarra,
title NetSolve:  A Network Server for Solving
,  Computational Science Problems,
ref   University of Tennessee Technical Report CS-95-313,
,  November 1995.
for   This paper presents a new system, called NetSolve,
,  that allows users to access computational resources, such as
,  hardware and software, distributed across the network.  This
,  project has been motivated by the need for an easy-to-use,
,  efficient mechanism for using computational resources
,  remotely.  Ease of use is obtained as a result of different
,  interfaces, some of which do not require any programming
,  effort from the user.  Good performance is ensured by a
,  load-balancing policy that enables NetSolve to use the
,  computational resource available as efficiently as possible.
,  NetSolve is designed to run on any heterogeneous network and
,  is implemented as a fault-tolerant client-server
,  application.

file	<a href="ut-cs-96-318.ps">ut-cs-96-318.ps</a>
by	Jack J. Dongarra and Horst D. Simon,
title	High Performance Computing in the U.S. in 1995 - An
,	Analysis on the Basis of the TOP500 List,
ref	University of Tennessee Technical Report CS-96-318,
,	January 1996.
for	In 1993, for the first time, a list of the top 500
,	supercomputer sites worldwide was made available.  The
,	TOP500 list allows a much more detailed and well founded
,	analysis of the state of high performance computing.
,	Previously data such as the number and geographical
,	distribution of supercomputer installations were difficult
,	to obtain, and only a few analysts undertook the effort to
,	track the press releases by dozens of vendors.  With the
,	TOP500 report now generally and easily available it is
,	possible to present an analysis of the state of High
,	Performance Computing (HPC) in the U.S.  This note
,	summarizes some of the most important observations about HPC
,	in the U.S. as of late 1995, in particular the continued
,	dominance of the world market in HPC by the U.S., the market
,	penetration by commodity microprocessor based systems, and
,	the growing industrial use of supercomputers.

file    <a href="ut-cs-96-325.ps">ut-cs-96-325.ps</a>
by      Aad J. van der Steen and Jack J. Dongarra,
title   Overview of Recent Supercomputers,
ref     University of Tennessee Technical Report CS-96-325,
,       April 1996.
for     In this report we give an overview of parallel- and
,       vector computers which are currently available or will become
,       available within a short time frame from vendors; no attempt
,       is made to list all machines that are still in the research
,       phase.  The machines are described according to their
,       architectural class.  Shared- and distributed memory SIMD-
,       and MIMD machines are discerned.  The information about each
,       machine is kept as compact as possible.  Moreover, no attempt
,       is made to quote prices as these are often even more elusive
,       than the performance of a system.  This document reflects the
,       technical state of the supercomputer arena as accurately as
,       possible.  However, neither the authors nor their employers take any
,       responsibility for errors or mistakes in this document.  We
,       encourage anyone who has comments or remarks on the contents
,       to inform us, so we can improve this work.

file    <a href="ut-cs-96-329.ps">ut-cs-96-329.ps</a>
by      Shirley Browne, Jack Dongarra, Kay Hohn, and Tim
,       Niesen,
title   Software Repository Interoperability,
ref     University of Tennessee Technical Report CS-96-329,
,       July 1996.
for     A number of academic, commercial, and government
,       software repositories currently exist that provide access to
,       software packages, reusable software components, and related
,       documents, either via the Internet or via
,       intraorganizational intranets.  It is highly desirable,
,       both for user convenience and savings in duplication of
,       effort, that these repositories interoperate.  This paper
,       describes interoperability standards that have already been
,       developed as well as those under development by the Reuse
,       Library Interoperability Group (RIG).  These standards
,       include a data model for a common semantics for describing
,       software resources, as well as frameworks for describing
,       software certification policies and intellectual property
,       rights.  The National HPCC Software Exchange (NHSE) is
,       described as an example of an organization that is achieving
,       interoperation between government and academic HPCC software
,       repositories, in part through adoption of RIG standards.

file    <a href="ut-cs-96-342.ps">ut-cs-96-342.ps</a>
by      Jack J. Dongarra, Hans W. Meuer, and Erich
,       Strohmaier,
title   TOP500 Report 1996,
ref     University of Tennessee Technical Report CS-96-342,
,       November 1996.
for     This report is a snapshot of the state of
,       supercomputer installations in the world.  It is based on
,       the TOP500 list that was published in November 1996 and
,       includes trends from the previous lists from June 1993 till
,       November 1996.

file    <a href="ut-cs-96-343.ps">ut-cs-96-343.ps</a>
by      Henri Casanova, Jack Dongarra, and Keith Seymour,
title   Client User's Guide to NetSolve
ref     University of Tennessee Technical Report CS-96-343,
,       December 1996.
for     The NetSolve system, developed at the University of
,       Tennessee, is a client-server application designed to solve
,       computational science problems over a network.  Users may
,       access NetSolve computational servers through C, Fortran,
,       MATLAB, or Java interfaces.  This document briefly presents
,       the basics of the system.  It then describes in detail how
,       the different clients can contact the NetSolve system to
,       have some computation performed, thanks to numerous
,       examples.  Complete reference manuals are given in the
,       appendixes.

file    <a href="ut-cs-96-338.ps">ut-cs-96-338.ps</a>
by      Shirley V. Browne and James W. Moore,
title   Reuse Library Interoperability and the World Wide
,       Web,
ref     University of Tennessee Technical Report CS-96-338,
,       October 1996.
for     The Reuse Library Interoperability Group (RIG) was
,       formed in 1991 for the purpose of drafting standards
,       enabling the interoperation of software reuse libraries.  At
,       that time, prevailing wisdom among many reuse library
,       operators was that each should be a stand-alone operation.
,       Many operators saw a need for only a single library, their
,       own, and most strived to provide the most general possible
,       services to appeal to a broad community of users.  The ASSET
,       program, initiated by the Advanced Research Project Agency
,       STARS program, was the first to make the claim that it
,       should properly be one part of a network of interoperating
,       libraries.  Shortly thereafter, the RIG was formed,
,       initially as a collaboration between the STARS program and
,       the Air Force RAASP program, but growing within six months
,       to a self-sustaining cooperation among twelve chartering
,       organizations.  The RIG has grown to include over twenty
,       members from government, industry, and academic reuse
,       libraries.  It has produced a number of technical reports
,       and proposed interoperability standards, some of which are
,       described in this report.

file    <a href="ut-cs-97-346.ps">ut-cs-97-346.ps</a>
by      Keith Moore, Shirley Browne, Jason Cox, and Jonathan
,       Gettler,
title   Resource Cataloging and Distribution System,
ref     University of Tennessee Technical Report CS-97-346,
,       January 1997.
for     We describe an architecture for cataloging the
,       characteristics of Internet-accessible resources, for
,       replicating such resources to improve their accessibility,
,       and for registering the current locations of the resources
,       so replicated.  Message digests and public-key
,       authentication are used to ensure the integrity of the files
,       provided to users.  The service is designed to provide
,       increased functionality with only minimal changes to either
,       a client or a server.  Resources can be named either by URNs
,       or by existing URLs, and either type of resource name can be
,       resolved to a description and ultimately to a set of
,       locations from which the resource can be retrieved.

file    <a href="ut-cs-97-350.ps">ut-cs-97-350.ps</a>
by      Pierre-Yves Calland, Jack Dongarra, and Yves Robert,
title   Tiling with limited resources,
ref     University of Tennessee Technical Report CS-97-350,
,       February 1997.
for     In the framework of perfect loop nests with uniform
,       dependences, tiling has been extensively studied as a
,       source-to-source program transformation.  Little work has
,       been devoted to the mapping and scheduling of the tiles on
,       to physical processors.  We present several new results in
,       the context of limited computational resources, and assuming
,       communication-computation overlap.  In particular, under
,       some reasonable assumptions, we derive the optimal mapping
,       and scheduling of tiles to physical processors.

file    <a href="ut-cs-97-351.ps">ut-cs-97-351.ps</a>
by      Ronald F. Boisvert, Shirley V. Browne, Jack J.
,       Dongarra, Eric Grosse, and Bruce Miller,
title   Interactive and Dynamic Content in Software
,       Repositories,
ref     University of Tennessee Technical Report CS-97-351,
,       February 1997.
for     The goal of our software repository research is to
,       improve access to tools for doing computational science for
,       both expert and non-expert users.  We are exploring the use
,       of emerging Web and network technologies for enhancing
,       repository usability and interactivity.  Technologies such
,       as Java, Inferno/Limbo, and remote execution services can
,       interactively assist users in searching for, selecting, and
,       using scientific software and computational tools.  This
,       paper describes various related prototype experimental
,       interfaces and services we have developed for traversing a
,       software classification hierarchy, for selection of software
,       and test problems, and for remote execution of library
,       software.  After developing and testing our research
,       prototypes, we deploy them in working network services
,       useful to the computational science community.

file    <a href="ut-cs-97-354.ps">ut-cs-97-354.ps</a>
by      Erich Strohmaier,
title   Statistical Performance Modeling:  Case Study of the
,       NPB 2.1 Results,
ref     University of Tennessee Technical Report CS-97-354,
,       March 1997.
for     With the results of version 2.1, a consistent set
,       of performance measurements of the NAS Parallel Benchmarks
,       (NPB) is available.  Unchanged portable MPI code was used
,       for this set of 269 single measurements.  In this study we
,       investigate how this amount of information can be condensed.
,       We present a methodology for analyzing performance data not
,       requiring detailed knowledge of the codes.  For this we
,       study several different generic timing models and fit the
,       reported data.  We show that with a joint timing model for
,       all codes and all systems the data can be fitted reasonably
,       well.  This model also contains only a minimal set of free
,       parameters.  This method is usable in all cases where the
,       analysis of results from complex application code benchmarks
,       is necessary.

file    <a href="ut-cs-97-360.ps">ut-cs-97-360.ps</a>
by      Frederic Desprez, Jack Dongarra, Fabrice Rastello,
,       and Yves Robert,
title   Determining the Idle Time of a Tiling: New Results,
ref     University of Tennessee Technical Report CS-97-360,
,       May 1997.
for     In the framework of perfect loop nests with uniform
,       dependencies, tiling has been studied extensively as 
,       a source-to-source program transformation.  We build
,       upon recent results by Hogsted, Carter, and Ferrante
,       [10], who aim at determining the cumulated idle time
,       spent by all processors while executing the partitioned
,       (tiled) computation domain.  We propose new, much 
,       shorter proofs of all their results and extend these 
,       in several important directions.  More precisely, we
,       provide an accurate solution for all values of the
,       rise parameter that relates the shape of the iteration
,       space to that of the tiles, and for all possible
,       distributions of the tiles to processors.   In contrast,
,       the authors in [10] deal only with a limited number of
,       cases and provide upper bounds rather than exact
,       formulas.

file    <a href="ut-cs-97-371.ps">ut-cs-97-371.ps</a>
by      Antoine Petitet
title   Algorithmic Redistribution Methods for Block Cyclic
,       Decompositions
ref     University of Tennessee Technical Report CS-97-371,
,       July 1997.
for     This research aims at creating and providing a
,       framework to describe algorithmic redistribution methods
,       for various block cyclic decompositions.  To do so,
,       properties of this data distribution scheme are
,       formally exhibited.  The examination of a number of
,       basic dense linear algebra operations illustrates the
,       application of those properties.  This study analyzes
,       the extent to which the general two-dimensional block
,       cyclic data distribution allows for the expression of
,       efficient as well as flexible matrix operations.  This
,       study also quantifies theoretically and practically
,       how much of the efficiency of optimal block cyclic
,       data layouts can be maintained.
,       First, the general block cyclic decomposition scheme is shown
,       to allow for the expression of flexible basic matrix
,       operations with little impact on the performance and
,       efficiency delivered by optimal and restricted kernels
,       available today.  Second, block cyclic data layouts,
,       such as the purely scattered distribution, which seem
,       less promising as far as performance is concerned, are
,       shown to be able to achieve optimal performance and
,       efficiency for a given set of matrix operations.
,       Consequently, this research not only demonstrates that the
,       restrictions imposed by the optimal block cyclic data
,       layouts can be alleviated, but also that efficiency and
,       flexibility are not antagonistic features of the block
,       cyclic mappings.  These results are particularly relevant
,       to the design of dense linear algebra software libraries
,       as well as to data parallel compiler technology.

file	<a href="overview98.ps">overview98.ps</a>
by	Aad J. van der Steen, Jack Dongarra
title	Overview of recent supercomputers
date	Feb 1998
size	748k
for     In this report we give an overview of parallel and vector
,	computers which are currently available or will become
,	available within a short time frame from vendors; no
,	attempt is made to list all machines that are still in
,	the research phase. The machines are described according
,	to their architectural class.  Shared and
,	distributed-memory SIMD and MIMD machines are discerned.
,	The information about each machine is kept as compact as
,	possible. Moreover, no attempt is made to quote price
,	information as this is often even more elusive than the
,	performance of a system.
</pre>
</body>
</html>