
binary distribution issues



Clint,

I have been playing around with ATLAS and have been thinking
about how I would create a binary installation package for it.  I
have a fair amount of experience and expertise building binary
installation packages for HP machines running HP-UX, but I don't
see an easy way to convert ATLAS to a binary package.  (Many
of the arguments hold for other environments, such as Linux where
vendors might want to create binary installation packages for
their distributions.)

The key issues and relevant facts as I see them are:
	- ATLAS really needs to know a fair amount about the
	  system it will eventually run on, such as the cache size
	- ATLAS actually needs to do the experiments on the
	  target machine, not the build machine
	- A single binary distribution may be installed on a wide
	  variety of hardware instances (different cache sizes, 
	  CPU versions, ...)
	- Much of the latency in the tuning process is due to the
	  compilation time
	- ATLAS appears to do a full grid search for optimization
	- HP-UX does not include an ANSI-C compiler by default
	  (it is an add-on product)

I am thinking about the following solution:
	- As part of the packaging process, the system would
	  pre-build binary versions of all parameter combinations
	  so no compilation is necessary on the target machine
	- During the installation process, ATLAS would run a
	  program that does the timing using the pre-compiled
	  routines to find the optimal configuration.  It would
	  then build libatlas.a and optionally liblapack.a.
	- I have been thinking that perhaps ATLAS might further 
	  speed up installation by using an optimization algorithm 
	  to locate the best configuration, rather than a full grid 
	  search.
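
A minimal sketch of the search idea above, in Python, comparing a full
grid search with a simple greedy neighbor search (hill climbing).  The
parameter names (block size, unroll factor), the AXES values, and
demo_cost are hypothetical stand-ins; in a real installer the cost
function would time a pre-compiled kernel on the target machine.  Hill
climbing is only one possible optimization algorithm, and on a noisy or
multi-modal timing surface it can stop at a local minimum.

```python
from itertools import product

# Hypothetical tuning grid: (block size, unroll factor).
AXES = [[16, 24, 32, 40, 48, 56, 64, 72, 80], [1, 2, 4, 8]]

def demo_cost(config):
    """Synthetic stand-in for timing one pre-built kernel (lower is better)."""
    nb, unroll = config
    return (nb - 56) ** 2 / 100 + (unroll - 4) ** 2

def grid_search(axes, cost):
    """Time every configuration; return (best_config, number_of_timings)."""
    configs = list(product(*axes))
    return min(configs, key=cost), len(configs)

def hill_climb(axes, cost, start):
    """Greedy neighbor search over grid indices, starting at index tuple start.

    Returns (best_config_found, number_of_distinct_timings).
    """
    cache = {}  # index tuple -> measured cost, so no point is timed twice

    def timed(idx):
        if idx not in cache:
            cache[idx] = cost(tuple(a[i] for a, i in zip(axes, idx)))
        return cache[idx]

    current = tuple(start)
    while True:
        # Neighbors differ from the current point by one step on one axis.
        neighbors = []
        for d in range(len(axes)):
            for step in (-1, 1):
                j = current[d] + step
                if 0 <= j < len(axes[d]):
                    neighbors.append(current[:d] + (j,) + current[d + 1:])
        best = min(neighbors, key=timed)
        if timed(best) >= timed(current):
            break  # no neighbor is faster: a local minimum
        current = best
    return tuple(a[i] for a, i in zip(axes, current)), len(cache)

if __name__ == "__main__":
    print(grid_search(AXES, demo_cost))
    print(hill_climb(AXES, demo_cost, (0, 0)))
```

On this smooth synthetic surface both searches agree on the best
configuration, and the greedy search times fewer points than the full
grid.  Real timing data would need repeated measurements, and probably
several starting points, before trusting the result.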

It is not that I am against building systems from source, since
I do that for nearly all my own software.  Rather, it is that most
people have neither the time nor the skill to build everything from
scratch, and yet it would still be useful if they could realize the
full capabilities of the software they have on their machine.

An example of how this might be useful:  Travis Oliphant has
been assembling RPM packages for various number-crunching
libraries, such as LAPACK.  He is building a MATLAB-like
environment in Python using these packages
(http://numpy.sourceforge.net/).  It would be very nice if users
who simply want the high-level environment were able to get
the benefits of ATLAS's fast, tuned array operations without
having to rebuild it themselves.

Does this sound interesting to you?  Do you know of anyone 
else working on this problem?  Would you be amenable to 
integrating patches to enable this sort of capability in ATLAS?
Am I missing something which makes this impossible
or undesirable?

Cheers,

Carl Staelin

PS Some quick information on my background:  I am a co-author
    of the lmbench micro-benchmark suite, and I am the author of
    mkpkg which helps automatically generate SD-UX binary
    installation packages.  I also worked very closely with the
    Liverpool Porting and Archive Centre to help them build a
    library of binary installation packages for HP-UX.  My homepage
    is http://www.hpl.hp.com/personal/Carl_Staelin and the
    lmbench home page is http://www.bitmover.com/lmbench.

PPS I have also been thinking that it might be possible to create
    lmbench-like micro-benchmarks to measure more aspects of
    CPU performance to try to predict performance for each
    configuration.  These predictions could then be used to seed
    the search algorithm with likely configurations.  It looks like
    you already have some benchmarks, but I don't know what
    additional benchmarks might be needed to provide a rough
    prediction of performance for each grid point.
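
A minimal sketch of seeding the search with predictions, in Python.
Everything here is an assumption made for illustration: the cache size
is taken as already measured by a micro-benchmark, predict_cost uses a
crude rule of thumb (three square blocks of doubles should fit in
cache), and measure_cost is a synthetic stand-in for timing a
pre-compiled kernel.  None of these names come from ATLAS or lmbench.

```python
import math
from itertools import product

# Hypothetical tuning grid: (block size, unroll factor).
AXES = [[16, 24, 32, 40, 48, 56, 64], [1, 2, 4, 8]]

CACHE_BYTES = 32 * 1024  # assumed result of a cache-size micro-benchmark

def predict_cost(config):
    """Rough prediction: distance from the block size at which three
    nb x nb blocks of 8-byte doubles just fit in cache (an assumed model)."""
    nb, _unroll = config
    nb_ideal = math.isqrt(CACHE_BYTES // (3 * 8))
    return abs(nb - nb_ideal)

def measure_cost(config):
    """Synthetic stand-in for actually timing one configuration."""
    nb, unroll = config
    return (nb - 40) ** 2 / 100 + (unroll - 4) ** 2

def seeded_search(axes, predict, measure, k):
    """Rank all configurations by predicted cost, time only the top k."""
    configs = list(product(*axes))
    shortlist = sorted(configs, key=predict)[:k]
    timings = {c: measure(c) for c in shortlist}
    return min(timings, key=timings.get), timings

if __name__ == "__main__":
    best, timings = seeded_search(AXES, predict_cost, measure_cost, k=8)
    print(best, len(timings))
```

The prediction only has to be good enough to put the true optimum on
the shortlist; the final choice still comes from measured timings.  The
shortlist could also serve as starting points for a local search
instead of being timed exhaustively.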