NetBuild Overview

Introduction.  NetBuild is a suite of tools designed to aid users in making use of computational software libraries that are stored on the network, without needing to have those libraries preinstalled on each user's computer. Instead, the NetBuild client will determine which libraries are not installed, identify suitable versions of those libraries that are accessible from the network, download those libraries, and link them into the user's program.

Since NetBuild is intended for use in high-performance computation, it supports the ability to select between several different versions of a library that might be available, each optimized for the specific characteristics of a target platform.

NetBuild libraries are cryptographically signed and verified in order to deter attempts to modify the downloaded libraries to attack users' systems.

NetBuild is easy to use. The user simply types nb2 followed by the command that would compile and link the program if the libraries were installed locally. This works for compile or link commands, shell scripts, and Makefiles.

So for instance, instead of typing

    f77 program.f -llapack -lblas

the user would type

    nb2 f77 program.f -llapack -lblas

Or instead of typing

    make program

the user would type

    nb2 make program

nb2 then runs the supplied command in an altered environment whose PATH variable has a directory prepended to it. That directory contains shims that have the same names as the compilers and linkers that nb2 needs to intercept.
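
The PATH manipulation described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the nb2 implementation: the function names and the example shim directory are hypothetical, and nothing here is taken from the NetBuild source.

```python
import os
import subprocess


def shimmed_environment(shim_dir, base_env=None):
    """Return a copy of the environment with shim_dir placed first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH", "")
    # Prepending means a shim named "f77" is found before the real f77.
    env["PATH"] = shim_dir + os.pathsep + old_path if old_path else shim_dir
    return env


def run_with_shims(command, shim_dir):
    """Run command (a list of strings) so that the shims shadow the real tools.

    Returns the exit status of the command.
    """
    return subprocess.call(command, env=shimmed_environment(shim_dir))


# Hypothetical usage; the shim directory name is an assumption:
#   run_with_shims(["make", "program"], "/tmp/netbuild-shims")
```

Because child processes inherit the environment, a compiler started indirectly by make or by a shell script resolves to the shim as well, which is why the same approach covers Makefiles and scripts.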

Whenever one of those compilers or linkers is invoked, whether directly by nb2, by make, or by some other compilation tool, the shim is run instead of the real compiler. The shim parses the compiler's arguments, looking for the names of libraries that need to be linked in. If those libraries are not installed on the system, the shim downloads them, verifies their signatures, and extracts them into an empty directory. Finally, the real compiler or linker is run with a modified argument list that causes the newly downloaded libraries to be linked in along with the user's program and any native libraries that are used.
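
The argument handling a shim performs might look roughly like the following Python sketch. It assumes that libraries are requested with -lNAME options, that an installed library appears as libNAME.a or libNAME.so in a search directory, and that adding a -L option is enough to make the downloaded copies visible; all of these, and every function name, are assumptions made for illustration. Downloading and signature verification are omitted.

```python
import os


def requested_libraries(argv):
    """Return library names given as -lNAME or '-l NAME', in order."""
    names = []
    args = iter(argv)
    for arg in args:
        if arg == "-l":
            name = next(args, None)  # separate-word form: -l NAME
            if name is not None:
                names.append(name)
        elif arg.startswith("-l") and len(arg) > 2:
            names.append(arg[2:])
    return names


def is_installed(name, search_dirs):
    """True if libNAME.a or libNAME.so exists in any search directory."""
    for directory in search_dirs:
        for suffix in (".a", ".so"):
            if os.path.exists(os.path.join(directory, "lib" + name + suffix)):
                return True
    return False


def missing_libraries(argv, search_dirs):
    """Libraries named on the command line that are not installed locally."""
    return [n for n in requested_libraries(argv)
            if not is_installed(n, search_dirs)]


def rewritten_arguments(argv, download_dir):
    """Add -L<download_dir> after the compiler name; keep all else as is."""
    return [argv[0], "-L" + download_dir] + list(argv[1:])
```

For the command f77 program.f -llapack -lblas on a system that has only BLAS installed, missing_libraries would report lapack, and the real compiler would then be run with the rewritten argument list.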

System Overview.  The relationship between the various NetBuild tools is shown in Figure 1. The top part of the figure shows how libraries are constructed, the middle shows how they are stored on a server, and the bottom part of the figure shows how they are used by the NetBuild client to construct executables.
Figure 1. NetBuild toolchain

There are two ways of building library packages. The first uses the nc program, which attempts to automatically construct metadata appropriate to the target platform. The idea is that the library maintainer can simply type nc make in the top-level source-code directory, and this will produce a library complete with metadata. The nc-checkin tool can then extract the metadata from the object files, sign the resulting library, and check it into the server.

nc works much like the NetBuild client nb2 in that it intercepts calls to compilers, examines their options, and then invokes the real compiler. After the real compiler has produced its object file, nc annotates that object file with metadata describing the platform for which it was compiled. The nc approach is intended to relieve the library package maintainer of the burden of constructing metadata manually. However, it does not work well on some platforms, because those platforms' linkers make it difficult to annotate existing object files. Also, nc attempts to determine which platform features an object requires by recognizing compiler options (such as gcc's -msse option, which enables use of SSE instructions on IA32 platforms). This is not sufficient for software packages that alter the source code (via conditional compilation or code-building tools) to optimize code for fine-grained target platforms.
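
As a rough illustration of option-based feature detection, the following Python sketch maps compiler options to feature names. Only -msse comes from the text above; the other table entries (real gcc options), the feature names, and the function name are assumptions chosen for the example and do not reflect the metadata format nc actually uses.

```python
# Assumed table of gcc options and the platform feature each one implies.
# Only "-msse" is mentioned in the text; the rest are illustrative.
FLAG_FEATURES = {
    "-msse": "sse",
    "-msse2": "sse2",
    "-mavx": "avx",
}


def required_features(compiler_args):
    """Return the sorted feature names implied by the recognized options."""
    return sorted({FLAG_FEATURES[arg]
                   for arg in compiler_args
                   if arg in FLAG_FEATURES})
```

The sketch also shows the limitation noted above: a package that selects SSE code paths through conditional compilation, without passing -msse, would yield an empty feature list even though the resulting object requires SSE.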

The second way of producing a library package is more work, but also more versatile. The nc-checkin-multi tool can accept any number of files of essentially arbitrary types, combine them into a package, sign the package, and check the package into a server. In this case the library maintainer is responsible for constructing the metadata. In return, packages constructed with nc-checkin-multi can contain arbitrary files: multiple object libraries, object files, source files, include files, and a script (named post-install.sh) that is run after a package has been downloaded and verified. The post-install.sh script could even be used to build a library "on the fly" from source code, to download the library from another server, or to convert the library from a different package format such as RPM.

With either nc-checkin or nc-checkin-multi, the package produced is in a standard format that is understood by both the netbuild-checkin tool on the server and the NetBuild client nb2. The netbuild-checkin tool sanity-checks the package, extracts the metadata from it, and copies the package into the appropriate directory, from which it can be downloaded. It also maintains various summary information: the list of available packages for human consumption and the per-package netbuild.index files for use by NetBuild clients.

Finally, the NetBuild client nb2, having determined that a particular package is desired, downloads the netbuild.index file to obtain a list of candidates for that package, picks the one that seems to best fit the target platform, downloads that package file from the server, and installs it in a local directory. If all goes well with the various packages that must be downloaded, nb2 then calls the normal compiler or linker to compile local source code and link to the various libraries, whether pre-installed or downloaded by nb2.
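
The candidate-selection step might be sketched as follows in Python. The format of netbuild.index and the rule nb2 uses to rank candidates are not described here, so this example assumes that each candidate lists the platform features it requires and that the best fit is the usable candidate requiring the most features. The package names, feature names, and function name are all hypothetical.

```python
def pick_candidate(candidates, host_features):
    """Pick a package for this host, or return None if none is usable.

    candidates: list of (package_name, required_features) pairs, standing
    in for entries read from an index file (the format is an assumption).
    """
    host = set(host_features)
    # A candidate is usable only if the host offers everything it requires.
    usable = [(name, set(req)) for name, req in candidates
              if set(req) <= host]
    if not usable:
        return None
    # Assumed ranking: prefer the candidate that exploits the most host
    # features; on a tie, the earlier index entry wins.
    return max(usable, key=lambda c: len(c[1]))[0]
```

With hypothetical candidates blas-generic (no requirements), blas-sse (requires sse), and blas-sse2 (requires sse and sse2), a host that reports only sse would be given blas-sse, and a host with no recognized features would fall back to blas-generic.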


Last update: 17 May 2004