.\" ---------------------- sec8.t ---------------------- .sh 1 "The Compiler" .lp The compiler translates Prolog source files into byte-code object files. It is written entirely in Prolog. The byte code for the compiler can be found in the SB-Prolog system directory \fIcmplib\fR, with the source code resident in \fIcmplib/src\fR. .pp Byte code files may be concatenated together to produce other byte code files. Thus, for example, if \fIfoo1\fR and \fIfoo2\fR are byte code files resulting from the compilation of two Prolog source programs, then the file \fIfoo\fR, obtained by executing the shell command .(l cat foo1 foo2 > foo .)l is a byte code file as well, and may be loaded and executed. In this case, loading and executing the file \fIfoo\fR would give the same result as loading \fIfoo1\fR and \fIfoo2\fR separately, which in turn would be the same as concatenating the original source files and compiling this larger file. This makes it easier to compile large programs: one need only break them into smaller pieces, compile the individual pieces, and concatenate the byte files together. .pp The following sections describe the various aspects of the compiler in more detail. .sh 2 "Structure of the Compiler" .lp The compilation of a Prolog program proceeds in four phases. In the first phase, the source program is read in and passed through a preprocessor. The output of the preprocessor is essentially another Prolog source program. Next, this program is transformed into an internal representation (essentially an annotated abstract syntax tree). The third phase involves generating WAM ``assembly'' code and peephole optimization. Finally, the WAM assembly code is processed by an assembler which generates byte code. .sh 2 "Invoking the Compiler" .lp The compiler is invoked through the Prolog predicate \fIcompile\fR: .(l | ?- compile(\fIInFile\fR [, \fIOutFile\fR ] [, \fIOptionsList\fR ]). .)l where optional parameters are enclosed in brackets. \fIInFile\fR is the name of the input (i.e. source) file; \fIOutFile\fR is the name of the output file (i.e. byte code) file; \fIOptionsList\fR is a list of compiler options (see below). .pp The input and output file names must be Prolog atoms, i.e. either begin with a lower case letter or dollar sign `$', and consist only of letters, digits, and underscores; or, be enclosed within single quotes. If the output file name is not specified, it defaults to \fIInFile\fB.out\fR. The list of options, if specified, is a Prolog list, i.e. a term of the form .(l [ \fIoption\fR\*<1\*>, \fIoption\fR\*<2\*>, ..., \fIoption\*\fR ]. .)l If left unspecified, it defaults to the empty list [\^]. .pp In fact, the output file name and the options list may be specified in any order. Thus, for example, the queries .(l | ?- compile('/usr/debray/foo', foo_\|out, [v]). and | ?- compile('/usr/debray/foo', [v], foo_\|out). .)l are equivalent, and specify that the Prolog source file `\fI/usr/debray/foo\fR' is to be compiled in verbose mode (see ``Compiler Options'' below), and that the byte code is to be generated into the file \fIfoo_out\fR. .pp The \fIcompile\fR predicate may also be called with a fourth parameter: .(l | ?- compile(\fIInFile\fR, \fIOutFile\fR, \fIOptionsList\fR, \fIPredList\fR). .)l where \fIInFile\fR, \fIOutFile\fR and \fIOptionsList\fR are as before; \fIcompile\fR/4 unifies \fIPredList\fR with a list of terms \fIP\fR/\fIN\fR denoting the predicates defined in \fIInFile\fR, where \fIP\fR is a predicate name and \fIN\fR its arity. \fIPredList\fR, if specified, is usually given as an uninstantiated variable; its principal use is for setting trace points on the predicates in the file (see Section 6), e.g. by executing .(l | ?- compile('/usr/debray/foo', foo_\|out, [v], L), load(foo_\|out), trace(L). .)l Notice that \fIPredList\fR can only appear in \fIcompile\fR/4. .sh 2 "Compiler Options" .lp The following options are currently recognized by the compiler: .ip \fBa\fR Specifies that an ``assembler'' file is to be created. The name of the assembler file is obtained by appending ``.asl'' to the source file name. While the writing out of assembly code slows down the compilation process to some extent, it allows the assembler to do a better job of optimizing away indirect subroutine linkages (since in this case the assembler has assembly code for the entire program to work with at once, not just a single predicate). This results in code that is faster and more compact. .ip \fBd\fR Dumps expanded macros to the user (see Section 10). .ip \fBe\fR Expand macros (see Section 10). .ip \fBi\fR Specifies that indexes are to be generated. The pragma file (see Section 14) corresponding to the source file is consulted for information regarding predicates which should have indexes constructed, and the arguments on which indexes are to be constructed. .ip \fBt\fR If specified, turns off assembler optimizations that eliminate indirect branches through the symbol table in favour of direct branches. This is useful in debugging compiled code. It is \fInecessary\fR if the extension table feature is to be used. .ip \fBv\fR If specified, compiles in ``verbose'' mode, which causes messages regarding progress of compilation to be printed out. .sh 2 "A Note on Coding for Efficiency" .pp The SB-Prolog system tends to favour programs that are relatively pure. Thus, for example, \fIassert\fRs tend to be quite expensive, encouraging the user to avoid them if possible. This section points out some syntactic constructs that lead to the generation of efficient code. These involve (\fIi\fR\^) avoiding the creation of backtrack points; and (\fIii\fR\^) minimizing data movement between registers. Optimization of logic programs is an area of ongoing research, and we expect to enhance the capabilities of the system further in future versions. .sh 3 "Avoiding Creation of Backtrack Points" .pp Since the creation of backtrack points is relatively expensive, program efficiency may be improved substantially by using constructs that avoid the creation of backtrack points where possible. The SB-Prolog compiler recognizes conditionals involving certain complementary inline tests, and generates code that does not create choice points for such cases. Two inline tests \fIp\fR(@X bar@) and \fIq\fR(@X bar@) are complementary if and only if \fIp\fR(@X bar@) \(== \fInot\fR(\fIq\fR(@X bar@)). For example, the literals `\fIX\fR > \fIY\fR\^' and `\fIX\fR =< \fIY\fR\^' are complementary. At this point, complementary tests are recognized as such only if their argument tuples are identical. The inline predicates that are treated in this manner, with their corresponding complementary literals, are shown in Table 3. .(z .sp .TS allbox center tab(%); ce ce le le. \fIInline Test%Complementary Test\fR >/2%=/2 >=/2%=/2 =:=/2%=\e=/2 =\e=/2%=:=/2 var/1%nonvar/1 nonvar/1%var/1 .TE .sp .ce Table 3: Complementary tests recognized by the compiler .sp .)z The syntactic constructs recognized are: .ip (\fIi\fR\^) Disjuncts of the form .(l \fIhead\fR :\- (( \fItest\fR(@X bar@), ... ) ; ( not(\fItest\fR(@X bar@)), ...)). .)l or .(l \fIhead\fR :\- (( \fItest\fR(@X bar@), ... ) ; ( \fIcomp_\|test\fR(@X bar@), ...)). .)l where \fItest\fR is one of the inline tests in the table above, and \fIcomp_\|test\fR the corresponding complementary test (note that the arguments to \fItest\fR and \fIcomp_\|test\fR have to be identical). .ip (\fIii\fR\^) Conditionals of the form .(l \fIhead\fR :\- (\fItest\fR\*<1\*>\fB,\fR ... \fB,\fR \fItest\*\fR) \-> \fITrue_\|Case\fR ; \fIFalse_\|Case\fR. .)l or .(l \fIhead\fR :\- (\fItest\fR\*<1\*>\fB;\fR ... \fB;\fR \fItest\*\fR) \-> \fITrue_\|Case\fR ; \fIFalse_\|Case\fR. .)l where each \fItest\*\fR is an inline test, as mentioned in the table above. .pp The code generated for these cases involves a test and conditional branch, and no choice point is created. We expect future versions of the translator to recognize a wider class of complementary tests. .pp Notice that this discourages the use of explicit cuts. For example, whereas a choice point will be created for .(l part(M,[E|L],U1,U2) :\- ((E =< M, !, U1 = [E|U1a], U2 = U2a) ; (U1 = U1a, U2 = [E|U2a])), part(M,L,U1a,U2a). .)l no choice point will be created for either .(l part(M,[E|L],U1,U2) :\- (E =< M \-> (U1 = [E|U1a], U2 = U2a) ; (U1 = U1a, U2 = [E|U2a])), part(M,L,U1a,U2a). .)l or .(l part(M,[E|L],U1,U2) :\- ((E =< M, U1 = [E|U1a], U2 = U2a) ; (E > M, U1 = U1a, U2 = [E|U2a])), part(M,L,U1a,U2a). .)l Thus, either of the two later versions will be more efficient than the version with the explicit cut. .sh 3 "Minimizing Data Movement Between Registers" .pp Data movement between registers for parameter passing may be minimized by leaving variables in the same argument position wherever possible. Thus, the clause .(l p(X,Y) :\- p1(X,Y,0). .)l is preferable to .(l p(X,Y) :\- p1(0,X,Y). .)l because the first definition leaves the variables \fIX\fR and \fIY\fR in the same argument positions (first and second, respectively), while the second definition does not. .sh 2 "Assembly" .pp The SB-Prolog assembler can be invoked by loading the compiler and using the predicate \fI$asm\fR/3: .(l | ?- $asm(\fIInFile\fR, \fIOutFile\fR, \fIOptionsList\fR). .)l where \fIInFile\fR is a Prolog atom which is the name of a WAM assembly source file (e.g. the ``.asl'' file generated when a Prolog program is compiled with the ``a'' option), \fIOutFile\fR is an atom which is the name of the intended byte code file, and \fIOptionsList\fR is a list of options. The options recognized by the assembler are: .ip \fBv\fR ``Verbose'' mode. Prints out information regarding progress of assembly. .ip \fBt\fR ``Trace''. If specified, the assembler generates code to force procedure calls to branch indirectly via the symbol table, instead of using a direct branch. This is Useful for tracing compiled code. It is \fInecessary\fR if the extension table feature is to be used. .lp The assembler is intended primarily to support the compiler, so the assembly language syntax is quirky in places. The interested reader is advised to look at the assembly files resulting from compilation with the ``a'' option for more on SB-Prolog assembler syntax. .\" ---------------------- end of section sec8.t ----------------------