Chez Scheme Version 9 User’s Guide

Chez Scheme Version 9 User’s Guide
Copyright © 2020 Cisco Systems, Inc.
Licensed under the Apache License Version 2.0 (full copyright notice.).
Revised August 2020 for Chez Scheme Version 9.5.4
about this book

Preface

Chez Scheme is both a general-purpose programming language and an implementation of that language, with supporting tools and documentation. As a superset of the language described in the Revised⁶ Report on Scheme (R6RS), Chez Scheme supports all standard features of Scheme, including first-class procedures, proper treatment of tail calls, continuations, user-defined records, libraries, exceptions, and hygienic macro expansion. Chez Scheme supports numerous non-R6RS features. A few of these are local and top-level modules, local import, foreign datatypes and procedures, nonblocking I/O, an interactive top-level, compile-time values and properties, pretty-printing, and formatted output.

The implementation includes a compiler that generates native code for each processor upon which it runs along with a run-time system that provides automatic storage management, foreign-language interfaces, source-level debugging, profiling support, and an extensive run-time library.

The threaded versions of Chez Scheme support native threads, allowing Scheme programs to take advantage of multiprocessor or multiple-core systems. Nonthreaded versions are also available and are faster for single-threaded applications. Both 32-bit and 64-bit versions are available for some platforms. The 64-bit versions support larger heaps, while the 32-bit versions are faster for some applications.

Chez Scheme's interactive programming system includes an expression editor that, like many shells, supports command-line editing, a history mechanism, and command completion. Unlike most shells that support command-line editing, the expression editor properly supports multiline expressions.

Chez Scheme is intended to be as reliable and efficient as possible, with reliability taking precedence over efficiency if necessary. Reliability means behaving as designed and documented. While a Chez Scheme program can always fail to work properly because of a bug in the program, it should never fail because of a bug in the Chez Scheme implementation. Efficiency means performing at a high level, consuming minimal CPU time and memory. Performance should be balanced across features, across run time and compile time, and across programs and data of different sizes. These principles guide Chez Scheme language and tool design as well as choice of implementation technique; for example, a language feature or debugging hook might not exist in Chez Scheme because its presence would reduce reliability, efficiency, or both.

The compiler has been rewritten for Version 9 and generates substantially faster code than the earlier compiler at the cost of greater compile time. This is the primary difference between Versions 8 and 9.

This book (CSUG) is a companion to The Scheme Programming Language, 4th Edition (TSPL4). TSPL4 serves as an introduction to and reference for R6RS, while CSUG describes Chez Scheme features and tools that are not part of R6RS. For the reader’s convenience, the summary of forms and index at the back of this book contain entries from both books, with each entry from TSPL4 marked with a "t" in front of its page number. In the online version, the page numbers given in the summary of forms and index double as direct links into one of the documents or the other.

Additional documentation for Chez Scheme includes release notes, a manual page, and a number of published papers and articles that describe various aspects of the system’s design and implementation.

Thank you for using Chez Scheme.

Chapter 1. Introduction

This book describes Chez Scheme extensions to the Revised⁶ Report on Scheme [28] (R6RS). It contains as well a concise summary of standard and Chez Scheme forms and procedures, which gives the syntax of each form and the number and types of arguments accepted by each procedure. Details on standard R6RS features can be found in The Scheme Programming Language, 4th Edition (TSPL4) [11] or the Revised⁶ Report on Scheme. The Scheme Programming Language, 4th Edition also contains an extensive introduction to the Scheme language and numerous short and extended examples.

Most of this document also applies equally to Petite Chez Scheme, which is fully compatible with the complete Chez Scheme system but uses a high-speed interpreter in place of Chez Scheme's incremental native-code compiler. Programs written for Chez Scheme run unchanged in Petite Chez Scheme as long as they do not require the compiler to be invoked. In fact, Petite Chez Scheme is built from the same sources as Chez Scheme, with all but the compiler sources included. A detailed discussion of the impact of this distinction appears in Section 2.8.

The remainder of this chapter covers Chez Scheme extensions to Scheme syntax (Section 1.1), notational conventions used in this book (Section 1.2), the use of parameters for system customization (Section 1.3), and where to look for more information on Chez Scheme (Section 1.4).

Chapter 2 describes how one uses Chez Scheme for program development, scripting, and application delivery, plus how to get the compiler to generate the most efficient code possible. Chapter 3 describes debugging and object inspection facilities. Chapter 4 documents facilities for interacting with separate processes or code written in other languages. Chapter 5 describes binding forms. Chapter 6 documents control structures. Chapter 7 documents operations on nonnumeric objects, while Chapter 8 documents various numeric operations, including efficient type-specific operations. Chapter 9 describes input/output operations and generic ports, which allow the definition of ports with arbitrary input/output semantics. Chapter 10 discusses how R6RS libraries and top-level programs are loaded into Chez Scheme along with various features for controlling and tracking the loading process. Chapter 11 describes syntactic extension and modules. Chapter 12 describes system operations, such as operations for interacting with the operating system and customizing Chez Scheme's user interface. Chapter 13 describes how to invoke and control the storage management system and documents guardians and weak pairs. Chapter 14 describes Chez Scheme's expression editor and how it can be customized. Chapter 15 documents the procedures and syntactic forms that comprise the interface to Chez Scheme's native thread system. Finally, Chapter 16 describes various compatibility features.

The back of this book contains a bibliography, the summary of forms, and an index. The page numbers appearing in the summary of forms and the italicized page numbers appearing in the index indicate the locations in the text where forms and procedures are formally defined. The summary of forms and index includes entries from TSPL4, so that they cover the entire set of Chez Scheme features. A TSPL4 entry is marked by a "t" prefix on the page number.

Online versions and errata for this book and for TSPL4 can be found at www.scheme.com.

Acknowledgments: Michael Adams, Mike Ashley, Carl Bruggeman, Bob Burger, Sam Daniel, George Davidson, Matthew Flatt, Aziz Ghuloum, Bob Hieb, Andy Keep, and Oscar Waddell have contributed substantially to the development of Chez Scheme. Chez Scheme's expression editor is based on a command-line editor for Scheme developed from 1989 through 1994 by C. David Boyer. File compression is performed with the use of the lz4 compression library developed by Yann Collet or the zlib compression library developed by Jean-loup Gailly and Mark Adler. Implementations of the list and vector sorting routines are based on Olin Shiver’s opportunistic merge-sort algorithm and implementation. Michael Lenaghan provided a number of corrections for earlier drafts of this book. Many of the features documented in this book were suggested by current Chez Scheme users, and numerous comments from users have also led to improvements in the text. Additional suggestions for improvements to Chez Scheme and to this book are welcome.

Section 1.1. Chez Scheme Syntax

Chez Scheme extends Scheme’s syntax both at the object (datum) level and at the level of syntactic forms. At the object level, Chez Scheme supports additional representations for symbols that contain nonstandard characters, nondecimal numbers expressed in floating-point and scientific notation, vectors with explicit lengths, shared and cyclic structures, records, boxes, and more. These extensions are described below. Form-level extensions are described throughout the book and summarized in the Summary of Forms, which also appears in the back of this book.

Chez Scheme extends the syntax of identifiers in several ways. First, the sequence of characters making up an identifier’s name may start with digits, periods, plus signs, and minus signs as long as the sequence cannot be parsed as a number. For example, 0abc, +++, and .. are all valid identifiers in Chez Scheme. Second, the single-character sequences { and } are identifiers. Third, identifiers containing arbitrary characters may be printed by escaping them with \ or with |. \ is used to escape a single character (except 'x', since \x marks the start of a hex scalar value), whereas | is used to escape the group of characters that follow it up through the matching |. For example, \||\| is an identifier with a two-character name consisting of the character | followed by the character \, and |hit me!| is an identifier whose name contains a space.

In addition, gensyms (Section 7.9) are printed with #{ and } brackets that enclose both the "pretty" and "unique" names, e.g., #{g1426 e5g1c94g642dssw-a}. They may also be printed using the pretty name only with the prefix #:, e.g., #:g1426.

Arbitrary radixes from two through 36 may be specified with the prefix #nr, where n is the radix. Case is not significant, so #nR may be used as well. Digit values from 10 through 35 are specified as either lower- or upper-case alphabetic characters, just as for hexadecimal numbers. For example, #36rZZ is $35 × 36 + 35$, or 1295.

Chez Scheme also permits nondecimal numbers to be printed in floating-point or scientific notation. For example, #o1.4 is equivalent to 1.5, and #b1e10 is equivalent to 4.0. Digits take precedence over exponent specifiers, so that #x1e20 is simply the four-digit hexadecimal number equivalent to 7712.

In addition to the standard named characters #\alarm, #\backspace, #\delete, #\esc, #\linefeed, #\newline, #\page, #\return, #\space, and #\tab, Chez Scheme recognizes #\bel, #\ls, #\nel, #\nul, #\rubout, and #\vt (or #\vtab). Characters whose scalar values are less than 256 may also be printed with an octal syntax consisting of the prefix #\ followed by a three octal-digit sequence. For example, #\000 is equivalent to #\nul.

Chez Scheme's fxvectors, or fixnum vectors, are printed like vectors but with the prefix #vfx( in place of #(. Vectors, bytevectors, and fxvectors may be printed with an explicit length prefix, and when the explicit length prefix is specified, duplicate trailing elements may be omitted. For example, \#(a b c) may be printed as #3(a b c), and a vector of length 100 containing all zeros may be printed as #100(0).

Chez Scheme's boxes are printed with a #& prefix, e.g., #&17 is a box containing the integer 17.

Records are printed with the syntax #[type-name field ...], where the symbol type-name is the name of the record type and field ... are the printed representations for the contents of the fields of the record.

Shared and cyclic structure may be printed using the graph mark and reference prefixes #n= and #n#. #n= is used to mark an item in the input, and #n# is used to refer to the item marked n. For example, '(#1=(a) . #1#) is a pair whose car and cdr contain the same list, and #0=(a . #0#) is a cyclic list, i.e., its cdr is itself.

A $primitive form (see page 358) may be abbreviated in the same manner as a quote form, using the #% prefix. For example, #%car is equivalent to ($primitive car), #2%car to ($primitive 2 car), and #3%car to ($primitive 3 car).

Chez Scheme's end-of-file object is printed #!eof. If the end-of-file object appears outside of any datum within a file being loaded, load will treat it as if it were a true end of file and stop loading at that point. Inserting #!eof into the middle of a file can thus be handy when tracking down a load-time error.

Broken pointers in weak pairs (see page 406) are represented by the broken weak pointer object, which is printed #!bwp.

In addition to the standard delimiters (whitespace, open and close parentheses, open and close brackets, double quotes, semi-colon, and #), Chez Scheme also treats as delimiters open and close braces, single quote, backward quote, and comma.

The Chez Scheme lexical extensions described above are disabled in an input stream after an #!r6rs comment directive has been seen, unless a #!chezscheme comment directive has been seen since. Each library loaded implicitly via import and each RNRS top-level program loaded via the --program command-line option, the scheme-script command, or the load-program procedure is treated as if it begins implicitly with an #!r6rs comment directive.

The case of symbol and character names is normally significant, as required by the Revised⁶ Report. Names are folded, as if by string-foldcase, following a #!fold-case comment directive in the same input stream unless a #!no-fold-case has been seen since. Names are also folded if neither directive has been seen and the parameter case-sensitive has been set to #f.

The printer invoked by write, put-datum, pretty-print, and the format ~s option always prints standard Revised⁶ Report objects using the standard syntax, unless a different behavior is requested via the setting of one of the print parameters. For example, it prints symbols in the extended identifier syntax of Chez Scheme described above using hex scalar value escapes, unless the parameter print-extended-identifiers is set to true. Similarly, it does not print the explicit length or suppress duplicate trailing elements unless the parameter print-vector-length is set to true.

Section 1.2. Notational Conventions

This book follows essentially the same notational conventions as The Scheme Programming Language, 4th Edition. These conventions are repeated below, with notes specific to Chez Scheme.

When the value produced by a procedure or syntactic form is said to be unspecified, the form or procedure may return any number of values, each of which may be any Scheme object. Chez Scheme usually returns a single, unique void object (see void) whenever the result is unspecified; avoid counting on this behavior, however, especially if your program may be ported to another Scheme implementation. Printing of the void object is suppressed by Chez Scheme's waiter (read-evaluate-print loop).

This book uses the words "must" and "should" to describe program requirements, such as the requirement to provide an index that is less than the length of the vector in a call to vector-ref. If the word "must" is used, it means that the requirement is enforced by the implementation, i.e., an exception is raised, usually with condition type &assertion. If the word "should" is used, an exception may or may not be raised, and if not, the behavior of the program is undefined. The phrase "syntax violation" is used to describe a situation in which a program is malformed. Syntax violations are detected prior to program execution. When a syntax violation is detected, an exception of type &syntax is raised and the program is not executed.

Scheme objects are displayed in a typewriter typeface just as they are to be typed at the keyboard. This includes identifiers, constant objects, parenthesized Scheme expressions, and whole programs. An italic typeface is used to set off syntax variables in the descriptions of syntactic forms and arguments in the descriptions of procedures. Italics are also used to set off technical terms the first time they appear. The first letter of an identifier that is not ordinarily capitalized is not capitalized when it appears at the beginning of a sentence. The same is true for syntax variables written in italics.

In the description of a syntactic form or procedure, a pattern shows the syntactic form or the application of the procedure. The syntax keyword or procedure name is given in typewriter font, as are parentheses. The remaining pieces of the syntax or arguments are shown in italics, using names that imply the types of the expressions or arguments expected by the syntactic form or procedure. Ellipses are used to specify zero or more occurrences of a subexpression or argument.

Section 1.3. Parameters

All Chez Scheme system customization is done via parameters. A parameter is a procedure that encapsulates a hidden state variable. When invoked without arguments, a parameter returns the value of the encapsulated variable. When invoked with one argument, the parameter changes the value of the variable to the value of its argument. A parameter may raise an exception if its argument is not appropriate, or it may filter the argument in some way.

New parameters may be created and used by programs running in Chez Scheme. Parameters are used rather than global variables for program customization for two reasons: First, unintentional redefinition of a customization variable can cause unexpected problems, whereas unintentional redefinition of a parameter simply makes the parameter inaccessible. For example, a program that defines *print-level* for its own purposes in early releases of Chez Scheme would have unexpected effects on the printing of Scheme objects, whereas a program that defines print-level for its own purposes simply loses the ability to alter the printer’s behavior. Of course, a program that invokes print-level by accident can still affect the system in unintended ways, but such an occurrence is less likely, and can only happen in an incorrect program.

Second, invalid values for parameters can be detected and rejected immediately when the "assignment" is made, rather than at the point where the first use occurs, when it is too late to recover and reinstate the old value. For example, an assignment of *print-level* to -1 would not have been caught until the first call to write or pretty-print, whereas an attempted assignment of -1 to the parameter print-level, i.e., (print-level -1), is flagged as an error immediately, before the change is actually made.

Built-in system parameters are described in different sections throughout this book and are listed along with other syntactic forms and procedures in the Summary of Forms in the back of this book. Parameters marked "thread parameters" have per-thread values in threaded versions of Chez Scheme, while the values of parameters marked "global parameters" are shared by all threads. Nonthreaded versions of Chez Scheme do not distinguish between thread and global parameters. See Sections 12.13 and 15.7 for more information on creating and manipulating parameters.

Section 1.4. More Information

The articles and technical reports listed below document various features of Chez Scheme and its implementation:

syntactic abstraction ([14],[8],[17]),
modules [32],
libraries [21],
storage management ([12],[13]),
threads [10],
multiple return values [2],
optional arguments [16],
continuations ([7],[25],[3]),
eq? hashtables [20],
internal definitions, letrec, and letrec* ([33],[22]),
equal? [1],
engines [15],
floating-point printing [4],
code generation [18],
register allocation [6],
procedure inlining [31],
profiling [5], and
history of the implementation [9].

Links to abstracts and electronic versions of these publications are available at the url http://www.cs.indiana.edu/chezscheme/pubs/.

Chapter 2. Using Chez Scheme

Chez Scheme is often used interactively to support program development and debugging, yet it may also be used to create stand-alone applications with no interactive component. This chapter describes the various ways in which Chez Scheme is typically used and, more generally, how to get the most out of the system. Sections 2.1, 2.2, and 2.3 describe how one uses Chez Scheme interactively. Section 2.4 discusses how libraries and RNRS top-level programs are used in Chez Scheme. Section 2.5 covers support for writing and running Scheme scripts, including compiled scripts and compiled RNRS top-level programs. Section 2.6 describes how to structure and compile an application to get the most efficient code possible out of the compiler. Section 2.7 describes how one can customize the startup process, e.g., to alter or eliminate the command-line options, to preload Scheme or foreign code, or to run Chez Scheme as a subordinate program of another program. Section 2.8 describes how to build applications using Chez Scheme with Petite Chez Scheme for run-time support. Finally, Section 2.9 covers command-line options used when invoking Chez Scheme.

Section 2.1. Interacting with Chez Scheme

One of the simplest and most effective ways to write and test Scheme programs is to compose them using a text editor, like vi or emacs, and test them interactively with Chez Scheme running in a shell window. When Chez Scheme is installed with default options, entering the command scheme at the shell’s prompt starts an interactive Scheme session. The command petite does the same for Petite Chez Scheme. After entering this command, you should see a short greeting followed by an angle-bracket on a line by itself, like this:

Chez Scheme Version 9.5.1
Copyright 1984-2017 Cisco Systems, Inc.

>

You also should see that the cursor is sitting one space to the right of the angle-bracket. The angle-bracket is a prompt issued by the system’s "REPL," which stands for "Read Eval Print Loop," so called because it reads, evaluates, and prints an expression, then loops back to read, evaluate, and print the next, and so on. (In Chez Scheme, the REPL is also called a waiter.)

In response to the prompt, you can type any Scheme expression. If the expression is well-formed, the REPL will run the expression and print the value. Here are a few examples:

> 3
3
> (+ 3 4)
7
> (cons 'a '(b c d))
(a b c d)

The reader used by the REPL is more sophisticated than an ordinary reader. In fact, it’s a full-blown "expression editor" ("expeditor" for short) like a regular text editor but for just one expression at a time. One thing you might soon notice is that the system automatically indents the second and subsequent lines of an expression. For example, let’s say we want to define fact, a procedure that implements the factorial function. If we type (define fact followed by the enter key, the cursor should be sitting under the first e in define, so that if we then type (lambda (x), we should see:

> (define fact
    (lambda (x)

The expeditor also allows us to move around within the expression (even across lines) and edit the expression to correct mistakes. After typing:

> (define fact
    (lambda (x)
      (if (= n 0)
          0
          (* n (fact

we might notice that the procedure’s argument is named x but we have been referencing it as n. We can move back to the second line using the arrow keys, remove the offending x with the backspace key, and replace it with n.

> (define fact
    (lambda (n)
      (if (= n 0)
          0
          (* n (fact

We can then return to the end of the expression with the arrow keys and complete the definition.

> (define fact
    (lambda (n)
      (if (= n 0)
          0
          (* n (fact (- n 1))))))

Now that we have a complete form with balanced parentheses, if we hit enter with the cursor just after the final parenthesis, the expeditor will send it on to the evaluator. We’ll know that it has accepted the definition when we get another right-angle prompt.

Now we can test our definition by entering, say, (fact 6) in response to the prompt:

> (fact 6)
0

The printed value isn’t what we’d hoped for, since $6!$ is actually 720. The problem, of course, is that the base-case return-value 0 should have been 1. Fortunately, we don’t have to retype the definition to correct the mistake. Instead, we can use the expeditor’s history mechanism to retrieve the earlier definition. The up-arrow key moves backward through the history. In this case, the first up-arrow retrieves (fact 6), and the second retrieves the fact definition.

As we move back through the history, the expression editor shows us only the first line, so after two up arrows, this is all we see of the definition:

> (define fact

We can force the expeditor to show the entire expression by typing ^L (control L, i.e., the control and L keys pressed together):

> (define fact
    (lambda (n)
      (if (= n 0)
          0
          (* n (fact (- n 1))))))

Now we can move to the fourth line and change the 0 to a 1.

> (define fact
    (lambda (n)
      (if (= n 0)
          1
          (* n (fact (- n 1))))))

We’re now ready to enter the corrected definition. If the cursor is on the fourth line and we hit enter, however, it will just open up a new line between the old fourth and fifth lines. This is useful in other circumstances, but not now. Of course, we can work around this by using the arrow keys to move to the end of the expression, but an easier way is to type ^J, which forces the expression to be entered immediately no matter where the cursor is.

Finally, we can bring back (fact 6) with another two hits of the up-arrow key and try it again:

> (fact 6)
720

To exit from the REPL and return back to the shell, we can type ^D or call the exit procedure.

The interaction described above uses just a few of the expeditor’s features. The expeditor’s remaining features are described in the following section.

Running programs may be interrupted by typing the interrupt character (typically ^C). In response, the system enters a debug handler, which prompts for input with a break> prompt. One of several commands may be issued to the break handler (followed by a newline), including

"e": or end-of-file to exit from the handler and continue,
"r": to stop execution and reset to the current café,
"a": to abort Chez Scheme,
"n": to enter a new café (see below),
"i": to inspect the current continuation,
"s": to display statistics about the interrupted program, and
"?": to display a list of these options.

When an exception other than a warning occurs, the default exception handler prints a message that describes the exception to the console error port. If a REPL is running, the exception handler then returns to the REPL, where the programmer can call the debug procedure to start up the debug handler, if desired. The debug handler is similar to the break handler and allows the programmer to inspect the continuation (control stack) of the exception to help determine the cause of the problem. If no REPL is running, as is the case for a script or top-level program run via the --script or --program command-line options, the default exception handler exits from the script or program after printing the message. To allow scripts and top-level programs to be debugged, the default exception handler can be forced via the debug-on-exception parameter or the --debug-on-exception command-line option to invoke debug directly.

Developing a large program entirely in the REPL is unmanageable, and we usually even want to store smaller programs in a file for future use. (The expeditor’s history is saved across Scheme sessions, but there is a limit on the number of items, so it is not a good idea to count on a program remaining in the history indefinitely.) Thus, a Scheme programmer typically creates a file containing Scheme source code using a text editor, such as vi, and loads the file into Chez Scheme to test them. The conventional filename extension for Chez Scheme source files is .ss, but the file can have any extension or even no extension at all. A source file can be loaded during an interactive session by typing (load "path"). Files to be loaded can also be named on the command line when the system is started. Any form that can be typed interactively can be placed in a file to be loaded.

Chez Scheme compiles source forms as it sees them to machine code before evaluating them, i.e., "just in time." In order to speed loading of a large file or group of files, each file can be compiled ahead of time via compile-file, which puts the compiled code into a separate object file. For example, (compile-file "path") compiles the forms in the file path.ss and places the resulting object code in the file path.so. Loading a pre-compiled file is essentially no different from loading the source file, except that loading is faster since compilation has already been done.

When compiling a file or set of files, it is often more convenient to use a shell command than to enter Chez Scheme interactively to perform the compilation. This is easily accomplished by "piping" in the command to compile the file as shown below.

echo '(compile-file "filename")' | scheme -q

The -q option suppresses the system’s greeting messages for more compact output, which is especially useful when compiling numerous files. The single-quote marks surrounding the compile-file call should be left off for Windows shells.

When running in this "batch" mode, especially from within "make" files, it is often desirable to force the default exception handler to exit immediately to the shell with a nonzero exit status. This may be accomplished by setting the reset-handler to abort.

echo '(reset-handler abort) (compile-file "filename")' | scheme -q

One can also redefine the base-exception-handler (Section 12.1) to achieve a similar effect while exercising more control over the format of the messages that are produced.

Section 2.2. Expression Editor

When Chez Scheme is used interactively in a shell window, as described above, or when new-cafe is invoked explicitly from a top-level program or script run via --program or --script, the waiter’s "prompt and read" procedure employs an expression editor that permits entry and editing of single- and multiple-line expressions, automatically indents expressions as they are entered, supports identifier completion outside string constants based on the identifiers defined in the interactive environment, and supports filename completion within string constants. The expression editor also maintains a history of expressions typed during and across sessions and supports tcsh-like history movement and search commands. Other editing commands include simple cursor movement via arrow keys, deletion of characters via backspace and delete, and movement, deletion, and other commands using mostly emacs key bindings.

The expression editor does not run if the TERM environment variable is not set (on Unix-based systems), if the standard input or output files have been redirected, or if the --eedisable command-line option (Section 2.9) has been used. The history is saved across sessions, by default, in the file ".chezscheme_history" in the user’s home directory. The --eehistory command-line option (Section 2.9) can be used to specify a different location for the history file or to disable the saving and restoring of the history file.

Keys for nearly all printing characters (letters, digits, and special characters) are "self inserting" by default. The open parenthesis, close parenthesis, open bracket, and close bracket keys are self inserting as well, but also cause the editor to "flash" to the matching delimiter, if any. Furthermore, when a close parenthesis or close bracket is typed, it is automatically corrected to match the corresponding open delimiter, if any.

Key bindings for other keys and key sequences initially recognized by the expression editor are given below, organized into groups by function. Some keys or key sequences serve more than one purpose depending upon context. For example, tab is used for identifier completion, filename completion, and indentation. Such bindings are shown in each applicable functional group.

Multiple-key sequences are displayed with hyphens between the keys of the sequences, but these hyphens should not be entered. When two or more key sequences perform the same operation, the sequences are shown separated by commas.

Detailed descriptions of the editing commands are given in Chapter 14, which also describes parameters that allow control over the expression editor, mechanisms for adding or changing key bindings, and mechanisms for creating new commands.

Newlines, acceptance, exiting, and redisplay:

enter, ^M

accept balanced entry if used at end of entry;

else add a newline before the cursor and indent

^J

accept entry unconditionally

^O

insert newline after the cursor and indent

^D

exit from the waiter if entry is empty;

else delete character under cursor

^Z

suspend to shell if shell supports job control

^L

redisplay entry

^L-^L

clear screen and redisplay entry

Basic movement and deletion:

leftarrow, ^B

move cursor left

rightarrow, ^F

move cursor right

uparrow, ^P

move cursor up; from top of unmodified entry,

move to preceding history entry.

downarrow, ^N

move cursor down; from bottom of unmodified entry,

move to next history entry

^D

delete character under cursor if entry not empty,

else exit from the waiter

backspace, ^H

delete character before cursor

delete

delete character under cursor

Line movement and deletion:

home, ^A

move cursor to beginning of line

end, ^E

move cursor to end of line

^K, esc-k

delete to end of line or, if cursor is at the end

of a line, join with next line

^U

delete contents of current line

When used on the first line of a multiline entry of which only the first line is displayed, i.e., immediately after history movement, ^U deletes the contents of the entire entry, like ^G (described below).

Expression movement and deletion:

esc-^F

move cursor to next expression

esc-^B

move cursor to preceding expression

esc-]

move cursor to matching delimiter

^]

flash cursor to matching delimiter

esc-^K, esc-delete

delete next expression

esc-backspace, esc-^H

delete preceding expression

Entry movement and deletion:

esc-<

move cursor to beginning of entry

esc->

move cursor to end of entry

^G

delete current entry contents

^C

delete current entry contents; reset to end of history

Indentation:

tab

re-indent current line if identifier/filename prefix

not just entered; else insert completion

esc-tab

re-indent current line unconditionally

esc-q, esc-Q, esc-^Q

re-indent each line of entry

Identifier/filename completion:

tab

insert completion if identifier/filename prefix just

entered; else re-indent current line

tab-tab

show possible identifier/filename completions at end

of identifier/filename just typed, else re-indent

^R

insert next identifier/filename completion

Identifier completion is performed outside of a string constant, and filename completion is performed within a string constant. (In determining whether the cursor is within a string constant, the expression editor looks only at the current line and so can be fooled by string constants that span multiple lines.) If at end of existing identifier or filename, i.e., not one just typed, the first tab re-indents, the second tab inserts identifier completion, and the third shows possible completions.

History movement:

uparrow, ^P

move to preceding entry if at top of unmodified

entry; else move up within entry

downarrow, ^N

move to next entry if at bottom of unmodified

entry; else move down within entry

esc-uparrow, esc-^P

move to preceding entry from unmodified entry

esc-downarrow, esc-^N

move to next entry from unmodified entry

esc-p

search backward through history for given prefix

esc-n

search forward through history for given prefix

esc-P

search backward through history for given string

esc-N

search forward through history for given string

To search, enter a prefix or string followed by one of the search key sequences. Follow with additional search key sequences to search further backward or forward in the history. For example, enter "(define" followed by one or more esc-p key sequences to search backward for entries that are definitions, or "(define" followed by one or more esc-P key sequences for entries that contain definitions.

Word and page movement:

esc-f, esc-F

move cursor to end of next word

esc-b, esc-B

move cursor to start of preceding word

^X-[

move cursor up one screen page

^X-]

move cursor down one screen page

Inserting saved text:

^Y

insert most recently deleted text

^V

insert contents of window selection/paste buffer

Mark operations:

^@, ^space, ^^

set mark to current cursor position

^X-^X

move cursor to mark, leave mark at old cursor position

^W

delete between current cursor position and mark

Command repetition:

esc-^U

repeat next command four times

esc-^U-n

repeat next command n times

Section 2.3. The Interaction Environment

In the language of the Revised⁶ Report, code is structured into libraries and "top-level programs." The Revised⁶ Report does not require an implementation to support interactive use, and it does not specify how an interactive top level should operate, leaving such details up to the implementation.

In Chez Scheme, when one enters definitions or expressions at the prompt or loads them from a file, they operate on an interaction environment, which is a mutable environment that initially holds bindings only for built-in keywords and primitives. It may be augmented by user-defined identifier bindings via top-level definitions. The interaction environment is also referred to as the top-level environment, because it is at the top level for purposes of scoping. Programs entered at the prompt or loaded from a file via load should not be confused with RNRS top-level programs, which are actually more similar to libraries in their behavior. In particular, while the same identifier can be defined multiple times in the interaction environment, to support incremental program development, an identifier can be defined at most once in an RNRS top-level program.

The default interaction environment used for any code that occurs outside of an RNRS top-level program or library (including such code typed at a prompt or loaded from a file) contains all of the bindings of the (chezscheme) library (or scheme module, which exports the same set of bindings). This set contains a number of bindings that are not in the RNRS libraries. It also contains a number of bindings that extend the RNRS counterparts in some way and are thus not strictly compatible with the RNRS bindings for the same identifiers. To replace these with bindings strictly compatible with RNRS, simply import the rnrs libraries into the interaction environment by typing the following into the REPL or loading it from a file:

(import
  (rnrs)
  (rnrs eval)
  (rnrs mutable-pairs)
  (rnrs mutable-strings)
  (rnrs r5rs))

To obtain an interaction environment that contains all and only RNRS bindings, use the following.

(interaction-environment
  (copy-environment
    (environment
      '(rnrs)
      '(rnrs eval)
      '(rnrs mutable-pairs)
      '(rnrs mutable-strings)
      '(rnrs r5rs))
    #t))

To be useful for most purposes, library and import should probably also be included, from the (chezscheme) library.

(interaction-environment
  (copy-environment
    (environment
      '(rnrs)
      '(rnrs eval)
      '(rnrs mutable-pairs)
      '(rnrs mutable-strings)
      '(rnrs r5rs)
      '(only (chezscheme) library import))
    #t))

It might also be useful to include debug in the set of identifiers imported from (chezscheme) to allow the debugger to be entered after an exception is raised.

Most of the identifiers bound in the default interaction environment that are not strictly compatible with the Revised⁶ Report are variables bound to procedures with extended interfaces, i.e., optional arguments or extended argument domains. The others are keywords bound to transformers that extend the Revised⁶ Report syntax in some way. This should not be a problem except for programs that count on exceptions being raised in cases that coincide with the extensions. For example, if a program passes the = procedure a single numeric argument and expects an exception to be raised, it will fail in the initial interaction environment because = returns #t when passed a single numeric argument.

Within the default interaction environment and those created as described above, variables that name built-in procedures are read-only, i.e., cannot be assigned, since they resolve to the read-only bindings exported from the (chezscheme) library or some other library:

(set! cons +) ⇒ exception: cons is immutable

Before assigning a variable bound to the name of a built-in procedure, the programmer must first define the variable. For example,

(define cons-count 0)
(define original-cons cons)
(define cons
  (lambda (x y)
    (set! cons-count (+ cons-count 1))
    (original-cons x y)))

redefines cons to count the number of times it is called, and

(set! cons original-cons)

assigns cons to its original value. Once a variable has been defined in the interaction environment using define, a subsequent definition of the same variable is equivalent to a set!, so

(define cons original-cons)

has the same effect as the set! above. The expression

(import (only (chezscheme) cons))

also binds cons to its original value. It also returns it to its original read-only state.

The simpler redefinition

(define cons (let () (import scheme) cons))

turns cons into a mutable variable with the same value as it originally had. Doing so, however, prevents the compiler from generating efficient code for calls to cons or producing warning messages when cons is passed the wrong number of arguments.

All identifiers not bound in the initial interaction environment and not defined by the programmer are treated as "potentially bound" as variables to facilitate the definition of mutually recursive procedures. For example, assuming that yin and yang have not been defined,

(define yin (lambda () (- (yang) 1)))

defines yin at top level as a variable bound to a procedure that calls the value of the top-level variable yang, even though yang has not yet been defined. If this is followed by

(define yang (lambda () (+ (yin) 1)))

the result is a mutually recursive pair of procedures that, when called, will loop indefinitely or until the system runs out of space to hold the recursion stack. If yang must be defined as anything other than a variable, its definition should precede the definition of yin, since the compiler assumes yang is a variable in the absence of any indication to the contrary when yang has not yet been defined.

A subtle consequence of this useful quirk of the interaction environment is that the procedure free-identifier=? (Section 8.3 of The Scheme Programming Language, 4th Edition) does not consider unbound library identifiers to be equivalent to (as yet) undefined top-level identifiers, even if they have the same name, because the latter are actually assumed to be valid variable bindings.

(library (A) (export a)
  (import (rnrs))
  (define-syntax a
    (lambda (x)
      (syntax-case x ()
        [(_ id) (free-identifier=? #'id #'undefined)]))))
(let () (import (A)) (a undefined)) ⇒ #f

If it is necessary that they have the same binding, as in the case where an identifier is used as an auxiliary keyword in a syntactic abstraction exported from a library and used at top level, the library should define and export a binding for the identifier.

(library (A) (export a aux-a)
  (import (rnrs) (only (chezscheme) syntax-error))
  (define-syntax aux-a
    (lambda (x)
      (syntax-error x "invalid context")))
  (define-syntax a
    (lambda (x)
      (syntax-case x (aux-a)
        [(_ aux-a) #''okay]
        [(_ _) #''oops]))))
(let () (import (A)) (a aux-a)) ⇒ okay
(let () (import (only (A) a)) (a aux-a)) ⇒ oops

This issue does not arise when libraries are used entirely within other libraries or within RNRS top-level programs, since the interaction environment does not come into play.

Section 2.4. Using Libraries and Top-Level Programs

An R6RS library can be defined directly in the REPL, loaded explicitly from a file (using load or load-library), or loaded implicitly from a file via import. When defined directly in the REPL or loaded explicitly from a file, a library form can be used to redefine an existing library, but import never reloads a library once it has been defined.

A library to be loaded implicitly via import must reside in a file whose name reflects the name of the library. For example, if the library’s name is (tools sorting), the base name of the file must be sorting with a valid extension, and the file must be in a directory named tools which itself resides in one of the directories searched by import. The set of directories searched by import is determined by the library-directories parameter, and the set of extensions is determined by the library-extensions parameter.

The values of both parameters are lists of pairs of strings. The first string in each library-directories pair identifies a source-file base directory, and the second identifies the corresponding object-file base directory. Similarly, the first string in each library-extensions pair identifies a source-file extension, and the second identifies the corresponding object-file extension. The full path of a library source or object file consists of the source or object base followed by the components of the library name, separated by slashes, with the library extension added on the end. For example, for base /usr/lib/scheme, library name (app lib1), and extension .sls, the full path is /usr/lib/scheme/app/lib1.sls. So, if (library-directories) contains the pathnames "/usr/lib/scheme/libraries" and ".", and (library-extensions) contains the extensions .ss and .sls, the path of the (tools sorting) library must be one of the following.

/usr/lib/scheme/libraries/tools/sorting.ss
/usr/lib/scheme/libraries/tools/sorting.sls
./tools/sorting.ss
./tools/sorting.sls

When searching for a library, import first constructs a partial name from the list of components in the library name, e.g., a/b for library (a b). It then searches for the partial name in each pair of base directories, in order, trying each of the source extensions then each of the object extensions in turn before moving onto the next pair of base directories. If the partial name is an absolute pathname, e.g., ~/.myappinit for a library named (~/.myappinit), only the specified absolute path is searched, first with each source extension, then with each object extension. If the expander finds both a source file and its corresponding object file, and the object file is not older than the source file, the expander loads the object file. If the object file does not exist, if the object file is older, or if after loading the object file, the expander determines it was built using a library or include file that has changed, the source file is loaded or compiled, depending on the value of the parameter compile-imported-libraries. If compile-imported-libraries is set to #t, the expander compiles the library via the value of the compile-library-handler parameter, which by default calls compile-library (which is described below). Otherwise, the expander loads the source file. (Loading the source file actually causes the code to be compiled, assuming the default value of current-eval, but the compiled code is not saved to an object file.) An exception is raised during this process if a source or object file exists but is not readable or if an object file cannot be created.

The search process used by the expander when processing an import for a library that has not yet been loaded can be monitored by setting the parameter import-notify to #t. This parameter can be set from the command line via the --import-notify command-line option.

Whenever the expander determines it must compile a library to a file or load one from source, it adds the directory in which the file resides to the front of the source-directories list while compiling or loading the library. This allows a library to include files stored in or relative to its own directory.

When import compiles a library as described above, it does not also load the compiled library, because this would cause portions of library to be reevaluated. Because of this, run-time expressions in the file outside of a library form will not be evaluated. If such expressions are present and should be evaluated, the library should be compiled ahead of time or loaded explicitly.

A file containing a library may be compiled with compile-file or compile-library. The only difference between the two is that the latter treats the source file as if it were prefixed by an implicit #!r6rs, which disables Chez Scheme lexical extensions unless an explicit #!chezscheme marker appears in the file. Any libraries upon which the library depends must be compiled first. If one of the libraries imported by the library is subsequently recompiled (say because it was modified), the importing library must also be recompiled. Compilation and recompilation of imported libraries must be done explicitly by default but is done automatically when the parameter compile-imported-libraries is set to #t before compiling the importing library.

As with compile-file, compile-library can be used in "batch" mode via a shell command:

echo '(compile-library "filename")' | scheme -q

with single-quote marks surrounding the compile-library call omitted for Windows shells.

An RNRS top-level-program usually resides in a file, but one can also enter one directly into the REPL using the top-level-program forms, e.g.:

(top-level-program
  (import (rnrs))
  (display "What's up?\n"))

A top-level program stored in a file does not have the top-level-program wrapper, so the same top-level program in a file is just:

(import (rnrs))
(display "What's up?\n")

A top-level program stored in a file can be loaded from the file via the load-program procedure. A top-level program can also be loaded via load, but not without affecting the semantics. A program loaded via load is scoped at top level, where it can see all top-level bindings, whereas a top-level program loaded via load-program is self-contained, i.e., it can see only the bindings made visible by the leading import form. Also, the variable bindings in a program loaded via load also become top-level bindings, whereas they are local to the program when the program is loaded via load-program. Moreover, load-program, like load-library, treats the source file as if it were prefixed by an implicit #!r6rs, which disables Chez Scheme lexical extensions unless an explicit #!chezscheme marker appears in the file. A program loaded via load is also likely to be less efficient. Since the program’s variables are not local to the program, the compiler must assume they could change at any time, which inhibits many of its optimizations.

Top-level programs may be compiled using compile-program, which is like compile-file but, as with load-program, properly implements the semantics and lexical restrictions of top-level programs. compile-program also copies the leading #! line, if any, from the source file to the object file, resulting in an executable object file. Any libraries upon which the top-level program depends, other than built-in libraries, must be compiled first. The program must be recompiled if any of the libraries upon which it depends are recompiled. Compilation and recompilation of imported libraries must be done explicitly by default but is done automatically when the parameter compile-imported-libraries is set to #t before compiling the importing library.

As with compile-file and compile-library, compile-program can be used in "batch" mode via a shell command:

echo '(compile-program "filename")' | scheme -q

with single-quote marks surrounding the compile-program call omitted for Windows shells.

compile-program returns a list of libraries directly invoked by the compiled top-level program. When combined with the library-requirements and library-object-filename procedures, the list of libraries returned by compile-program can be used to determine the set of files that must be distributed with the compiled program file.

When run, a compiled program automatically loads the run-time code for each library upon which it depends, as if via revisit. If the program also imports one of the same libraries at run time, e.g., via the environment procedure, the system will attempt to load the compile-time information from the same file. The compile-time information can also be loaded explicitly from the same or a different file via load or visit.

Section 2.5. Scheme Shell Scripts

When the --script command-line option is present, the named file is treated as a Scheme shell script, and the command-line is made available via the parameter command-line. This is primarily useful on Unix-based systems, where the script file itself may be made executable. To support executable shell scripts, the system ignores the first line of a loaded script if it begins with #! followed by a space or forward slash. For example, assuming that the Chez Scheme executable has been installed as /usr/bin/scheme, the following script prints its command-line arguments.

#! /usr/bin/scheme --script
(for-each
  (lambda (x) (display x) (newline))
  (cdr (command-line)))

The following script implements the traditional Unix echo command.

#! /usr/bin/scheme --script
(let ([args (cdr (command-line))])
  (unless (null? args)
    (let-values ([(newline? args)
                  (if (equal? (car args) "-n")
                      (values #f (cdr args))
                      (values #t args))])
      (do ([args args (cdr args)] [sep "" " "])
          ((null? args))
        (printf "~a~a" sep (car args)))
      (when newline? (newline)))))

Scripts may be compiled using compile-script, which is like compile-file but differs in that it copies the leading #! line from the source-file script into the object file.

If Petite Chez Scheme is installed, but not Chez Scheme, /usr/bin/scheme may be replaced with /usr/bin/petite.

The --program command-line option is like --script except that the script file is treated as an RNRS top-level program (Chapter 10). The following RNRS top-level program implements the traditional Unix echo command, as with the script above.

#! /usr/bin/scheme --program
(import (rnrs))
(let ([args (cdr (command-line))])
  (unless (null? args)
    (let-values ([(newline? args)
                  (if (equal? (car args) "-n")
                      (values #f (cdr args))
                      (values #t args))])
      (do ([args args (cdr args)] [sep "" " "])
          ((null? args))
        (display sep)
        (display (car args)))
      (when newline? (newline)))))

Again, if only Petite Chez Scheme is installed, /usr/bin/scheme may be replaced with /usr/bin/petite.

scheme-script may be used in place of scheme --program or petite --program, i.e.,

#! /usr/bin/scheme-script

scheme-script runs Chez Scheme, if available, otherwise Petite Chez Scheme.

It is also possible to use /usr/bin/env, as recommended in the Revised⁶ Report nonnormative appendices, which allows scheme-script to appear anywhere in the user’s path.

#! /usr/bin/env scheme-script

If a top-level program depends on libraries other than those built into Chez Scheme, the --libdirs option can be used to specify which source and object directories to search. Similarly, if a library upon which a top-level program depends has an extension other than one of the standard extensions, the --libexts option can be used to specify additional extensions to search.

These options set the corresponding Chez Scheme parameters library-directories and library-extensions, which are described in Section 2.4. The format of the arguments to --libdirs and --libexts is the same: a sequence of substrings separated by a single separator character. The separator character is a colon (:), except under Windows where it is a semi-colon (;). Between single separators, the source and object strings, if both are specified, are separated by two separator characters. If a single separator character appears at the end of the string, the specified pairs are added to the front of the existing list; otherwise, the specified pairs replace the existing list.

For example, where the separator is a colon,

scheme --libdirs "/home/moi/lib:"

adds the source/object directory pair

("/home/moi/lib" . "/home/moi/lib")

to the front of the default set of library directories, and

scheme --libdirs "/home/moi/libsrc::/home/moi/libobj:"

adds the source/object directory pair

("/home/moi/libsrc" . "/home/moi/libobj")

to the front of the default set of library directories. The parameters are set after all boot files have been loaded.

If no --libdirs option appears and the CHEZSCHEMELIBDIRS environment variable is set, the string value of CHEZSCHEMELIBDIRS is treated as if it were specified by a --libdirs option. Similarly, if no --libexts option appears and the CHEZSCHEMELIBEXTS environment variable is set, the string value of CHEZSCHEMELIBEXTS is treated as if it were specified by a --libexts option.

Section 2.6. Optimization

To get the most out of the Chez Scheme compiler, it is necessary to give it a little bit of help. The most important assistance is to avoid the use of top-level (interaction-environment) bindings. Top-level bindings are convenient and appropriate during program development, since they simplify testing, redefinition, and tracing (Section 3.1) of individual procedures and syntactic forms. This convenience comes at a sizable price, however.

The compiler can propagate copies (of one variable to another or of a constant to a variable) and inline procedures bound to local, unassigned variables within a single top-level expression. For the procedures it does not inline, it can avoid constructing and passing unneeded closures, bypass argument-count checks, branch to the proper entry point in a case-lambda, and build rest arguments (more efficiently) on the caller side, where the length of the rest list is known at compile time. It can also discard the definitions of unreferenced variables, so there’s no penalty for including a large library of routines, only a few of which are actually used.

It cannot do any of this with top-level variable bindings, since the top-level bindings can change at any time and new references to those bindings can be introduced at any time.

Fortunately, it is easy to restructure a program to avoid top-level bindings. This is naturally accomplished for portable code by placing the code into a single RNRS top-level program or by placing a portion of the code in a top-level program and the remainder in one or more separate libraries. Although not portable, one can also put all of the code into a single top-level module form or let expression, perhaps using include to bring in portions of the code from separate files. The compiler performs some optimization even across library boundaries, so the penalty for breaking a program up in this manner is generally acceptable. The compiler also supports whole-program optimization (via compile-whole-program), which can be used to eliminate all overhead for placing portions of a program into separate libraries.

Once an application’s code has been placed into a single top-level program or into a top-level program and one or more libraries, the code can be loaded from source via load-program or compiled via compile-program and compile-library, as described in Section 2.4. Be sure not to use compile-file for the top-level program since this does not preserve the semantics nor result in code that is as efficient.

With an application structured as a single top-level program or as a top-level program and one or more libraries that do not interact frequently, we have done most of what can be done to help the compiler, but there are still a few more things we can do.

First, we can allow the compiler to generate "unsafe" code, i.e., allow the compiler to generate code in which the usual run-time type checks have been disabled. We do this by using the compiler’s "optimize level 3" when compiling the program and library files. This can be accomplished by setting the parameter optimize-level to 3 while compiling the library or program, e.g.:

(parameterize ([optimize-level 3]) (compile-program "filename"))

or in batch mode via the --optimize-level command-line option:

echo '(compile-program "filename")' | scheme -q --optimize-level 3

It may also be useful to experiment with some of the other compiler control parameters and also with the storage manager’s run-time operation. The compiler-control parameters, including optimize-level, are described in Section 12.6, and the storage manager control parameters are described in Section 13.1.

Finally, it is often useful to "profile" your code to determine that parts of the code that are executed most frequently. While this will not help the system optimize your code, it can help you identify "hot spots" where you need to concentrate your own hand-optimization efforts. In these hot spots, consider using more efficient operators, like fixnum or flonum operators in place of generic arithmetic operators, and using explicit loops rather than nested combinations of linear list-processing operators like append, reverse, and map. These operators can make code more readable when used judiciously, but they can slow down time-critical code.

Section 12.7 describes how to use the compiler’s support for automatic profiling. Be sure that profiling is not enabled when you compile your production code, since the code introduced into the generated code to perform the profiling adds significant run-time overhead.

Section 2.7. Customization

Chez Scheme and Petite Chez Scheme are built from several subsystems: a "kernel" encapsulated in a static or shared library (dynamic link library) that contains operating-system interface and low-level storage management code, an executable that parses command-line arguments and calls into the kernel to initialize and run the system, a base boot file (petite.boot) that contains the bulk of the run-time library code, and an additional boot file (scheme.boot), for Chez Scheme only, that contains the compiler.

While the kernel and base boot file are essential to the operation of all programs, the executable may be replaced or even eliminated, and the compiler boot file need be loaded only if the compiler is actually used. In fact, the compiler is typically not loaded for distributed applications unless the application creates and executes code at run time.

The kernel exports a set of entry points that are used to initialize the Scheme system, load boot or heap files, run an interactive Scheme session, run script files, and deinitialize the system. In the threaded versions of the system, the kernel also exports entry points for activating, deactivating, and destroying threads. These entry points may be used to create your own executable image that has different (or no) command-line options or to run Scheme as a subordinate program within another program, i.e., for use as an extension language.

These entry points are described in Section 4.8, along with other entry points for accessing and modifying Scheme data structures and calling Scheme procedures.

The file main.c in the 'c' subdirectory contains the "main" routine for the distributed executable image; look at this file to gain an understanding of how the system startup entry points are used.

Section 2.8. Building and Distributing Applications

Although useful as a stand-alone Scheme system, Petite Chez Scheme was conceived as a run-time system for compiled Chez Scheme applications. The remainder of this section describes how to create and distribute such applications using Petite Chez Scheme. It begins with a discussion of the characteristics of Petite Chez Scheme and how it compares with Chez Scheme, then describes how to prepare application source code, how to build and run applications, and how to distribute them.

Petite Chez Scheme Characteristics. Although interpreter-based, Petite Chez Scheme evaluates Scheme source code faster than might be expected. Some of the reasons for this are listed below.

The run-time system is fully compiled, so library implementations of primitives ranging from + and car to sort and printf are just as efficient as in Chez Scheme, although they cannot be open-coded as in code compiled by Chez Scheme.
The interpreter is itself a compiled Scheme application. Because it is written in Scheme, it directly benefits from various characteristics of Scheme that would have to be dealt with explicitly and with additional overhead in most other languages, including proper treatment of tail calls, first-class procedures, automatic storage management, and continuations.
The interpreter employs a preprocessor that converts the code into a form that can be interpreted efficiently. In fact, the preprocessor shares its front end with the compiler, and this front end performs a variety of source-level optimizations.

Nevertheless, compiled code is still more efficient for most applications. The difference between the speed of interpreted and compiled code varies significantly from one application to another, but often amounts to a factor of five and sometimes to a factor of ten or more.

Several additional limitations result from the fact that Petite Chez Scheme does not include the compiler:

The compiler must be present to process foreign-procedure and foreign-callable expressions, even when these forms are evaluated by the interpreter. These forms cannot be processed by the interpreter alone, so they cannot appear in source code to be processed by Petite Chez Scheme. Compiled versions of foreign-procedure and foreign-callable forms may, however, be included in compiled code loaded into Petite Chez Scheme.
Inspector information is attached to code objects, which are generated only by the compiler, so source information and variable names are not available for interpreted procedures or continuations into interpreted procedures. This makes the inspector less effective for debugging interpreted code than it is for debugging compiled code.
Procedure names are also attached to code objects, so while the compiler associates a name with each procedure when an appropriate name can be determined, the interpreter does not do so. This mostly impacts the quality of error messages, e.g., an error message might read " incorrect number of arguments to #<procedure> " rather than the likely more useful " incorrect number of arguments to #<procedure name> ".
The compiler detects, at compile time, some potential errors that the interpreter does not detect and reports them via compile-time warnings that identify the expression or the location in the source file, if any, where the expression appears.
Automatic profiling cannot be enabled for interpreted code as it is for compiled code when compile-profile is set to #t.

Except as noted above, Petite Chez Scheme does not restrict what programs can do, and like Chez Scheme, it places essentially no limits on the size of programs or the memory images they create, beyond the inherent limitations of the underlying hardware or operating system.

Compiled scripts and programs. One simple mechanism for distributing an application is to structure it as a script or RNRS top-level program, use compile-script or compile-program, as appropriate to compile it as described in Section 2.5, and distribute the resulting object file along with a complete distribution of Petite Chez Scheme. When this mechanism is used on Unix-based systems, if the source file begins with #! and the path that follows is the path to the Chez Scheme executable, e.g., /usr/bin/scheme, the one at the front of the object file should be replaced with the path to the Petite Chez Scheme executable, e.g., /usr/bin/petite. The path may have to be adjusted by the application’s installation program based on where Petite Chez Scheme is installed on the target system. When used under Windows, the application’s installation program should set up an appropriate shortcut that starts Petite Chez Scheme with the --script or --program option, as appropriate, followed by the path to the object file.

The remainder of this section describes how to distribute applications that do not require Petite Chez Scheme to be installed as a stand-alone system on the target machine.

Preparing Application Code. While it is possible to distribute applications in source-code form, i.e., as a set of Scheme source files to be loaded into Petite Chez Scheme by the end user, distributing compiled code has two major advantages over distributing source code. First, compiled code is usually much more efficient, as discussed in the preceding section, and second, compiled code is in binary form and thus provides more protection for proprietary application code.

Application source code generally consists of a set of Scheme source files possibly augmented by foreign code developed specifically for the application and packaged in shared libraries (also known as shared objects or, on Windows, dynamic link libraries). The following assumes that any shared-library source code has been converted into object form; how to do this varies by platform. (Some hints are given in Section 4.6.) The result is a set of one or more shared libraries that are loaded explicitly by the Scheme source code during program initialization.

Once the shared libraries have been created, the next step is to compile the Scheme source files into a set of Scheme object files. Doing so typically involves simply invoking compile-file, compile-library, or compile-program, as appropriate, on each source file to produce the corresponding object file. This may be done within a build script or "make" file via a command line such as the following:

echo '(compile-file "filename")' | scheme

which produces the object file filename.so from the source file filename.ss.

If the application code has been developed interactively or is usually loaded directly from source, it may be necessary to make some adjustments to a file to be compiled if the file contains expressions or definitions that affect the compilation of subsequent forms in the file. This can be accomplished via eval-when (Section 12.4). This is not typically necessary or desirable if the application consists of a set of RNRS libraries and programs.

You may also wish to disable generation of inspector information both to reduce the size of the compiled application code and to prevent others from having access to the expanded source code that is retained as part of the inspector information. To do so, set the parameter generate-inspector-information to #f while compiling each file The downside of disabling inspector information is that the information will not be present if you need to debug your application, so it is usually desirable to disable inspector information only for production builds of your application. An alternative is to compile the code with inspector information enabled and strip out the debugging information later with strip-fasl-file.

The Scheme startup procedure determines what the system does when it is started. The default startup procedure loads the files listed on the command line (via load) and starts up a new café, like this.

(lambda fns (for-each load fns) (new-cafe))

The startup procedure may be changed via the parameter scheme-start. The following example demonstrates the installation of a variant of the default startup procedure that prints the name of each file before loading it.

(scheme-start
  (lambda fns
    (for-each
      (lambda (fn)
        (printf "loading ~a ..." fn)
        (load fn)
        (printf "~%"))
      fns)
    (new-cafe)))

A typical application startup procedure would first invoke the application’s initialization procedure(s) and then start the application itself:

(scheme-start
  (lambda fns
    (initialize-application)
    (start-application fns)))

Any shared libraries that must be present during the running of an application must be loaded during initialization. In addition, all foreign procedure expressions must be executed after the shared libraries are loaded so that the addresses of foreign routines are available to be recorded with the resulting foreign procedures. The following demonstrates one way in which initialization might be accomplished for an application that links to a foreign procedure show_state in the Windows shared library state.dll:

(define show-state)

(define app-init
  (lambda ()
    (load-shared-object "state.dll")
    (set! show-state
      (foreign-procedure "show_state" (integer-32)
        integer-32))))

(scheme-start
  (lambda fns
    (app-init)
    (app-run fns)))

Building and Running the Application. Building and running an application is straightforward once all shared libraries have been built and Scheme source files have been compiled to object code.

Although not strictly necessary, we suggest that you concatenate your object files, if you have more than one, into a single object file via the concatenate-object-files procedure. Placing all of the object code into a single file simplifies both building and distribution of applications.

For top-level programs with separate libraries, compile-whole-program can be used to produce a single, fully optimized object file. Otherwise, when concatenating object files, put each library after the libraries it depends upon, with the program last.

With the Scheme object code contained within a single composite object file, it is possible to run the application simply by loading the composite object file into Petite Chez Scheme, e.g.:

petite app.so

where app.so is the name of the composite object file, and invoking the startup procedure to restart the system:

> ((scheme-start))

The point of setting scheme-start, however, is to allow the set of object files to be converted into a boot file. Boot files are loaded during the process of building the initial heap. Because of this, boot files have the following advantages over ordinary object files.

Any code and data structures contained in the boot file or created while it is loaded is automatically compacted along with the base run-time library code and made static. Static code and data are never collected by the storage manager, so garbage collection overhead is reduced. (It is also possible to make code and data static explicitly at any time via the collect procedure.)
The system looks for boot files automatically in a set of standard directories based on the name of the executable image, so you can install a copy of the Petite Chez Scheme executable image under your application’s name and spare your users from supplying any command-line arguments or running a separate script to load the application code.

When an application is packaged into a boot file, the source code that is compiled and converted into a boot file should set scheme-start to a procedure that starts the application, as shown in the example above. The application should not be started directly from the boot file, because boot files are loaded before final initialization of the Scheme system. The value of scheme-start is invoked automatically after final initialization.

A boot file is simply an object file containing the code for one or more source files, prefixed by a boot header. The boot header identifies a base boot file upon which the application directly depends, or possibly two or more alternatives upon which the application can be run. In most cases, petite.boot will be identified as the base boot file, but in a layered application it may be another boot file of your creation that in turn depends upon petite.boot. The base boot file, and its base boot file, if any, are loaded automatically when your application boot file is loaded.

Boot files are created with make-boot-file. This procedure accepts two or more arguments. The first is a string naming the file into which the boot header and object code should be placed, the second is a list of strings naming base boot files, and the remainder are strings naming input files. For example, the call:

(make-boot-file "app.boot" '("petite") "app1.so" "app2.ss" "app3.so")

creates the boot file app.boot that identifies a dependency upon petite.boot and contains the object code for app1.so, the object code resulting from compiling app2.ss, and the object code for app3.so. The call:

(make-boot-file "app.boot" '("scheme" "petite") "app.so")

creates a header file that identifies a dependency upon either scheme.boot or petite.boot, with the object code from app.so. In the former case, the system will automatically load petite.boot when the application boot file is loaded, and in the latter it will load scheme.boot if it can find it, otherwise petite.boot. This would allow your application to run on top of the full Chez Scheme if present, otherwise Petite Chez Scheme.

In most cases, you can construct your application so it does not depend upon features of scheme.boot (specifically, the compiler) by specifying only "petite" in the call to make-boot-file. If your application calls eval, however, and you wish to allow users to be able to take advantage of the faster execution speed of compiled code, then specifying both "scheme" and "petite" is appropriate.

Here is how we might create and run a simple "echo" application from a Linux shell:

echo '(suppress-greeting #t)' > myecho.ss
echo '(scheme-start (lambda fns (printf "~{~a~^ ~}\n" fns)))' >> myecho.ss
echo '(compile-file "myecho.ss") \
      (make-boot-file "myecho.boot" (quote ("petite")) "myecho.so")' \
     | scheme -q
scheme -b myecho.boot hello world

If we take the extra step of installing a copy of the Petite Chez Scheme executable as myecho and copying myecho.boot into the same directory as petite.boot (or set SCHEMEHEAPDIRS to include the directory containing myecho.boot), we can simply invoke myecho to run our echo application:

myecho hello world

Distributing the Application. Distributing an application involves can be as simple as creating a distribution package that includes the following items:

the Petite Chez Scheme distribution,
the application boot file,
any application-specific shared libraries,
an application installation script.

The application installation script should install Petite Chez Scheme if not already installed on the target system. It should install the application boot file in the same directory as the Petite Chez Scheme boot file petite.boot is installed, and it should install the application shared libraries, if any, either in the same location or in a standard location for shared libraries on the target system. It should also create a link to or copy of the Petite Chez Scheme executable under the name of your application, i.e., the name given to your application boot file. Where appropriate, it should also install desktop and start-menu shortcuts to run the executable.

Section 2.9. Command-Line Options

Chez Scheme recognizes the following command-line options.

-q, --quiet

suppress greeting and prompt

--script path

run as shell script

--program path

run rnrs top-level program as shell script

--libdirs dir:...

set library directories

--libexts ext:...

set library extensions

--compile-imported-libraries

compile libraries before loading

--import-notify

enable import search messages

--optimize-level 0 | 1 | 2 | 3

set initial optimize level

--debug-on-exception

on uncaught exception, call debug

--eedisable

disable expression editor

--eehistory off | path

expression-editor history file

--enable-object-counts

have collector maintain object counts

--retain-static-relocation

keep reloc info for compute-size, etc.

-b path, --boot path

load boot file

--verbose

trace boot-file search process

--version

print version and exit

--help

print help and exit

--

pass through remaining args

The following options are recognized but cause the system to print an error message and exit because saved heaps are no longer supported.

-h path, --heap path

load heap file

-s[n] path, --saveheap[n] path

save heap file

-c, --compact

toggle compaction flag

With the default scheme-start procedure (Section 2.8), any remaining command-line arguments are treated as the names of files to be loaded before Chez Scheme begins interacting with the user, unless the --script or --program is present, in which case the remaining arguments are made available to the script via the command-line parameter (Section 2.1).

Most of the options are described elsewhere in this chapter, and a few are self-explanatory. The remainder pertain to the loading of boot files at system start-up time and are described below.

When Chez Scheme is run, it looks for one or more boot files to load. Boot files contain the compiled Scheme code that implements most of the Scheme system, including the interpreter, compiler, and most libraries. Boot files may be specified explicitly on the command line via -b options or implicitly. In the simplest case, no -b options are given and the necessary boot files are loaded automatically based on the name of the executable.

For example, if the executable name is "frob", the system looks for "frob.boot" in a set of standard directories. It also looks for and loads any subordinate boot files required by "frob.boot".

Subordinate boot files are also loaded automatically for the first boot file explicitly specified via the command line. Each boot file must be listed before those that depend upon it.

The --verbose option may be used to trace the file searching process and must appear before any boot arguments for which search tracing is desired.

Ordinarily, the search for boot files is limited to a set of installation directories, but this may be overridden by setting the environment variable SCHEMEHEAPDIRS. SCHEMEHEAPDIRS should be a colon-separated list of directories, listed in the order in which they should be searched. Within each directory, the two-character escape sequence %v is replaced by the current version, and the two-character escape sequence %m is replaced by the machine type. A percent followed by any other character is replaced by the second character; in particular, %% is replaced by %, and %: is replaced by :. If SCHEMEHEAPDIRS ends in a non-escaped colon, the default directories are searched after those in SCHEMEHEAPDIRS; otherwise, only those listed in SCHEMEHEAPDIRS are searched.

Under Windows, semi-colons are used in place of colons, and one additional escape is recognized: %x, which is replaced by the directory in which the executable file resides. The default search path under Windows consists of %x and %x\..\..\boot\%m. The registry key HeapSearchPath in HKLM\SOFTWARE\Chez Scheme\csvversion, where version is the Chez Scheme version number, e.g., 7.9.4, can be set to override the default search path, and the SCHEMEHEAPDIRS environment variable overrides both the default and the registry setting, if any.

Boot files consist of ordinary compiled code and consist of a boot header and the compiled code for one or more source files. See Section 2.8 for instructions on how to create boot files.

Chapter 3. Debugging

Chez Scheme has several features that support debugging. In addition to providing error messages when fully type-checked code is run, Chez Scheme also permits tracing of procedure calls, interruption of any computation, redefinition of exception and interrupt handlers, and inspection of any object, including the continuations of exceptions and interrupts.

Programmers new to Scheme or Chez Scheme, and even more experienced Scheme programmers, might want to consult the tutorial "How to Debug Chez Scheme Programs." HTML and PDF versions are available at http://www.cs.indiana.edu/chezscheme/debug/.

Section 3.1. Tracing

Tracing is one of the most useful mechanisms for debugging Scheme programs. Chez Scheme permits any primitive or user-defined procedure to be traced. The trace package prints the arguments and return values for each traced procedure with a compact indentation mechanism that shows the nesting depth of calls. The distinction between tail calls and nontail calls is reflected properly by an increase in indentation for nontail calls only. For nesting depths of 10 or greater, a number in brackets is used in place of indentation to signify nesting depth.

This section covers the mechanisms for tracing procedures and controlling trace output.

syntax	`(trace-lambda name formals body₁ body₂ ...)`
returns	a traced procedure
libraries	`(chezscheme)`

A trace-lambda expression is equivalent to a lambda expression with the same formals and body except that trace information is printed to the trace output port whenever the procedure is invoked, using name to identify the procedure. The trace information shows the value of the arguments passed to the procedure and the values returned by the procedure, with indentation to show the nesting of calls.

The traced procedure half defined below returns the integer quotient of its argument and 2.

(define half
  (trace-lambda half (x)
    (cond
      [(zero? x) 0]
      [(odd? x) (half (- x 1))]
      [(even? x) (+ (half (- x 1)) 1)])))

A trace of the call (half 5), which returns 2, is shown below.

|(half 5)
|(half 4)
| (half 3)
| (half 2)
| |(half 1)
| |(half 0)
| |0
| 1
|2

This example highlights the proper treatment of tail and nontail calls by the trace package. Since half tail calls itself when its argument is odd, the call (half 4) appears at the same level of indentation as the call (half 5). Furthermore, since the return values of (half 5) and (half 4) are necessarily the same, only one return value is shown for both calls.

syntax	`(trace-case-lambda name clause ...)`
returns	a traced procedure
libraries	`(chezscheme)`

A trace-case-lambda expression is equivalent to a case-lambda expression with the same clauses except that trace information is printed to the trace output port whenever the procedure is invoked, using name to identify the procedure. The trace information shows the value of the arguments passed to the procedure and the values returned by the procedure, with indentation to show the nesting of calls.

syntax	`(trace-let name ((var expr) ...) body₁ body₂ ...)`
returns	the values of the body `body₁ body₂ ...`
libraries	`(chezscheme)`

A trace-let expression is equivalent to a named let expression with the same name, bindings, and body except that trace information is printed to the trace output port on entry or reentry (via invocation of the procedure bound to name) into the trace-let expression.

A trace-let expression of the form

(trace-let name ([var expr] ...)
  body₁ body₂ ...)

can be rewritten in terms of trace-lambda as follows:

((letrec ([name
           (trace-lambda name (var ...)
             body₁ body₂ ...)])
   name)
 expr ...)

trace-let may be used to trace ordinary let expressions as well as let expressions as long as the name inserted along with the trace-let keyword in place of let does not appear free within the body of the let expression. It is also sometimes useful to insert a trace-let expression into a program simply to display the value of an arbitrary expression at the current trace indentation. For example, a call to the following variant of half

(define half
  (trace-lambda half (x)
    (cond
      [(zero? x) 0]
      [(odd? x) (half (trace-let decr-value () (- x 1)))]
      [(even? x) (+ (half (- x 1)) 1)])))

with argument 5 results in the trace:

|(half 5)
| (decr-value)
| 4
|(half 4)
| (half 3)
| |(decr-value)
| |2
| (half 2)
| |(half 1)
| | (decr-value)
| | 0
| |(half 0)
| 1
|2

syntax	`(trace-do ((var init update) ...) (test result ...) expr ...)`
returns	the values of the last `result` expression
libraries	`(chezscheme)`

A trace-do expression is equivalent to a do expression with the same subforms, except that trace information is printed to the trace output port, showing the values of var ... and each iteration and the final value of the loop on termination. For example, the expression

(trace-do ([old '(a b c) (cdr old)]
           [new '() (cons (car old) new)])
  ((null? old) new))

produces the trace

|(do (a b c) ())
|(do (b c) (a))
|(do (c) (b a))
|(do () (c b a))
|(c b a)

and returns (c b a).

syntax	`(trace var₁ var₂ ...)`
returns	a list of `var₁ var₂ ...`
syntax	`(trace)`
returns	a list of all currently traced top-level variables
libraries	`(chezscheme)`

In the first form, trace reassigns the top-level values of var₁ var₂ ..., whose values must be procedures, to equivalent procedures that display trace information in the manner of trace-lambda.

trace works by encapsulating the old value of each var in a traced procedure. It could be defined approximately as follows. (The actual version records and returns information about traced variables.)

(define-syntax trace
  (syntax-rules ()
    [(_ var ...)
     (begin
       (set-top-level-value! 'var
         (let ([p (top-level-value 'var)])
           (trace-lambda var args (apply p args))))
       ...)]))

Tracing for a procedure traced in this manner may be disabled via untrace (see below), an assignment of the corresponding variable to a different, untraced value, or a subsequent use of trace for the same variable. Because the value is traced and not the binding, however, a traced value obtained before tracing is disabled and retained after tracing is disabled will remain traced.

trace without subexpressions evaluates to a list of all currently traced variables. A variable is currently traced if it has been traced and not subsequently untraced or assigned to a different value.

The following transcript demonstrates the use of trace in an interactive session.

> (define half
    (lambda (x)
      (cond
        [(zero? x) 0]
        [(odd? x) (half (- x 1))]
        [(even? x) (+ (half (- x 1)) 1)])))
> (half 5)
2
> (trace half)
(half)
> (half 5)
|(half 5)
|(half 4)
| (half 3)
| (half 2)
| |(half 1)
| |(half 0)
| |0
| 1
|2
2
> (define traced-half half)
> (untrace half)
(half)
> (half 2)
1
> (traced-half 2)
|(half 2)
|1
1

syntax	`(untrace var₁ var₂ ...)`
syntax	`(untrace)`
returns	a list of untraced variables
libraries	`(chezscheme)`

untrace restores the original (pre-trace) top-level values of each currently traced variable in var₁ var₂ ..., effectively disabling the tracing of the values of these variables. Any variable in var₁ var₂ ... that is not currently traced is ignored. If untrace is called without arguments, the values of all currently traced variables are restored.

The following transcript demonstrates the use of trace and untrace in an interactive session to debug an incorrect procedure definition.

> (define square-minus-one
    (lambda (x)
      (- (* x x) 2)))
> (square-minus-one 3)
7
> (trace square-minus-one * -)
(square-minus-one * -)
> (square-minus-one 3)
|(square-minus-one 3)
| (* 3 3)
| 9
|(- 9 2)
|7
7
> (define square-minus-one
    (lambda (x)
      (- (* x x) 1))) ; change the 2 to 1
> (trace)
(- )
> (square-minus-one 3)
|( 3 3)
|9
|(- 9 1)
|8
8
> (untrace square-minus-one)
()
> (untrace * -)
(- *)
> (square-minus-one 3)
8

The first call to square-minus-one indicates there is an error, the second (traced) call indicates the step at which the error occurs, the third call demonstrates that the fix works, and the fourth call demonstrates that untrace does not wipe out the fix.

thread parameter	`trace-output-port`
libraries	`(chezscheme)`

trace-output-port is a parameter that determines the output port to which tracing information is sent. When called with no arguments, trace-output-port returns the current trace output port. When called with one argument, which must be a textual output port, trace-output-port changes the value of the current trace output port.

thread parameter	`trace-print`
libraries	`(chezscheme)`

The value of trace-print must be a procedure of two arguments, an object and an output port. The trace package uses the value of trace-print to print the arguments and return values for each call to a traced procedure. trace-print is set to pretty-print by default.

The trace package sets pretty-initial-indent to an appropriate value for the current nesting level before calling the value of trace-print so that multiline output can be indented properly.

syntax	`(trace-define var expr)`
syntax	`(trace-define (var . idspec) body₁ body₂ ...)`
returns	unspecified
libraries	`(chezscheme)`

trace-define is a convenient shorthand for defining variables bound to traced procedures of the same name. The first form is equivalent to

(define var
  (let ([x expr])
    (trace-lambda var args
      (apply x args))))

and the second is equivalent to

(define var
  (trace-lambda var idspec
    body₁ body₂ ...))

In the former case, expr must evaluate to a procedure.

> (let ()
    (trace-define plus
      (lambda (x y)
        (+ x y)))
    (list (plus 3 4) (+ 5 6)))
|(plus 3 4)
|7
(7 11)

syntax	`(trace-define-syntax keyword expr)`
returns	unspecified
libraries	`(chezscheme)`

trace-define-syntax traces the input and output to the transformer value of expr, stripped of the contextual information used by the expander to maintain lexical scoping.

> (trace-define-syntax let*
    (syntax-rules ()
      [(_ () b1 b2 ...)
       (let () b1 b2 ...)]
      [(_ ((x e) m ...) b1 b2 ...)
       (let ((x e))
         (let* (m ...) b1 b2 ...))]))
> (let* ([x 3] [y (+ x x)]) (list x y))
|(let* (let* [(x 3) (y (+ x x))] [list x y]))
|(let ([x 3]) (let* ([y (+ x x)]) (list x y)))
|(let* (let* [(y (+ x x))] [list x y]))
|(let ([y (+ x x)]) (let* () (list x y)))
|(let* (let* () [list x y]))
|(let () (list x y))
(3 6)

Without contextual information, the displayed forms are more readable but less precise, since different identifiers with the same name are indistinguishable, as shown in the example below.

> (let ([x 0])
    (trace-define-syntax a
      (syntax-rules ()
        [(_ y) (eq? x y)]))
    (let ([x 1])
      (a x)))
|(a (a x))
|(eq? x x)
#f

Section 3.2. The Interactive Debugger

The interactive debugger is entered as a result of a call to the procedure debug after an exception is handled by the default exception handler. It can also be entered directly from the default exception handler, for serious or non-warning conditions, if the parameter debug-on-exception is true.

Within the debugger, the command "?" lists the debugger command options. These include commands to:

inspect the raise continuation,
display the condition,
inspect the condition, and
exit the debugger.

The raise continuation is the continuation encapsulated within the condition, if any. The standard exception reporting procedures and forms assert, assertion-violation, and error as well as the Chez Scheme procedures assertion-violationf, errorf, and syntax-error all raise exceptions with conditions that encapsulate the continuations of their calls, allowing the programmer to inspect the frames of pending calls at the point of a violation, error, or failed assertion.

A variant of the interactive debugger, the break handler, is entered as the result of a keyboard interrupt handled by the default keyboard-interrupt handler or an explicit call to the procedure break handled by the default break handler. Again, the command "?" lists the command options. These include commands to:

exit the break handler and continue,
reset to the current café,
abort the entire Scheme session,
enter a new café,
inspect the current continuation, and
display program statistics (run time and memory usage).

It is also usually possible to exit from the debugger or break handler by typing the end-of-file character ("control-D" under Unix, "control-Z" under Windows).

procedure	`(debug)`
returns	does not return
libraries	`(chezscheme)`

When the default exception handler receives a serious or non-warning condition, it displays the condition and resets to the current café. Before it resets, it saves the condition in the parameter debug-condition. The debug procedure may be used to inspect the condition. Whenever one of the built-in error-reporting mechanisms is used to raise an exception, the continuation at the point where the exception was raised can be inspected as well. More generally, debug allows the continuation contained within any continuation condition created by make-continuation-condition to be inspected.

If the parameter debug-on-exception is set to #t, the default exception handler enters the debugger directly for all serious and non-warning conditions, delaying its reset until after the debugger exits. The --debug-on-exception command-line option may be used to set debug-on-exception to #t from the command line, which is particularly useful when debugging scripts or top-level programs run via the --script or --program command-line options.

Section 3.3. The Interactive Inspector

The inspector may be called directly via the procedure inspect or indirectly from the debugger. It allows the programmer to examine circular objects, objects such as ports and procedures that do not have a reader syntax, and objects such as continuations and variables that are not directly accessible by the programmer, as well as ordinary printable Scheme objects.

The primary intent of the inspector is examination, not alteration, of objects. The values of assignable variables may be changed from within the inspector, however. Assignable variables are generally limited to those for which assignments occur in the source program. It is also possible to invoke arbitrary procedures (including mutation procedures such as set-car!) on an object. No mechanism is provided for altering objects that are inherently immutable, e.g., nonassignable variables, procedures, and bignums, since doing so can violate assumptions made by the compiler and run-time system.

The user is presented with a prompt line that includes a printed representation of the current object, abbreviated if necessary to fit on the line. Various commands are provided for displaying objects and moving around inside of objects. On-line descriptions of the command options are provided. The command "?" displays commands that apply specifically to the current object. The command "??" displays commands that are always applicable. The command "h" provides a brief description of how to use the inspector. The end-of-file character or the command "q" exits the inspector.

procedure	`(inspect obj)`
returns	unspecified
libraries	`(chezscheme)`

Invokes the inspector on obj, as described above. The commands recognized by the inspector are listed below, categorized by the type of the current object.

Generally applicable commands

help or h displays a brief description of how to use the inspector.

? displays commands applicable to the current type of object.

?? displays the generally applicable commands.

print or p prints the current object (using pretty-print).

write or w writes the current object (using write).

size writes the size in bytes occupied by the current object (determined via compute-size), including any objects accessible from the current object except those for which the size was previously requested during the same interactive inspector session.

find expr [ g ] evaluates expr, which should evaluate to a procedure of one argument, and searches (via make-object-finder) for the first occurrence of an object within the current object for which the predicate returns a true value, treating immediate values (e.g., fixnums), values in generations older than g, and values already visited during the search as leaves. If g is not unspecified, it defaults to the current maximum generation, i.e., the value of collect-maximum-generation. If specified, g must be an exact nonnegative integer less than or equal to the current maximum generation or the symbol static representing the static generation. If such an object is found, the inspector’s focus moves to that object as if through a series of steps that lead from the current object to the located object, so that the up command can be used to determine where the object was found relative to the original object.

find-next repeats the last find, locating an occurrence not previously found, if any.

up or u n returns to the nth previous level. Used to move outwards in the structure of the inspected object. n defaults to 1.

top or t returns to the outermost level of the inspected object.

forward or f moves to the nth next expression. Used to move from one element to another of an object containing a sequence of elements, such as a list, vector, record, frame, or closure. n defaults to 1.

back or b moves to the nth previous expression. Used to move from one element to another of an object containing a sequence of elements, such as a list, vector, record, frame, or closure. n defaults to 1.

=> expr sends the current object to the procedure value of expr. expr may begin on the current or following line and may span multiple lines.

file path opens the source file at the specified path for listing. The parameter source-directories (Section 12.5) determines the set of directories searched for source files.

list line count lists count lines of the current source file (see file) starting at line. line defaults to the end of the previous set of lines listed and count defaults to ten or the number of lines previously listed. If line is negative, listing begins line lines before the previous set of lines listed.

files shows the currently open source files.

mark or m m marks the current location with the symbolic mark m. If m is not specified, the current location is marked with a unique default mark.

goto or g m returns to the location marked m. If m is not specified, the inspector returns to the location marked with the default mark.

new-cafe or n enters a new read-eval-print loop (café), giving access to the normal top-level environment.

quit or q exits from the inspector.

reset or r resets to the current café.

abort or a x aborts from Scheme with exit status x, which defaults to -1.

Continuation commands

show-frames or sf shows the next n frames. If n is not specified, all frames are displayed.

depth displays the number of frames in the continuation.

down or d n move to the nth frame down in the continuation. n defaults to 1.

show or s shows the continuation (next frame) and, if available, the calling procedure source, the pending call source, the closure, and the frame and free-variable values. Source is available only if generation of inspector information was enabled during compilation of the corresponding lambda expression.

show-local or sl is like show or s except that free variable values are not shown. (If present, free variable values can be found by inspecting the closure.)

length or l displays the number of elements in the topmost frame of the continuation.

ref or r moves to the nth or named frame element. n defaults to 0. If multiple elements have the same name, only one is accessible by name, and the others must be accessed by number.

code or c moves to the source for the calling procedure.

call moves to the source for the pending call.

file opens the source file containing the pending call, if known. The parameter source-directories (Section 12.5) determines the list of source directories searched for source files identified by relative path names.

For absolute pathnames starting with a / (or \ or a directory specifier under Windows), the inspector tries the absolute pathname first, then looks for the last (filename) component of the path in the list of source directories. For pathnames starting with ./ (or .\ under Windows) or ../ (or ..\ under Windows), the inspector looks in "." or ".." first, as appropriate, then for the entire .- or ..-prefixed pathname in the source directories, then for the last (filename) component in the source directories. For other (relative) pathnames, the inspector looks for the entire relative pathname in the list of source directories, then the last (filename) component in the list of source directories.

If a file by the same name as but different contents from the original source file is found during this process, it will be skipped over. This typically happens because the file has been modified since it was compiled. Pass an explicit filename argument to force opening of a particular file (see the generally applicable commands above).

eval or e expr evaluates the expression expr in an environment containing bindings for the elements of the frame. Within the evaluated expression, the value of each frame element n is accessible via the variable %n. Named elements are accessible via their names as well. Names are available only if generation of inspector information was enabled during compilation of the corresponding lambda expression.

set! or ! n e sets the value of the nth frame element to e, if the frame element corresponds to an assignable variable. n defaults to 0.

Procedure commands

show or s shows the source and free variables of the procedure. Source is available only if generation of inspector information was enabled during compilation of the corresponding lambda expression.

code or c moves to the source for the procedure.

file opens the file containing the procedure’s source code, if known. See the description of the continuation file entry above for more information.

length or l displays the number of free variables whose values are recorded in the procedure object.

ref or r moves to the nth or named free variable. n defaults to 0. If multiple free variables have the same name, only one is accessible by name, and the others must be accessed by number.

set! or ! n e sets the value of the nth free variable to e, if the variable is assignable. n defaults to 0.

eval or e expr evaluates the expression expr in an environment containing bindings for the free variables of the procedure. Within the evaluated expression, the value of each free variable n is accessible via the variable %n. Named free variables are accessible via their names as well. Names are available only if generation of inspector information was enabled during compilation of the corresponding lambda expression.

Pair (list) commands

show or s n shows the first n elements of the list. If n is not specified, all elements are displayed.

length or l displays the list length.

car moves to the object in the car of the current object.

cdr moves to the object in the cdr.

ref or r n moves to the nth element of the list. n defaults to 0.

tail n moves to the nth cdr of the list. n defaults to 1.

Vector, Bytevector, and Fxvector commands

show or s n shows the first n elements of the vector. If n is not specified, all elements are displayed.

length or l displays the vector length.

ref or r n moves to the nth element of the vector. n defaults to 0.

String commands

show or s n shows the first n elements of the string. If n is not specified, all elements are displayed.

length or l displays the string length.

ref or r n moves to the nth element of the string. n defaults to 0.

unicode n displays the first n elements of the string as hexadecimal Unicode scalar values.

ascii n displays the first n elements of the string as hexadecimal ASCII values, using -- to denote characters whose Unicode scalar values are not in the ASCII range.

Symbol commands

show or s shows the fields of the symbol.

value or v moves to the top-level value of the symbol.

name or n moves to the name of the symbol.

property-list or pl moves to the property list of the symbol.

ref or r n moves to the nth field of the symbol. Field 0 is the top-level value of the symbol, field 1 is the symbol’s name, and field 2 is its property list. n defaults to 0.

Character commands

unicode displays the hexadecimal Unicode scalar value for the character.

ascii displays the hexadecimal ASCII code for the character, using -- to denote characters whose Unicode scalar values are not in the ASCII range.

Box commands

show or s shows the contents of the box.

unbox or ref or r moves to the boxed object.

Port commands

show or s shows the fields of the port, including the input and output size, index, and buffer fields.

name moves to the port’s name.

handler moves to the port’s handler.

output-buffer or ob moves to the port’s output buffer.

input-buffer or ib moves to the port’s input buffer.

Record commands

show or s shows the contents of the record.

fields moves to the list of field names of the record.

name moves to the name of the record.

rtd moves to the record-type descriptor of the record.

ref or r name moves to the named field of the record, if accessible.

set! or ! name value sets the value of the named field of the record, if mutable.

Transport Link Cell (TLC) commands

show or s shows the fields of the TLC.

keyval moves to the keyval of the TLC.

tconc moves to the tconc of the TLC.

next moves to the next link of the TLC.

ref or r n moves to the nth field of the symbol. Field 0 is the keyval, field 1 the tconc, and field 2 the next link. n defaults to 0.

Section 3.4. The Object Inspector

A facility for noninteractive inspection is also provided to allow construction of different inspection interfaces. Like the interactive facility, it allows objects to be examined in ways not ordinarily possible. The noninteractive system follows a simple, object-oriented protocol. Ordinary Scheme objects are encapsulated in procedures, or inspector objects, that take symbolic messages and return either information about the encapsulated object or new inspector objects that encapsulate pieces of the object.

procedure	`(inspect/object object)`
returns	an inspector object procedure
libraries	`(chezscheme)`

inspect/object is used to turn an ordinary Scheme object into an inspector object. All inspector objects accept the messages type, print, write, and size. The type message returns a symbolic representation of the type of the object. The print and write messages must be accompanied by a port parameter. They cause a representation of the object to be written to the port, using the Scheme procedures pretty-print and write. The size message returns a fixnum representing the size in bytes occupied by the object, including any objects accessible from the current object except those for which the size was already requested via an inspector object derived from the argument of the same inspect/object call.

All inspector objects except for variable inspector objects accept the message value, which returns the actual object encapsulated in the inspector object.

(define x (inspect/object '(1 2 3)))
(x 'type) ⇒ pair
(define p (open-output-string))
(x 'write p)
(get-output-string p) ⇒ "(1 2 3)"
(x 'length) ⇒ (proper 3)
(define y (x 'car))
(y 'type) ⇒ simple
(y 'value) ⇒ 1

Pair inspector objects. Pair inspector objects contain Scheme pairs.

(pair-object 'type) returns the symbol pair.

(pair-object 'car) returns an inspector object containing the "car" field of the pair.

(pair-object 'cdr) returns an inspector object containing the "cdr" field of the pair.

(pair-object 'length) returns a list of the form (type count). The type field contains the symbol proper, the symbol improper, or the symbol circular, depending on the structure of the list. The count field contains the number of distinct pairs in the list.

Box inspector objects. Box inspector objects contain Chez Scheme boxes.

(box-object 'type) returns the symbol box.

(box-object 'unbox) returns an inspector object containing the contents of the box.

TLC inspector objects. Box inspector objects contain Chez Scheme boxes.

(tlc-object 'type) returns the symbol tlc.

(tlc-object 'keyval) returns an inspector object containing the TLC’s keyval.

(tlc-object 'tconc) returns an inspector object containing the TLC’s tconc.

(tlc-object 'next) returns an inspector object containing the TLC’s next link.

Vector, String, Bytevector, and Fxvector inspector objects. Vector (bytevector, string, fxvector) inspector objects contain Scheme vectors (bytevectors, strings, fxvectors).

(vector-object 'type) returns the symbol vector (string, bytevector, fxvector).

(vector-object 'length) returns the number of elements in the vector or string.

(vector-object 'ref n) returns an inspector object containing the nth element of the vector or string.

Simple inspector objects. Simple inspector objects contain unstructured, unmodifiable objects. These include numbers, booleans, the empty list, the end-of-file object, and the void object. They may be examined directly by asking for the value of the object.

(simple-object 'type) returns the symbol simple.

Unbound inspector objects. Although unbound objects are not normally accessible to Scheme programs, they may be encountered when inspecting variables.

(unbound-object 'type) returns the symbol unbound.

Procedure inspector objects. Procedure inspector objects contain Scheme procedures.

(procedure-object 'type) returns the symbol procedure.

(procedure-object 'length) returns the number of free variables.

(procedure-object 'ref n) returns an inspector object containing the nth free variable of the procedure. See the description below of variable inspector objects. n must be nonnegative and less than the length of the procedure.

(procedure-object 'eval expr) evaluates expr and returns its value. The values of the procedure’s free variables are bound within the evaluated expression to identifiers of the form %n, where n is the location number displayed by the inspector. The values of named variables are also bound to their names.

(procedure-object 'code) returns an inspector object containing the procedure’s code object. See the description below of code inspector objects.

Continuation inspector objects. Continuations created by call/cc are actually procedures. However, when inspecting such a procedure the underlying data structure that embodies the continuation may be exposed. A continuation structure contains the location at which computation is to resume, the variable values necessary to perform the computation, and a link to the next continuation.

(continuation-object 'type) returns the symbol continuation.

(continuation-object 'length) returns the number of free variables.

(continuation-object 'ref n) returns an inspector object containing the nth free variable of the continuation. See the description below of variable inspector objects. n must be nonnegative and less than the length of the continuation.

(continuation-object 'eval expr) evaluates expr and returns its value. The values of frame locations are bound within the evaluated expression to identifiers of the form %n, where n is the location number displayed by the inspector. The values of named locations are also bound to their names.

(continuation-object 'code) returns an inspector object containing the code object for the procedure that was active when the current continuation frame was created. See the description below of code inspector objects.

(continuation-object 'depth) returns the number of frames in the continuation.

(continuation-object 'link) returns an inspector object containing the next continuation frame. The depth must be greater than 1.

(continuation-object 'link* n) returns an inspector object containing the nth continuation link. n must be less than the depth.

(continuation-object 'source) returns an inspector object containing the source information attached to the continuation (representing the source for the application that resulted in the formation of the continuation) or #f if no source information is attached.

(continuation-object 'source-object) returns an inspector object containing the source object for the procedure application that resulted in the formation of the continuation or #f if no source object is attached.

(continuation-object 'source-path) attempts to find the pathname of the file containing the source for the procedure application that resulted in the formation of the continuation. If successful, three values are returned to identify the file and position of the application within the file: path, line, and char. Two values, a file name and an absolute character position, are returned if the file name is known but the named file cannot be found. The search may be unsuccessful even if a file by the expected name is found in the path if the file has been modified since the source code was compiled. If no file name is known, no values are returned. The parameter source-directories (Section 12.5) determines the set of directories searched for source files identified by relative path names.

Code inspector objects. Code inspector objects contain Chez Scheme code objects.

(code-object 'type) returns the symbol code.

(code-object 'name) returns a string or #f. The name associated with a code inspector object is the name of the variable to which the procedure was originally bound or assigned. Since the binding of a variable can be changed, this name association may not always be accurate. #f is returned if the inspector cannot determine a name for the procedure.

(code-object 'source) returns an inspector object containing the source information attached to the code object or #f if no source information is attached.

(continuation-object 'source-object) returns an inspector object containing the source object for the code object or #f if no source object is attached.

(code-object 'source-path) attempts to find the pathname of the file containing the source for the lambda expression that produced the code object. If successful, three values are returned to identify the file and position of the application within the file: path, line, and char. Two values, a file name and an absolute character position, are returned if the file name is known but the named file cannot be found. The search may be unsuccessful even if a file by the expected name is found in the path if the file has been modified since the source code was compiled. If no file name is known, no values are returned. The parameter source-directories (Section 12.5) determines the set of directories searched for source files identified by relative path names.

(code-object 'free-count) returns the number of free variables in any procedure for which this is the corresponding code.

Variable inspector objects. Variable inspector objects encapsulate variable bindings. Although the actual underlying representation varies, the variable inspector object provides a uniform interface.

(variable-object 'type) returns the symbol variable.

(variable-object 'name) returns a symbol or #f. #f is returned if the name is not available or if the variable is a compiler-generated temporary variable. Variable names are not retained when the parameter generate-inspector-information (Section 12.6) is false during compilation.

(variable-object 'ref) returns an inspector object containing the current value of the variable.

(variable-object 'set! e) returns unspecified, after setting the current value of the variable to e. An exception is raised with condition type &assertion if the variable is not assignable.

Port inspector objects. Port inspector objects contain ports.

(port-object 'type) returns the symbol port.

(port-object 'input?) returns #t if the port is an input port, #f otherwise.

(port-object 'output?) returns #t if the port is an output port, #f otherwise.

(port-object 'binary?) returns #t if the port is a binary port, #f otherwise.

(port-object 'closed?) returns #t if the port is closed, #f if the port is open.

(port-object 'name) returns an inspector object containing the port’s name.

(port-object 'handler) returns a procedure inspector object encapsulating the port handler, such as would be returned by port-handler.

(port-object 'output-size) returns the output buffer size as a fixnum if the port is an output port (otherwise the value is unspecified).

(port-object 'output-index) returns the output buffer index as a fixnum if the port is an output port (otherwise the value is unspecified).

(port-object 'output-buffer) returns an inspector object containing the string used for buffered output.

(port-object 'input-size) returns the input buffer size as a fixnum if the port is an input port (otherwise the value is unspecified).

(port-object 'input-index) returns the input buffer index as a fixnum if the port is an input port (otherwise the value is unspecified).

(port-object 'input-buffer) returns an inspector object containing the string used for buffered input.

Symbol inspector objects. Symbol inspector objects contain symbols. These include gensyms.

(symbol-object 'type) returns the symbol symbol.

(symbol-object 'name) returns a string inspector object. The string name associated with a symbol inspector object is the print representation of a symbol, such as would be returned by the procedure symbol->string.

(symbol-object 'gensym?) returns #t if the symbol is a gensym, #f otherwise. Gensyms are created by gensym.

(symbol-object 'top-level-value) returns an inspector object containing the global value of the symbol.

(symbol-object 'property-list) returns an inspector object containing the property list for the symbol.

Record inspector objects. Record inspector objects contain records.

(record-object 'type) returns the symbol record.

(record-object 'name) returns a string inspector object corresponding to the name of the record type.

(record-object 'fields) returns an inspector object containing a list of the field names of the record type.

(record-object 'length) returns the number of fields.

(record-object 'rtd) returns an inspector object containing the record-type descriptor of the record type.

(record-object 'accessible? name) returns #t if the named field is accessible, #f otherwise. A field may be inaccessible if optimized away by the compiler.

(record-object 'ref name) returns an inspector object containing the value of the named field. An exception is raised with condition type &assertion if the named field is not accessible.

(record-object 'mutable? name) returns #t if the named field is mutable, #f otherwise. A field is immutable if it is not declared mutable or if the compiler optimizes away all assignments to the field.

(record-object 'set! name value) sets the value of the named field to value. An exception is raised with condition type &assertion if the named field is not assignable.

Section 3.5. Locating objects

procedure	`(make-object-finder pred)`
procedure	`(make-object-finder pred g)`
procedure	`(make-object-finder pred x g)`
returns	see below
libraries	`(chezscheme)`

The procedure make-object-finder takes a predicate pred and two optional arguments: a starting point x and a maximum generation g. The starting point defaults to the value of the procedure oblist, and the maximum generation defaults to the value of the parameter collect-maximum-generation. make-object-finder returns an object finder p that can be used to search for objects satisfying pred within the starting-point object x. Immediate objects and objects in generations older than g are treated as leaves. p is a procedure accepting no arguments. If an object y satisfying pred can be found starting with x, p returns a list whose first element is y and whose remaining elements represent the path of objects from x to y, listed in reverse order. p can be invoked multiple times to find additional objects satisfying the predicate, if any. p returns #f if no more objects matching the predicate can be found.

p maintains internal state recording where it has been so it can restart at the point of the last found object and not return the same object twice. The state can be several times the size of the starting-point object x and all that is reachable from x.

The interactive inspector provides a convenient interface to the object finder in the form of find and find-next commands.

Relocation tables for static code objects are discarded by default, which prevents object finders from providing accurate results when static code objects are involved. That is, they will not find any objects pointed to directly from a code object that has been promoted to the static generation. If this is a problem, the command-line argument --retain-static-relocation can be used to prevent the relocation tables from being discarded.

Section 3.6. Nested object size and composition

The procedures compute-size and compute-composition can be used to determine the size or composition of an object, including anything reachable via pointers from the object. Depending on the number of objects reachable from the object, the procedures potentially allocate a large amount of memory. In an application for which knowing the number, size, generation, and types of all objects in the heap is sufficient, object-counts is potentially much more efficient.

These procedures treat immediate objects such as fixnums, booleans, and characters as zero-count, zero-byte leaves.

By default, these procedures also treat static objects (those in the initial heap) as zero-count, zero-byte leaves. Both procedures accept an optional second argument that specifies the maximum generation of interest, with the symbol static being used to represent the static generation.

Objects sometimes point to a great deal more than one might expect. For example, if static data is included, the procedure value of (lambda (x) x) points indirectly to the exception handling subsystem (because of the argument-count check) and many other things as a result of that.

Relocation tables for static code objects are discarded by default, which prevents these procedures from providing accurate results when static code objects are involved. That is, they will not find any objects pointed to directly from a code object that has been promoted to the static generation. If accurate sizes and compositions for static code objects are required, the command-line argument --retain-static-relocation can be used to prevent the relocation tables from being discarded.

procedure	`(compute-size object)`
procedure	`(compute-size object generation)`
returns	see below
libraries	`(chezscheme)`

object can be any object. generation must be a fixnum between 0 and the value of collect-maximum-generation, inclusive, or the symbol static. If generation is not supplied, it defaults to the value of collect-maximum-generation.

compute-size returns the amount of memory, in bytes, occupied by object and anything reachable from object in any generation less than or equal to generation. Immediate values such as fixnums, booleans, and characters have zero size.

The following examples are valid for machines with 32-bit pointers.

(compute-size 0) ⇒ 0
(compute-size (cons 0 0)) ⇒ 8
(compute-size (cons (vector #t #f) 0)) ⇒ 24

(compute-size
  (let ([x (cons 0 0)])
    (set-car! x x)
    (set-cdr! x x)
    x))                  ⇒ 8

(define-record-type frob (fields x))
(collect 1 1) ; force rtd into generation 1
(compute-size
  (let ([x (make-frob 0)])
    (cons x x))
  0)                       ⇒ 16

procedure	`(compute-composition object)`
procedure	`(compute-composition object generation)`
returns	see below
libraries	`(chezscheme)`

compute-composition returns an association list representing the composition of object, including anything reachable from it in any generation less than or equal to generation. The association list has the following structure:

((type count . bytes) ...)

type is either the name of a primitive type, represented as a symbol, e.g., pair, or a record-type descriptor (rtd). count and bytes are nonnegative fixnums.

Immediate values such as fixnums, booleans, and characters are not included in the composition.

The following examples are valid for machines with 32-bit pointers.

(compute-composition 0) ⇒ ()
(compute-composition (cons 0 0)) ⇒ ((pair 1 . 8))
(compute-composition
  (cons (vector #t #f) 0)) ⇒ ((pair 1 . 8) (vector 1 . 16))

(compute-composition
  (let ([x (cons 0 0)])
    (set-car! x x)
    (set-cdr! x x)
    x))                 ⇒ ((pair 1 . 8)

(define-record-type frob (fields x))
(collect 1 1) ; force rtd into generation 1
(compute-composition
  (let ([x (make-frob 0)])
    (cons x x))
  0)                       ⇒ ((pair 1 . 8)
                                (#<record type frob> 1 . 8))

Chapter 4. Foreign Interface

Chez Scheme provides two ways to interact with "foreign" code, i.e., code written in other languages. The first is via subprocess creation and communication, which is discussed in the Section 4.1. The second is via static or dynamic loading and invocation from Scheme of procedures written in C and invocation from C of procedures written in Scheme. These mechanisms are discussed in Sections 4.2 through 4.4.

The method for static loading of C object code is dependent upon which machine you are running; see the installation instructions distributed with Chez Scheme.

Section 4.1. Subprocess Communication

Two procedures, system and process, are used to create subprocesses. Both procedures accept a single string argument and create a subprocess to execute the shell command contained in the string. The system procedure waits for the process to exit before returning, however, while the process procedure returns immediately without waiting for the process to exit. The standard input and output files of a subprocess created by system may be used to communicate with the user’s console. The standard input and output files of a subprocess created by process may be used to communicate with the Scheme process.

procedure	`(system command)`
returns	see below
libraries	`(chezscheme)`

command must be a string.

The system procedure creates a subprocess to perform the operation specified by command. The subprocess may communicate with the user through the same console input and console output files used by the Scheme process. After creating the subprocess, system waits for the process to exit before returning.

When the subprocess exits, system returns the exit code for the subprocess, unless (on Unix-based systems) a signal caused the subprocess to terminate, in which case system returns the negation of the signal that caused the termination, e.g., -1 for SIGHUP.

procedure	`(open-process-ports command)`
procedure	`(open-process-ports command b-mode)`
procedure	`(open-process-ports command b-mode ?transcoder)`
returns	see below
libraries	`(chezscheme)`

command must be a string. If ?transcoder is present and not #f, it must be a transcoder, and this procedure creates textual ports, each of whose transcoder is ?transcoder. Otherwise, this procedure returns binary ports. b-mode specifies the buffer mode used by each of the ports returned by this procedure and defaults to block. Buffer modes are described in Section 7.2 of The Scheme Programming Language, 4th Edition.

open-process-ports creates a subprocess to perform the operation specified by command. Unlike system, process returns immediately after creating the subprocess, i.e., without waiting for the subprocess to terminate. It returns four values:

to-stdin is an output port to which Scheme can send output to the subprocess through the subprocess’s standard input file.
from-stdout is an input port from which Scheme can read input from the subprocess through the subprocess’s standard output file.
from-stderr is an input port from which Scheme can read input from the subprocess through the subprocess’s standard error file.
process-id is an integer identifying the created subprocess provided by the host operating system.

If the process exits or closes its standard output file descriptor, any procedure that reads input from from-stdout will return an end-of-file object. Similarly, if the process exits or closes its standard error file descriptor, any procedure that reads input from from-stderr will return an end-of-file object.

The predicate input-port-ready? may be used to detect whether input has been sent by the subprocess to Scheme.

It is sometimes necessary to force output to be sent immediately to the subprocess by invoking flush-output-port on to-stdin, since Chez Scheme buffers the output for efficiency.

On UNIX systems, the process-id is the process identifier for the shell created to execute command. If command is used to invoke an executable file rather than a shell command, it may be useful to prepend command with the string "exec ", which causes the shell to load and execute the named executable directly, without forking a new process---the shell equivalent of a tail call. This will reduce by one the number of subprocesses created and cause process-id to reflect the process identifier for the executable once the shell has transferred control.

procedure	`(process command)`
returns	see explanation
libraries	`(chezscheme)`

command must be a string.

process is similar to open-process-ports, but less general. It does not return a port from which the subproces’s standard error output can be read, and it always creates textual ports. It returns a list of three values rather than the four separate values of open-process-ports. The returned list contains, in order: from-stdout, to-stdin, and process-id, which correspond to the second, first, and fourth return values of open-process-ports.

Section 4.2. Calling out of Scheme

Chez Scheme's foreign-procedure interface allows a Scheme program to invoke procedures written in C or in languages that obey the same calling conventions as C. Two steps are necessary before foreign procedures can be invoked from Scheme. First, the foreign procedure must be compiled and loaded, either statically or dynamically, as described in Section 4.6. Then, access to the foreign procedure must be established in Scheme, as described in this section. Once access to a foreign procedure has been established it may be called as an ordinary Scheme procedure.

Since foreign procedures operate independently of the Scheme memory management and exception handling system, great care must be taken when using them. Although the foreign-procedure interface provides type checking (at optimize levels less than 3) and type conversion, the programmer must ensure that the sharing of data between Scheme and foreign procedures is done safely by specifying proper argument and result types.

Scheme-callable wrappers for foreign procedures can also be created via ftype-ref and function ftypes (Section 4.5).

syntax	`(foreign-procedure conv ... entry-exp (param-type ...) res-type)`
returns	a procedure
libraries	`(chezscheme)`

entry-exp must evaluate to a string representing a valid foreign procedure entry point or an integer representing the address of the foreign procedure. The param-types and res-type must be symbols or structured forms as described below. When a foreign-procedure expression is evaluated, a Scheme procedure is created that will invoke the foreign procedure specified by entry-exp. When the procedure is called each argument is checked and converted according to the specified param-type before it is passed to the foreign procedure. The result of the foreign procedure call is converted as specified by the res-type. Multiple procedures may be created for the same foreign entry.

Each conv adjusts specifies the calling convention to be used. A #f is allowed as conv to indicate the default calling convention on the target machine (so the #f has no effect). Three other conventions are currently supported under Windows: __stdcall, __cdecl, and __com (32-bit only). Since __cdecl is the default, specifying __cdecl is equivalent to specifying #f or no convention. Finally, conv can be __collect_safe to indicate that garbage collection is allowed concurrent to a call of the foreign procedure.

Use __stdcall to access most Windows API procedures. Use __cdecl for Windows API varargs procedures, for C library procedures, and for most other procedures. Use __com to invoke COM interface methods; COM uses the __stdcall convention but additionally performs the indirections necessary to obtain the correct method from a COM instance. The address of the COM instance must be passed as the first argument, which should normally be declared as iptr. For the __com interface only, entry-exp must evaluate to the byte offset of the method in the COM vtable. For example,

(foreign-procedure __com 12 (iptr double-float) integer-32)

creates an interface to a COM method at offset 12 in the vtable encapsulated within the COM instance passed as the first argument, with the second argument being a double float and the return value being an integer.

Use __collect_safe to declare that garbage collection is allowed concurrent to the foreign procedure. The __collect_safe declaration allows concurrent collection by deactivating the current thread (see fork-thread) when the foreign procedure is called, and the thread is activated again when the foreign procedure returns. The __collect_safe declaration is useful, for example, when calling a blocking I/O call to allow other Scheme threads to run normally. Refrain from passing collectable memory to a __collect_safe foreign procedure, or use lock-object to lock the memory in place; see also Sdeactivate_thread. The __collect_safe declaration has no effect on a non-threaded version of the system.

For example, calling the C sleep function with the default convention will block other Scheme threads from performing a garbage collection, but adding the __collect_safe declaration avoids that problem:

(define c-sleep
  (foreign-procedure __collect_safe "sleep" (unsigned) unsigned))
(c-sleep 10) ; sleeps for 10 seconds without blocking other threads

If a foreign procedure that is called with __collect_safe can invoke callables, then each callable should also be declared with __collect_safe so that the callable reactivates the thread.

Complete type checking and conversion is performed on the parameters to a foreign procedure. The types scheme-object, string, wstring, u8*, u16*, u32*, utf-8, utf-16le, utf-16be, utf-32le, and utf-32be, must be used with caution, however, since they allow allocated Scheme objects to be used in places the Scheme memory management system cannot control. No problems will arise as long as such objects are not retained in foreign variables or data structures while Scheme code is running, and as long as they are not passed as arguments to a __collect_safe procedure, since garbage collection can occur only while Scheme code is running or when concurrent garbage collection is enabled. Other parameter types are converted to equivalent foreign representations and consequently they can be retained indefinitely in foreign variables and data structures.

For argument types string, wstring, utf-8, utf-16le, utf-16be, utf-32le, and utf-32be, an argument is converted to a fresh object that is passed to the foreign procedure. Since the fresh object is not accessible for locking before the call, it can never be treated correctly for a __collect_safe foreign procedure, so those types are disallowed as argument types for a __collect_safe foreign procedure. For analogous reasons, those types are disallowed as the result of a __collect_safe foreign callable.

Following are the valid parameter types:

integer-8: Exact integers from -2⁷ through 2⁸ - 1 are valid. Integers in the range 2⁷ through 2⁸ - 1 are treated as two’s complement representations of negative numbers, e.g., #xff is treated as -1. The argument is passed to C as an integer of the appropriate size (usually signed char).

unsigned-8: Exact integers from -2⁷ to 2⁸ - 1 are valid. Integers in the range -2⁷ through -1 are treated as the positive equivalents of their two’s complement representation, e.g., -1 is treated as #xff. The argument is passed to C as an unsigned integer of the appropriate size (usually unsigned char).

integer-16: Exact integers from -2¹⁵ through 2¹⁶ - 1 are valid. Integers in the range 2¹⁵ through 2¹⁶ - 1 are treated as two’s complement representations of negative numbers, e.g., #xffff is treated as -1. The argument is passed to C as an integer of the appropriate size (usually short).

unsigned-16: Exact integers from -2¹⁵ to 2¹⁶ - 1 are valid. Integers in the range -2¹⁵ through -1 are treated as the positive equivalents of their two’s complement representation, e.g., -1 is treated as #xffff. The argument is passed to C as an unsigned integer of the appropriate size (usually unsigned short).

integer-32: Exact integers from -2³¹ through 2³² - 1 are valid. Integers in the range 2³¹ through 2³² - 1 are treated as two’s complement representations of negative numbers, e.g., #xffffffff is treated as -1. The argument is passed to C as an integer of the appropriate size (usually int).

unsigned-32: Exact integers from -2³¹ to 2³² - 1 are valid. Integers in the range -2³¹ through -1 are treated as the positive equivalents of their two’s complement representation, e.g., -1 is treated as #xffffffff. The argument is passed to C as an unsigned integer of the appropriate size (usually unsigned int).

integer-64: Exact integers from -2⁶³ through 2⁶⁴ - 1 are valid. Integers in the range 2⁶³ through 2⁶⁴ - 1 are treated as two’s complement representations of negative numbers. The argument is passed to C as an integer of the appropriate size (usually long long or, on many 64-bit platforms, long).

unsigned-64: Exact integers from -2⁶³ through 2⁶⁴ - 1 are valid. Integers in the range -2⁶³ through -1 are treated as the positive equivalents of their two’s complement representation, The argument is passed to C as an integer of the appropriate size (usually unsigned long long or, on many 64-bit platforms, long).

double-float: Only Scheme flonums are valid---other Scheme numeric types are not automatically converted. The argument is passed to C as a double float.

single-float: Only Scheme flonums are valid---other Scheme numeric types are not automatically converted. The argument is passed to C as a single float. Since Chez Scheme represents flonums in double-float format, the parameter is first converted into single-float format.

short: This type is an alias for the appropriate fixed-size type above, depending on the size of a C short.

unsigned-short: This type is an alias for the appropriate fixed-size type above, depending on the size of a C unsigned short.

int: This type is an alias for the appropriate fixed-size type above, depending on the size of a C int.

unsigned: This type is an alias for the appropriate fixed-size type above, depending on the size of a C unsigned.

unsigned-int: This type is an alias unsigned. fixed-size type above, depending on the size of a C unsigned.

long: This type is an alias for the appropriate fixed-size type above, depending on the size of a C long.

unsigned-long: This type is an alias for the appropriate fixed-size type above, depending on the size of a C unsigned long.

long-long: This type is an alias for the appropriate fixed-size type above, depending on the size of the nonstandard C type long long.

unsigned-long-long: This type is an alias for the appropriate fixed-size type above, depending on the size of the nonstandard C type unsigned long long.

ptrdiff_t: This type is an alias for the appropriate fixed-size type above, depending on its definition in the host machine’s stddef.h include file.

size_t: This type is an alias for the appropriate unsigned fixed-size type above, depending on its definition in the host machine’s stddef.h include file.

ssize_t: This type is an alias for the appropriate signed fixed-size type above, depending on its definition in the host machine’s stddef.h include file.

iptr: This type is an alias for the appropriate fixed-size type above, depending on the size of a C pointer.

uptr: This type is an alias for the appropriate (unsigned) fixed-size type above, depending on the size of a C pointer.

void*: This type is an alias for uptr.

fixnum: This type is equivalent to iptr, except only values in the fixnum range are valid. Transmission of fixnums is slightly faster than transmission of iptr values, but the fixnum range is smaller, so some iptr values do not have a fixnum representation.

boolean: Any Scheme object may be passed as a boolean. #f is converted to 0; all other objects are converted to 1. The argument is passed to C as an int.

char: Only Scheme characters with Unicode scalar values in the range 0 through 255 are valid char parameters. The character is converted to its Unicode scalar value, as with char->integer, and passed to C as an unsigned char.

wchar_t: Only Scheme characters are valid wchar_t parameters. Under Windows and any other system where wchar_t holds only 16-bit values rather than full Unicode scalar values, only characters with 16-bit Unicode scalar values are valid. On systems where wchar_t is a full 32-bit value, any Scheme character is valid. The character is converted to its Unicode scalar value, as with char->integer, and passed to C as a wchar_t.

wchar: This type is an alias for wchar_t.

double: This type is an alias for double-float.

float: This type is an alias for single-float.

scheme-object: The argument is passed directly to the foreign procedure; no conversion or type checking is performed. This form of parameter passing should be used with discretion. Scheme objects should not be preserved in foreign variables or data structures since the memory management system may relocate them between foreign procedure calls.

ptr: This type is an alias for scheme-object.

u8*: The argument must be a Scheme bytevector or #f. For #f, the null pointer (0) is passed to the foreign procedure. For a bytevector, a pointer to the first byte of the bytevector’s data is passed. If the C routine to which the data is passed requires the input to be null-terminated, a null (0) byte must be included explicitly in the bytevector. The bytevector should not be retained in foreign variables or data structures, since the memory management system may relocate or discard them between foreign procedure calls, and use their storage for some other purpose.

u16*: Arguments of this type are treated just like arguments of type u8*. If the C routine to which the data is passed requires the input to be null-terminated, two null (0) bytes must be included explicitly in the bytevector, aligned on a 16-bit boundary.

u32*: Arguments of this type are treated just like arguments of type u8*. If the C routine to which the data is passed requires the input to be null-terminated, four null (0) bytes must be included explicitly in the bytevector, aligned on a 32-bit boundary.

utf-8: The argument must be a Scheme string or #f. For #f, the null pointer (0) is passed to the foreign procedure. A string is converted into a bytevector, as if via string->utf8, with an added null byte, and the address of the first byte of the bytevector is passed to C. The bytevector should not be retained in foreign variables or data structures, since the memory management system may relocate or discard them between foreign procedure calls and use their storage for some other purpose. The utf-8 argument type is not allowed for a __collect_safe foreign procedure.

utf-16le: Arguments of this type are treated like arguments of type utf-8, except they are converted as if via string->utf16 with endianness little, and they are extended by two null bytes rather than one.

utf-16be: Arguments of this type are treated like arguments of type utf-8, except they are converted as if via string->utf16 with endianness big, and they are extended by two null bytes rather than one.

utf-32le: Arguments of this type are treated like arguments of type utf-8, except they are converted as if via string->utf32 with endianness little, and they are extended by four null bytes rather than one.

utf-32be: Arguments of this type are treated like arguments of type utf-8, except they are converted as if via string->utf32 with endianness big, and they are extended by four null bytes rather than one.

string: This type is an alias for utf-8.

wstring: This type is an alias for utf-16le, utf-16be, utf-32le, or utf-32be as appropriate depending on the size of a C wchar_t and the endianness of the target machine. For example, wstring is equivalent to utf-16le under Windows running on Intel hardware.

(* ftype): This type allows a pointer to a foreign type (ftype) to be passed. The argument must be an ftype pointer of type ftype, and the actual argument is the address encapsulated in the ftype pointer. See Section 4.5 for a description of foreign types.

(& ftype): This type allows a foreign type (ftype) to be passed as a value, but represented on the Scheme side as a pointer to the foreign-type data. That is, a (& ftype) argument is represented on the Scheme side the same as a (* ftype) argument, but a (& ftype) argument is passed to the foreign procedure as the content at the foreign pointer’s address instead of as the address. For example, if ftype is a struct type, then (& ftype) passes a struct argument instead of a struct-pointer argument. The ftype cannot refer to an array type.

The result types are similar to the parameter types with the addition of a void type. In general, the type conversions are the inverse of the parameter type conversions. No error checking is performed on return, since the system cannot determine whether a foreign result is actually of the indicated type. Particular caution should be exercised with the result types scheme-object, double-float, double, single-float, float, and the types that result in the construction of bytevectors or strings, since invalid return values may lead to invalid memory references as well as incorrect computations. Following are the valid result types:

void: The result of the foreign procedure call is ignored and an unspecified Scheme object is returned. void should be used when foreign procedures are called for effect only.

integer-8: The result is interpreted as a signed 8-bit integer and is converted to a Scheme exact integer.

unsigned-8: The result is interpreted as an unsigned 8-bit integer and is converted to a Scheme nonnegative exact integer.

integer-16: The result is interpreted as a signed 16-bit integer and is converted to a Scheme exact integer.

unsigned-16: The result is interpreted as an unsigned 16-bit integer and is converted to a Scheme nonnegative exact integer.

integer-32: The result is interpreted as a signed 32-bit integer and is converted to a Scheme exact integer.

unsigned-32: The result is interpreted as an unsigned 32-bit integer and is converted to a Scheme nonnegative exact integer.

integer-64: The result is interpreted as a signed 64-bit integer and is converted to a Scheme exact integer.

unsigned-64: The result is interpreted as an unsigned 64-bit integer and is converted to a Scheme nonnegative exact integer.

double-float: The result is interpreted as a double float and is translated into a Chez Scheme flonum.

single-float: The result is interpreted as a single float and is translated into a Chez Scheme flonum. Since Chez Scheme represents flonums in double-float format, the result is first converted into double-float format.