Cinquecento Manual
Introduction
This manual describes the language and built-in library functions of the L1 implementation of Cinquecento.
Cinquecento is a programming language designed for examining programs in execution. It features an extensible abstraction, called a domain, that represents a program in execution as a language value, and a syntactic interface to domains based on the C programming language.
Cinquecento programs use fragments of C — employing C operators, types, and control flow — to examine or modify the state of target programs represented by domains. Complex C idioms for traversing data structures, such as pointer-chasing loops, usually port with little or no effort. The names and types of target program data are specified in the syntax of C declarations; minor extensions allow machine-level specification of location, layout, and encoding of data in memory.
Multiple simultaneously active target programs can be represented as distinct domains within a single Cinquecento program. Heterogeneity is expected: the targets may be different programs, or the same program with different types, symbols, or executable format, and the targets may execute on different machines, possibly of different architecture or operating system. Example applications include comparing executions of different versions of a program, comparing executions of a program across machines or operating systems, and validating global state in a distributed program.
Each domain encapsulates user-defined functions that catalog the types and symbols of the target, plus functions that read and write its memory and other run-time state. Rather than favor the idiosyncrasies of one operating system, architecture, compiler, or debugging information format, Cinquecento specifies a small, extensible interface that allows users to tailor domains for their target environment. New domain implementations can be written entirely in Cinquecento; built-in library functions and accompanying programs simplify construction of domains for common environments, such as DWARF-based x86 Linux systems.
Apart from domains and the C-based syntax, Cinquecento is a simple dynamic language with conventional functional semantics. A dialect of Scheme at heart, Cinquecento is lexically scoped, dynamically typed, properly tail recursive, and has first-class functions, continuations, and automatic memory management. Some Schemers may cluck at Cinquecento's numbers (which rigorously emulate the numbers of C), its fledgling support for syntactic extension, and its indulgence of imperative habits.
The Cinquecento language design and implementation is an active research project, still undergoing revision, extension, and tuning. We strive to keep this manual current with the implementation it accompanies, but until things settle down, sometimes we will fall behind.
We welcome feedback of any kind from all users.
Prerequisites
Cinquecento is meant to serve programmers who are fluent in both C and Scheme. The essence of Cinquecento programming is the use of functional abstraction, in the style epitomized by Scheme, to program the evaluation of expressions involving the syntax, operators, types, memory model, and control flow of idiomatic C.
Many guides to artful use of C and Scheme are available in libraries and on the web. Throughout this manual we avoid dispensing nutshell caricatures of ideas and traditions explained well elsewhere. Instead here we suggest a few starting points from which one cannot be led astray.
The Scheme Programming Language offers both an introduction to Scheme and exposition of its many fine points. The point most central to Cinquecento is the role of lambda as an operator for constructing stateful first-class functions. A taste and habit for stylistic use of lambda can be cultivated by studying The Little Schemer or Structure and Interpretation of Computer Programs.
Into its Scheme-like setting, Cinquecento carries nearly every feature of C other than static typing. Understanding Cinquecento requires familiarity with the sharp edges of C, especially sensitivity to the meaning of types and operations involving pointers and memory. The C Programming Language offers a concise and definitive foundation, while C: A Reference Manual reveals many more edges.
Looking from another direction, the Cinquecento programmer is primarily in the business of making, operating, and extending custom debuggers. Sooner or later, it will be useful to understand how debuggers represent and control program executions in your environment.
Interactive Use
L1 includes a read-eval-print loop through which Cinquecento expressions can be interactively evaluated.
Assuming that L1 has been made available in your path, you can start the REPL in a shell by running the command l1:
% l1
;
The semicolon is the REPL prompt. Typing a complete expression followed by a newline sends the expression to the evaluator. The resulting value is printed on the following line, followed by a new prompt:
; 3+3;
6
;
Most Cinquecento expressions end with a semicolon. (Welcome to the C part of the language.) In this manual, we show the printed output of interactions with the evaluator in italics.
Some expressions cause output to be printed. We show such output along with the value printed by the REPL.
; { printf("hello, world\n"); 3+3; }
hello, world
6
;
(Like blocks of statements in C, blocks of expressions enclosed in curly braces do not require the final semicolon.)
An error encountered during expression evaluation aborts the evaluation and prints a diagnostic and stack trace:
; for(i = 4; i >= 0; i--) printf("%d\n", 4/i);
1
1
2
4
error: divide by zero
entry (<stdin>:1)
Long-running or non-terminating evaluations can be interrupted by typing control-C. Typing control-D exits the REPL.
The REPL does not offer line editing or expression history. Some users run L1 through a standalone tool such as rlwrap or from within Emacs to obtain editing and history service.
The REPL does not accept multi-line input: the newline signals the end of the expression. In this manual, to make the presentation easier to read, we sometimes present multi-line input anyway. The usual way to handle longer input is to put it in a file.
The REPL can be directed to evaluate a file of Cinquecento code with the @include form:
; @include "/home/me/file.cqct"
Alternatively, the file name can be passed as an argument to l1:
% l1 /home/me/file.cqct
in which case L1 exits after evaluating the contents of the file.
An Emacs major mode for editing Cinquecento code is included in the directory doc in the L1 source distribution.
Some demonstrations of Cinquecento are included in the directory demo in the L1 source distribution. They make use of the process control server sctl, which is available from cqctworld.org. Its usage is documented in a separate manual.
Input
A Cinquecento program consists of a sequence of expressions to be evaluated in order by the Cinquecento evaluator.
The Cinquecento evaluator reads code from its current input file. During parsing of an input file, an occurrence (not within a string literal or comment) of either of the forms
@include <filename>
@include "filename"
is replaced with the contents of the named file. The first form searches for filename in the Cinquecento system load path. The second form searches the directory that contains the current input file.
The current input file is temporarily switched to filename during the substitution.
Comments
The following are comments:
- Input enclosed by /* and */;
- Input between // and the beginning of the next line;
- Input enclosed by #! and !#. (This syntax is intended for executable scripts on Unix.)
Keywords
The following are the keywords of Cinquecento.
@containerof  _Bool     enum    struct
@define       break     float   switch
@defloc       case      for     typedef
@lambda       char      if      union
@let          const     int     unsigned
@local        continue  long    void
@global       default   return  volatile
@names        do        short   while
@record       double    signed
@typeof       else      sizeof
Identifiers
Identifiers are non-keyword tokens used to identify Cinquecento variables and the symbols represented within domains.
The first character of an identifier must be a letter or underscore. Subsequent characters must be letters, underscores, or digits.
There is an exception: the two-character sequence :: may appear anywhere within an identifier, including at the beginning.
Literals
Literals are lexical tokens that encode character, string, and integer constants. Their syntax follows that of C literals.
A character literal is a printable ASCII character surrounded by single quotes.
Cinquecento also recognizes the C character escape sequences: \a, \b, \f, \n, \r, \t, \v, \\, \?, \', \", \o, \oo, \ooo, \xh...h.
A string literal is a sequence of ASCII or escaped characters surrounded by double quotes.
Two string literals separated by whitespace are concatenated into a single string.
An integer literal is a sequence of digit characters. With no prefix, integer literals are interpreted in decimal. The 0x and 0X prefixes cause hexadecimal interpretation. The 0 prefix causes octal interpretation. The 0b and 0B prefixes cause binary interpretation.
By default, the C type imposed on an integer constant is the smallest integer type (up to int?) that can represent the value. The suffixes U, UL, ULL, L, LL, and their lowercase variants impose the specified long and/or unsigned types upon the value. The suffixes K, M, G, T, and their lowercase variants impose the type unsigned long long upon the value and shift the value left by 10, 20, 30, or 40 bits, respectively.
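The prefix and suffix rules above can be modeled outside Cinquecento. The following Python sketch is not part of L1; the name literal_value is hypothetical, and it assumes that unsigned long long is 64 bits wide. It computes only the value of a literal and ignores the type that a suffix imposes.

```python
import re

# Hypothetical model of the integer literal rules; not the L1 lexer.
_LITERAL = re.compile(
    r"^(0[xX][0-9a-fA-F]+|0[bB][01]+|0[0-7]*|[1-9][0-9]*)"
    r"(ull|ul|u|ll|l|k|m|g|t)?$",
    re.IGNORECASE,
)
_SHIFT = {"k": 10, "m": 20, "g": 30, "t": 40}
_MASK64 = (1 << 64) - 1  # assumption: unsigned long long is 64 bits


def literal_value(text):
    """Return the integer value denoted by an integer literal."""
    m = _LITERAL.match(text)
    if m is None:
        raise ValueError("not an integer literal: " + text)
    body = m.group(1)
    suffix = (m.group(2) or "").lower()
    if body[:2].lower() == "0x":
        value = int(body[2:], 16)
    elif body[:2].lower() == "0b":
        value = int(body[2:], 2)
    elif body.startswith("0"):
        value = int(body, 8)      # a leading 0 selects octal
    else:
        value = int(body, 10)
    if suffix in _SHIFT:
        # K, M, G, T shift the value left within unsigned long long.
        value = (value << _SHIFT[suffix]) & _MASK64
    return value
```

Under this model, literal_value("4K") is 4096 and literal_value("017") is 15.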
Scope and Variables
Cinquecento is lexically scoped and dynamically typed.
The block statement:
{
statement;
...
}
introduces a new level of lexical scope.
Expressions that appear outside of a block statement are called top-level expressions. The outermost level of scope, called the top-level environment, contains bindings for built-in functions and other values.
Bindings are mutable. The assignment form
var = expr;
updates the value bound to the variable var to the value of expr.
If var is an unbound variable, the = operator implicitly creates a binding for var that is visible in the innermost level of lexical scope. If the assignment occurs in a top-level expression, a new binding is added to the top-level environment. These rules also apply to the other assignment operators (+=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=, ++, --).
The @local statement explicitly declares new variable bindings for the block in which it appears:
{
@local var, ... ;
...
statement;
...
}
This form establishes new bindings for the variables var ... within the enclosing block. These bindings shadow any other bindings of var ... in enclosing blocks or the top-level environment. Any number of @local statements may appear in a block, but they all must appear before any other statements.
The @global statement declares new variable bindings in the top-level environment:
@global var, ... ;
Variables declared with @local and @global are initially bound to nil.
The parameters of a @lambda form establish local bindings in the block that forms the body of the function definition, just as if they had been declared with @local.
A reference to an unbound variable is resolved in the top-level environment. It is an error if no binding to the variable has occurred before the reference is evaluated.
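The binding rules of this section can be pictured as a chain of scopes. The Python sketch below is only a model under stated assumptions: the class Scope and its method names are hypothetical, nil is represented by None, and the top-level environment is the scope with no parent.

```python
class Scope:
    """One level of lexical scope; parent is None for the top level."""

    def __init__(self, parent=None):
        self.parent = parent
        self.bindings = {}

    def _find(self, name):
        # Walk outward until a scope that binds the name is found.
        scope = self
        while scope is not None:
            if name in scope.bindings:
                return scope
            scope = scope.parent
        return None

    def declare_local(self, name):
        # Models @local: a fresh binding in this block, initially nil.
        self.bindings[name] = None

    def assign(self, name, value):
        # Update the visible binding, or implicitly bind in this scope.
        owner = self._find(name) or self
        owner.bindings[name] = value

    def lookup(self, name):
        owner = self._find(name)
        if owner is None:
            raise NameError("unbound variable: " + name)
        return owner.bindings[name]
```

In this model, assigning to an unbound name inside a block creates a binding that is invisible at the top level, while declare_local shadows an outer binding without changing it.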
Types
Cinquecento defines several types of values. Every value in Cinquecento belongs to exactly one type. For each type there is a predicate to test whether a value belongs to the type.
Types with standard semantics include string, pair, function, record, record descriptor, and file descriptor.
The types special to Cinquecento are cid, cvalue, ctype, domain, address space, name space, and range. These types represent different aspects of a program in execution; their roles are described in the overview of domains.
There are also several container types for storing values of any type: list, vector, and table.
The special value nil is not a member of any other type or equal to any other value.
There are no numeric types other than cvalue.
The following functions are predicates on the type of a value. They each test their argument val for membership in a particular type and return the cvalue 1 if the value belongs to the type, or 0 otherwise.
Address space | isas(val)
C identifier | iscid(val)
C type | isctype(val)
C value | iscvalue(val)
Domain | isdom(val)
File descriptor | isfd(val)
List | islist(val)
Nil | isnil(val)
Name space | isns(val)
Pair | ispair(val)
Procedure | isprocedure(val)
Range | isrange(val)
Record | isrec(val)
Record descriptor | isrd(val)
String | isstring(val)
Table | istable(val)
Vector | isvector(val)
Functions
As in Scheme and other functional languages, Cinquecento functions are first-class values. They may be dynamically created, returned from functions, bound to variables, and stored in containers.
Cinquecento functions are closures over their lexical environment: free variables in the body of a function definition refer to the bindings of those variables contained in the enclosing blocks or in the top-level environment.
The @lambda expression defines new function values. The syntax is:
@lambda formals { statement; ... }
The syntax of formals determines how formal parameters are bound to actual parameters:
- If formals is a list of variable identifiers of the form (v1, v2, ..., vn), then the function accepts n parameters, and each actual parameter is bound to the corresponding variable. Formals may be (), defining a function of zero parameters.
- If formals is a list of identifiers of the form (v1, v2, ..., vn, varg ...) (where the final ... is concrete syntax), then the function is a variable arity function of at least n parameters. The first n actual parameters are bound to the first n variables; n may be zero. The remaining parameters, possibly zero, are passed in a freshly allocated list that is bound to varg.
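The two forms of formals can be summarized by a small Python model. It is a sketch, not L1 code: the function bind_formals is hypothetical, and it returns the bindings as a dictionary rather than creating real variables.

```python
def bind_formals(fixed, rest, actuals):
    """Bind actual parameters to formals.

    fixed   -- names v1 ... vn
    rest    -- name of the trailing "varg ..." formal, or None
    actuals -- sequence of actual parameter values
    """
    n = len(fixed)
    if len(actuals) < n or (rest is None and len(actuals) > n):
        raise TypeError("wrong number of arguments")
    env = dict(zip(fixed, actuals[:n]))
    if rest is not None:
        # The remaining parameters go into a freshly allocated list.
        env[rest] = list(actuals[n:])
    return env
```

For example, bind_formals(["a"], "rest", [1, 2, 3]) yields {"a": 1, "rest": [2, 3]}, and a fixed-arity call with a surplus argument raises an error.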
The body of the function is a sequence of statements to be evaluated when the function is called.
A function call is an expression of the form
fn(expr, ...)
where fn is an expression that evaluates to a function. The number of expressions appearing in the argument list expr, ... must match the number expected by the function.
A return statement in the body of a function returns control to the expression that called the function. The syntax variants are:
return;
return expr;
The first form returns the value nil. The second form returns the value of expr.
The value of a function call expression is the value returned by evaluation of a return statement in the body of the called function, or the value of the last evaluated expression in the body if the function returns without reaching a return statement.
The @define form is syntactic sugar for naming a function definition. For example
@define name(var, ...) { statement; ... }
is equivalent to
name = @lambda(var, ...) { statement; ... }
The @defloc form is like @define, except that it forces the binding to be local to the innermost level of lexical scope.
Multiple values may be returned from a function by returning a list of the values:
@define foo() { return [1, 2, 3]; }
While the value returned is an ordinary list, its elements can be bound directly to variables:
; [x, y, z] = foo();
; printf("%d %d %d\n", x, y, z);
1 2 3
Any number of variables up to the number of elements in the returned list may be bound. Assignment starts from the front of the list:
; [x, y] = foo();
; printf("%d %d\n", x, y);
1 2
The identifier _ is reserved for binding values that will not be used:
; [_, _, z] = foo();
; printf("%d\n", z);
3
Only variable identifiers (and not, e.g., nested binding forms) may appear as elements of this binding form. They are subject to the same implicit binding rules as ordinary assignments. As an expression, the value of this binding form is nil.
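The list binding form can be modeled in Python as follows. The function bind_list is hypothetical, env stands in for the current bindings, and raising an error when there are more variables than elements is an assumption: the text above does not say what L1 does in that case.

```python
def bind_list(names, values, env):
    """Model of [v1, v2, ...] = list; binds from the front of the list."""
    if len(names) > len(values):
        # Assumption: binding more variables than elements is an error.
        raise ValueError("more variables than list elements")
    for name, value in zip(names, values):
        if name != "_":          # _ discards the corresponding element
            env[name] = value
    return None                  # the binding form itself evaluates to nil
```

With env = {}, the call bind_list(["_", "_", "z"], [1, 2, 3], env) leaves env equal to {"z": 3}.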
The built-in function apply and the syntax form @lazyapply are used to call functions.
Control
Cinquecento supports the control flow statements of C, including the loop statements do-while, for, and while, the conditional statements if, if-else, and switch (with case and default labels), and the goto statement.
In all cases but switch, these statements have the same syntax they have in C. The syntax of switch has been expanded to support pattern matching, discussed in "The switch statement and pattern matching" below, but is backward compatible with C: a switch statement that is legal C will have the same semantics in Cinquecento as in C.
As in C, the integer value 0 is considered false in conditional evaluation. Unlike C, the value nil is also considered false. (The value of !nil is 1.) All other values are considered true.
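Because these rules differ from the truth rules of many dynamic languages, a short model may help. In the Python sketch below, the function is_true is hypothetical, nil is modeled as None, and cvalues are modeled as Python integers. Taking the text at its word, every other value, including an empty string or an empty list, counts as true.

```python
def is_true(value):
    """Model of conditional evaluation: only nil and 0 are false."""
    if value is None:                       # nil
        return False
    if type(value) is int and value == 0:   # the integer value 0
        return False
    return True                             # everything else is true
```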
Control flow through loops and switch statements can be controlled with break and continue as in C. Neither these keywords nor goto can be used to leave or enter the body of a function.
Several functions support additional forms of non-local control flow:
It is currently unsafe to invoke continuations that return to calls to built-in functions.
The switch statement and pattern matching
As in C, the cases of a switch are compared in lexical order; control jumps to the first statement with a matching case label, or to the optional default label, regardless of where it appears, if no case matches. Control skips past the switch body if there is no matching case and no default label.
The syntax and semantics of switch have been expanded from those of C in several ways. First, switch can dispatch on any type, not just integer types, and therefore the expressions in case statements need not be integers. For example:
; x = [1,2];
[ 1, 2 ]
; switch (x) { case [1,2]: printf("match\n"); }
match
; switch (x) { case []: break; default: printf("non-match\n"); }
non-match
The expression being dispatched (here, x) is compared with the expressions in case statements (here, [1,2] and []) using the == operator, whose semantics depends on the values being compared. Generally speaking, and unlike C, case statements may include expressions, not just constants. For example,
; switch (x) { case (gettimeofday() > 2 ? [1,2] : []) : printf("match\n"); }
match
When a switch contains case statements with expressions, the expressions are evaluated in lexical order; if a match occurs before a particular case is reached, then that case's expression is not evaluated.
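The dispatch rule can be sketched in Python. This is a model, not the L1 implementation: switch_entry and case_expr are hypothetical names, each case expression is represented by a zero-argument function so that its evaluation can be observed, and Python's == stands in for Cinquecento's ==.

```python
def case_expr(value, log):
    """Build a case expression that records when it is evaluated."""
    def thunk():
        log.append(value)
        return value
    return thunk


def switch_entry(value, labels):
    """Return the index of the label where execution starts, or None.

    labels is a list whose items are either the string "default" or a
    zero-argument function standing for a case expression.
    """
    default_at = None
    for i, label in enumerate(labels):
        if label == "default":
            default_at = i           # remembered wherever it appears
        elif label() == value:       # evaluated only when reached
            return i
    return default_at                # None: skip the switch body
```

Dispatching 2 over the labels case 1, default, case 2, case 3 evaluates only the first two case expressions and starts at index 2.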
Cinquecento's most liberal extension of switch is its support for pattern matching, as provided in functional languages like OCaml and Erlang. In addition to case statements, a switch may also contain @match statements, with syntax @match p: s where p is a pattern and s is a statement. A pattern is, roughly speaking, a constant that may also contain variables (including the special don't-care variable _, also referred to as the wildcard variable). So [ 1, 2 ] and [ x, 2 ] and [ [x, y], _, 2 ] are patterns, while 1+2 and [ 1, x+2 ] are not. A more general form of matching is permitted using the syntax @match p && e: s where e is an expression that further constrains whether the match should succeed; we discuss this form in detail below.
When pattern matching, variables that occur in patterns correspond to portions of the object being matched against, and can be referred to in the code associated with the @match. This is most easily seen with an example:
; x = [1,2];
[ 1, 2 ]
; switch (x) { @match [1,y] : printf("match: y = %d\n",y); }
match: y = 2
; switch (x) { @match [1,_] : printf("matched\n"); }
matched
; switch (x) { @match [2,y] : break; default: printf("no match\n"); }
no match
The first switch statement matches the initial clause because there exists a value of y for which the expression matches the value of x, i.e., when y is 2. Thus, y is bound to 2 inside the body of the @match expression. The second pattern matches for the same reason, but in this case the _ is used so nothing is bound in the body of the @match. The third switch fails to match because there is no valuation of y that would permit [2,y] to match x's value of [1,2].
Particulars on matching lists, records and tables are given elsewhere in the manual.
In general, patterns may not contain duplicate variables, though they may contain more than one wildcard. So patterns like [x,x], which you might like to write to match against two-element lists whose elements are the same value, are disallowed. On the other hand, the extended form of pattern matching can be used to recover lost flexibility. For example
; x = [1,1];
[ 1, 1 ]
; switch(x) { @match [y,z] && y == z : printf("match %a %a\n",y,z); }
match 1 1
; switch(x) { @match [y,z] && y != z : break; default: printf("failed\n"); }
failed
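The matching rules described so far can be modeled in a few lines of Python. This sketch is not how L1 implements @match: the names Var, match, and match_with_guard are hypothetical, only list patterns are handled, and requiring equal lengths for list patterns is inferred from the examples in this section.

```python
class Var:
    """A pattern variable; Var("_") is the wildcard."""

    def __init__(self, name):
        self.name = name


def match(pattern, value, bindings=None):
    """Return a dict of bindings if value matches pattern, else None."""
    if bindings is None:
        bindings = {}
    if isinstance(pattern, Var):
        if pattern.name != "_":
            if pattern.name in bindings:
                # Duplicate variables are disallowed; this sketch notices
                # them only when both occurrences are reached.
                raise ValueError("duplicate variable: " + pattern.name)
            bindings[pattern.name] = value
        return bindings
    if isinstance(pattern, list):
        if not isinstance(value, list) or len(pattern) != len(value):
            return None
        for p, v in zip(pattern, value):
            if match(p, v, bindings) is None:
                return None
        return bindings
    return bindings if pattern == value else None


def match_with_guard(pattern, value, guard):
    """Model of @match p && e: the guard sees the pattern bindings."""
    bindings = match(pattern, value)
    if bindings is not None and guard(bindings):
        return bindings
    return None
```

Matching [Var("y"), Var("z")] against [1, 1] with a guard that compares the two bindings reproduces the transcript above.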
Pattern variables will shadow variables defined in a surrounding scope, which may be surprising. For example, you might write
; x = [1,2];
[ 1, 2 ]
; y = "hello";
"hello"
; switch (x) { @match y : printf("match, y = %a\n",y); }
match, y = [ 1, 2 ]
; y;
"hello"
Here, the @match statement in the switch treats y as a binding variable which will match anything, binding y to it. You might have expected the match to fail if you thought y in the @match corresponded to the variable defined in the outer scope. After the switch concludes, references to y refer to the y defined in the outer scope. In effect, you can think of pattern variables as being @local to the @match arm in which they appear.
On the other hand, pattern matching only occurs within @match statements, not case statements. For the latter, references to variables in general expressions will refer to the outer scope whereas for the former they act as binders. For example:
; x = 6;
6
; switch (6) { case x : printf("match, x = %d\n",x); }
match, x = 6
; switch (9) { case x : break; default: printf("no match\n"); }
no match
; switch (9) { @match x : printf("match, x = %d\n",x); }
match, x = 9
A normal switch statement permits falling through from one case to the next. In general, falling through into a @match is treated as an error. This is because @match is most useful when binding variables, and in general there is no guarantee those variables will be properly bound when falling through. For example, consider this code:
switch (7) { @match 7 : while (0); @match [x]: printf("x=%d\n",x); }
This code will match the first clause, execute the while (0) and then fall through to the body of the second @match. But this clause refers to the variable x so that executing the second body without having properly bound it makes no sense. As such, Cinquecento will complain with a run-time error:
error: attempt to fall through to a @match
error (builtin function)
entry (<stdin>:1)
It is equally dangerous to goto under a @match statement, but this is not something we prevent at the moment. You may fall through from a @match into a case or default.
We do permit fall-through in the following idiomatic situation: when you have multiple @match clauses (with or without fenders, that is, the && e guard expressions) that bind the same (non-wildcard) variables, with all bodies empty but the last. It is also permitted to bind more variables in earlier clauses as long as fallen-to clauses match a subset of those variables. As an example, you can do:
; switch ([2,1]) { @match [x]: @match [x,_]: @match [x,_,_]:
printf("x = %d\n",x); break; }
x = 2
This code will match a list of length one, two, or three, and print its first element.
Overview of Domains
This section summarizes the concepts and terminology of Cinquecento domains.
Domains are Cinquecento objects that represent programs in execution. The representation provides an interface to program state based on the syntax of the C programming language. Domains represent three aspects of a program in execution: its memory, symbols, and types.
The domain interface is defined procedurally: like an object in an object-oriented language, a domain encapsulates the implementation of a predefined set of operations. The domain interface can be called directly, but normally Cinquecento programs interact with domains through C expressions, which are translated by the Cinquecento evaluator into implicit calls to the domain interface.
A domain is formed by pairing a Cinquecento address space with a Cinquecento name space. These objects define complementary subsets of the domain interface.
An address space represents the memory of a C program: a mapping of byte addresses to byte values. It encapsulates three functions. Two access functions, get and put, read and write memory. Map returns the accessible address ranges.
The prototypical address space represents the live memory of a running program, but address spaces can also represent less dynamic content, such as the content of files or strings. More generally, address spaces can be composed of ordinary Cinquecento functions that manage a view of arbitrary synthetic storage.
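As an illustration of the three address space functions, here is a Python model of an address space backed by a byte string. It is a sketch only: the class BytesAddressSpace and the argument shapes of get, put, and map are assumptions made for the example, not the signatures that L1 uses.

```python
class BytesAddressSpace:
    """Hypothetical address space over a byte buffer mapped at base."""

    def __init__(self, data, base=0):
        self.data = bytearray(data)
        self.base = base

    def map(self):
        # One accessible range, given as (start address, length).
        return [(self.base, len(self.data))]

    def _offset(self, addr, length):
        off = addr - self.base
        if off < 0 or off + length > len(self.data):
            raise IndexError("address range not mapped")
        return off

    def get(self, addr, length):
        # Read length bytes starting at addr.
        off = self._offset(addr, length)
        return bytes(self.data[off:off + length])

    def put(self, addr, payload):
        # Write the bytes of payload starting at addr.
        off = self._offset(addr, len(payload))
        self.data[off:off + len(payload)] = payload
```

Because the object is just three functions over some storage, the same shape would serve for file contents or any synthetic view of memory.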
A name space catalogs the symbols and types of a C program. It encapsulates five functions. Looksym and looktype map symbol and type names to their definitions. Enumsym and enumtype list the symbols and types of the name space. Lookaddr maps an address to the nearest symbol.
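The five name space functions can be modeled over two dictionaries. The Python class below is a hypothetical sketch: the shapes of the symbol and type definitions are invented for the example, and reading "nearest symbol" as the symbol with the highest address not above the given address is an assumption.

```python
class DictNameSpace:
    """Hypothetical name space backed by dictionaries."""

    def __init__(self, types, symbols):
        self.types = dict(types)      # type name -> definition
        self.symbols = dict(symbols)  # symbol name -> (type name, address)

    def looksym(self, name):
        return self.symbols.get(name)

    def looktype(self, name):
        return self.types.get(name)

    def enumsym(self):
        return dict(self.symbols)

    def enumtype(self):
        return dict(self.types)

    def lookaddr(self, addr):
        # Assumption: nearest means the closest symbol at or below addr.
        best = None
        for name, (_, sym_addr) in self.symbols.items():
            if sym_addr <= addr:
                if best is None or sym_addr > self.symbols[best][1]:
                    best = name
        return best
```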
Address spaces and name spaces can be shared among domains. For example, domains representing multiple running instances of the same program binary will necessarily have separate address spaces, but may share a common name space.
Domains are supported by two other types of Cinquecento values: ctype and cvalue.
A ctype represents a type that can be defined in a C program. Ctypes both identify C types (e.g., pointer to struct foo), and define C types (e.g., the fields and size of struct foo).
A cvalue represents a typed, scalar value that can be computed in a C program. Every cvalue has an associated ctype that defines its type. Cvalues in Cinquecento are much like rvalues in C: they can be read from and written to memory using variable reference and assignment operators, combined with other cvalues using C operators, and cast to other types using casting operators.
Every cvalue has an associated domain that gives meaning to C operations performed on the cvalue. For example, when a cvalue pointer is dereferenced, memory is accessed through the address space of its associated domain. When a cvalue is cast to another type, the new type definition is obtained from the name space of its associated domain.
Cvalues from different domains can be combined without casting in certain sensible cases, such as array indexing. A cvalue also can be explicitly cast into a different domain using the extended cast operator.
Domains, name spaces, and address spaces are first-class values in Cinquecento. Name spaces and address spaces can be constructed from Cinquecento functions that implement their interfaces. In addition, a syntax form, @names, defines new name spaces using an extended form of C declaration syntax.
Programs can extend the domain interface by associating additional functions with name spaces and address spaces to expose platform-specific debugging functionality, such as access to registers, process control, and breakpoint control.
C identifiers
C identifiers (cids) represent identifiers in C programs. They also provide a general-purpose symbol type, similar to the symbol type of Lisp-based languages.
The predicate iscid tests whether a value is a cid.
A cid literal is a single quote followed by any sequence of characters comprising a valid C identifier:
; s = 'identifier;
identifier
; iscid(s);
1
The following functions operate on cids.
Cvalues
Cvalues represent scalar typed values of domains, including integers, floats, and pointers to any type. Every cvalue has an associated ctype that represents the type of the value, an associated domain through which operations such as pointer dereference and type conversion are resolved, and an underlying encoding of its value, in the form of an unsigned, 64-bit, 2's complement integer.
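The relationship between the 64-bit encoding and the typed value can be illustrated in Python. The functions encode and decode are hypothetical names for this sketch; the size and signedness arguments stand in for information that would come from the ctype.

```python
MASK64 = (1 << 64) - 1


def encode(value):
    """Store a Python integer as an unsigned 64-bit 2's complement word."""
    return value & MASK64


def decode(encoding, size, signed):
    """Interpret an encoding as an integer type of size bytes."""
    bits = 8 * size
    value = encoding & ((1 << bits) - 1)
    if signed and value >= 1 << (bits - 1):
        value -= 1 << bits        # the top bit set means negative
    return value
```

The same encoding therefore reads as -1 through a signed 4-byte type and as 4294967295 through an unsigned one.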
The predicate iscvalue tests whether a value is a cvalue.
The syntax @typeof obtains the ctype associated with a cvalue.
Cvalues are the only numeric type in Cinquecento. The arithmetic, logical, and relational operators of C are defined over cvalues, and yield equivalent values. In particular, Cinquecento applies the same integer promotion rules and "usual conversions" that are applied to the operands of unary and binary operators in C to determine a common type for the operation and its result. For example, the relational operators (<, <=, >, >=, ==, !=) are implemented by cvalcmp, which promotes and converts its operands before comparing them.
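A familiar consequence of the usual conversions in C is that comparing an int with an unsigned int converts both operands to unsigned int. The Python sketch below models just that one case, assuming a 32-bit int; the names to_type and less_than_int_uint are hypothetical, and the code says nothing about how cvalcmp is actually written.

```python
def to_type(value, bits, signed):
    """Convert an integer to a C integer type of the given width."""
    value &= (1 << bits) - 1
    if signed and value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def less_than_int_uint(a_int, b_uint, bits=32):
    """Model of a < b where a is int and b is unsigned int.

    The common type is unsigned int, so a is converted before the
    comparison. The result is 1 or 0, as for a C relational operator.
    """
    a = to_type(a_int, bits, False)
    b = to_type(b_uint, bits, False)
    return 1 if a < b else 0
```

Under these rules less_than_int_uint(-1, 1) is 0, because -1 converts to 4294967295 before the comparison.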
Functions that operate on cvalues include:
Ranges
A range is an object that represents a contiguous span of locations in an address space as a pair of cvalues: the offset of the beginning of the span, and its length.
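A range can be pictured as a pair of numbers. In this Python sketch the type Range and the helpers contains and intersects are hypothetical and are not functions of the L1 library; plain integers stand in for cvalues.

```python
from collections import namedtuple

# beg is the offset of the first location; length is the span size.
Range = namedtuple("Range", ["beg", "length"])


def contains(r, addr):
    """True if addr lies within the span described by r."""
    return r.beg <= addr < r.beg + r.length


def intersects(a, b):
    """True if the two spans share at least one location."""
    return a.beg < b.beg + b.length and b.beg < a.beg + a.length
```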
The predicate isrange tests whether a value is a range.
The functions that operate on ranges are:
Symbol references
If id is a symbol defined in domain dom, then an expression of the form
dom`id
is a reference to id.
Such a reference may appear anywhere that a value may appear in a C expression (i.e., a C rvalue). The result is a cvalue that represents the value stored at the location (offset) of the symbol. To compute this cvalue, the evaluator:
- Calls looksym(dom,'id) to obtain the definition of the symbol;
- Reads (using the address space get method) consecutive bytes from the address space associated with the domain, starting at the location of the symbol, up to the number of bytes in the size of its type;
- Constructs a new cvalue whose domain is dom, whose type is the type of the symbol, and whose value is the interpretation of the bytes read from the address space as a value of the type of the symbol.
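The three steps above can be sketched in Python. Everything here is a stand-in: read_symbol is a hypothetical name, the symbol and type tables are plain dictionaries, memory is a byte buffer indexed by address, and a little-endian target is assumed.

```python
def read_symbol(symbols, types, memory, name):
    """Model of evaluating dom`id as an rvalue.

    symbols -- symbol name -> (type name, address)
    types   -- type name -> (size in bytes, signed flag)
    memory  -- bytes-like object indexed by address
    """
    # Step 1: look up the definition of the symbol.
    type_name, addr = symbols[name]
    size, signed = types[type_name]
    # Step 2: read as many bytes as the size of the type.
    raw = bytes(memory[addr:addr + size])
    # Step 3: interpret the bytes as a value of the type.
    # Assumption: the target is little-endian.
    value = int.from_bytes(raw, "little", signed=signed)
    return {"type": type_name, "value": value}
```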
If the symbol for an identifier has a type of the enumeration constant variant, then identifier is an enumeration constant, and the reference is handled differently: instead of causing an access to the address space, a reference to an enumeration constant evaluates to a cvalue representing the value of the constant. Its domain is the referenced domain, and its type is the enumeration type for which the constant was defined.
An enumeration constant may be referenced through a name space rather than a domain. The domain of the resulting cvalue is a freshly allocated domain whose name space is the one being referenced, and whose address space is that of the literal domain.
A symbol reference may also appear anywhere that a location may appear in a C expression (i.e., a C lvalue), such as on the left side of an assignment form:
dom`id = x;
To evaluate this assignment, the evaluator:
- Calls looksym(dom,'id) to obtain the definition of the symbol;
- Constructs an encoding of the value on the right side based on the type of the symbol;
- Writes (using the address space put method) the encoding to the address space associated with the domain at the location of the symbol.
An attempt to assign to a reference to an enumeration constant draws an error.
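The assignment steps above can be modeled the same way as a read. In this Python sketch write_symbol is a hypothetical name, the tables are plain dictionaries, memory is a mutable byte buffer, and a little-endian target is assumed. The value is truncated to the size of the symbol's type before it is written.

```python
def write_symbol(symbols, types, memory, name, value):
    """Model of evaluating dom`id = x.

    symbols -- symbol name -> (type name, address)
    types   -- type name -> (size in bytes, signed flag)
    memory  -- bytearray indexed by address
    """
    # Step 1: look up the definition of the symbol.
    type_name, addr = symbols[name]
    size, _signed = types[type_name]
    # Step 2: encode the value according to the type of the symbol.
    # Assumption: the target is little-endian.
    mask = (1 << (8 * size)) - 1
    encoded = (value & mask).to_bytes(size, "little")
    # Step 3: write the encoding at the location of the symbol.
    memory[addr:addr + size] = encoded
```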
The Literal Domain
Cinquecento represents arithmetic literals as cvalues from the built-in literal domain litdom. This domain is constructed from the clp64le name space and the abstract address space returned by mknas. Only the size of arithmetic types matters in this domain; pointer size and endianness are irrelevant, because expressions in this domain never reference memory.
Type and Domain Conversion
Conversion is the process of producing a new cvalue from an old one by changing its type or domain. Cinquecento supports the same explicit and implicit type conversion semantics and operations as C, while extending the notion of conversion to domains.
Explicit conversions
The cast operator () performs, as in C, an explicit conversion of the type of a value. Expressions of the form
(typename)expr
yield a new cvalue that represents the conversion of the value expr to the type named typename. Typename is a qualified type name (with optional qualifier). If the type name provides a qualifier, then it is resolved in the name space or domain bound to the qualifier. Otherwise it is resolved in the domain of expr.
The cast operator is useful when the name of the type to which the value is to be converted is known statically and can be written in type name syntax. Sometimes, however, the type is determined dynamically, and represented by a ctype. In this case, the conversion can be performed with the extended cast operator {}.
For example, the following conversion expression:
q = (struct dom`T*)p;
can be expressed (more laboriously) with the extended cast operator:
t = looktype(dom, @typename(struct T*));
q = {t}p;
Unlike the cast operator, the extended cast operator expects its type argument to be a defined type, not a type name to be resolved. For example, the following expression draws an error because the type operand to the conversion is undefined:
q = {@typename(struct T*)}p;
As a special case, if the type operand is the name void* with undefined representation, or any alias for that name, and the value being converted has pointer type, then the conversion is performed as if the type operand had the pointer representation of the type of the value being converted.
The extended cast operator also performs domain conversion. If dom is a domain, then the expression
q = {dom}p;
converts p to the domain dom.
Domain conversion always involves an implicit type conversion. The domain into which the value is being converted is searched for a definition of type of the value being converted. The resulting type definition is the type to which the value is converted. It is an error if there is no definition for the type in the new domain.
As a special case, when the domain operand is the literal domain, the type of the operand is stripped of typedef aliases, to ensure that for base types there exists a definition for the type name.
The extended cast operator also accepts a name space or an address space as the operand to which the value is converted. In this case, a new domain is constructed. Its name space (or address space) is the operand of the extended cast operator, and its address space (or name space) is that of the domain of the value being converted.
Ordinarily, the expr operand to the conversion operators is a cvalue. However, these operators also can be applied to a string. In this case, the string is implicitly converted to a pointer, in a freshly allocated domain, to the first byte of the string. For example, in this form:
p = (char*)"hello, world\n";
p is bound to a char* cvalue that points to the first byte of the string "hello, world\n". A new domain is constructed for this pointer with the name space of the literal domain and a fresh mksas address space backed by the string operand.
Implicit conversions
The evaluation of many C operators includes certain implicit conversions of the operands. Cinquecento adheres to the implicit conversions of C, while extending them to domains.
Integer promotion in C is the act of converting a value of one of the smaller integer types (_Bool, char, unsigned char, short, or unsigned short) to a value of type int. It is performed implicitly on the operand of the unary operators +, -, ~, and !, and on each of the operands of the binary operators +, -, *, /, %, &, |, ^, <, >, <=, >=, ==, !=, <<, and >>.
Following integer promotion, the operands to any of the binary operators except << and >> are converted to a common type through a set of rules called the usual conversions. Generally, the usual conversions determine the smallest type (in terms of the size of the set of values it represents) that can represent both values.
For operands of the same domain, the precise rules are based on the usual conversions in C. However, consider addition of two operands whose types are two different typedef aliases for int. What should the type name of the result be? C offers no guidance: in the context of C evaluation rules, the types of rvalues are intangible, so the type name of this result does not matter. To resolve this dilemma, Cinquecento follows the typedef aliases of each of the types until a common type (possibly the C base type) is found.
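Because these rules are inherited from C, an ordinary C11 compiler can be used to check what type a given operand pair converts to. The sketch below uses `_Generic` to name the static type of an expression; the typedef names `celsius_t` and `fahrenheit_t` and the helper functions are hypothetical, chosen only to mirror the two-aliases-for-int case discussed above. It shows the C side only: C reports the base type `int`, and says nothing about which alias name Cinquecento would choose.

```c
/* Map the static type of an expression to a printable name (C11 _Generic). */
#define TYPE_NAME(e) _Generic((e),            \
    int: "int",                               \
    unsigned int: "unsigned int",             \
    long: "long",                             \
    unsigned long: "unsigned long",           \
    long long: "long long",                   \
    unsigned long long: "unsigned long long", \
    default: "other")

typedef int celsius_t;     /* hypothetical typedef aliases for int */
typedef int fahrenheit_t;

/* Integer promotion: unary + applied to an unsigned char yields an int. */
static const char *promoted_char(void)     { unsigned char c = 1; return TYPE_NAME(+c); }

/* Both short operands are promoted to int before the addition. */
static const char *short_plus_short(void)  { short a = 1, b = 2; return TYPE_NAME(a + b); }

/* Usual conversions: int and unsigned int meet at unsigned int. */
static const char *int_plus_unsigned(void) { int a = 1; unsigned b = 2; return TYPE_NAME(a + b); }

/* Two different aliases for int: C sees only the common base type. */
static const char *alias_plus_alias(void)  { celsius_t a = 1; fahrenheit_t b = 2; return TYPE_NAME(a + b); }
```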
For operands of two different domains, the operands are first converted to a common domain, and then the usual conversions are applied. The rules for determining the common domain for such mixed-domain expressions are as follows:
- If one operand is from the literal domain and the other is from some other domain, then the operand from the literal domain is implicitly converted to the domain of the other operand.
- If the operands are from different non-literal domains, and neither operand has a pointer type, then both operands are converted to the literal domain.
- If the operands are from different non-literal domains, and one or both operands have a pointer type, then the result depends on the operation. If the operation is addition, then (by definition of pointer addition) only one of the operands has a pointer type. The non-pointer operand is converted to the domain of the pointer operand. If the operation is subtraction, and if only one of the operands has a pointer type, then as with addition the non-pointer operand is converted to the domain of the pointer operand. The other case, subtraction of two pointers from different domains, raises an error. No other binary operations on pointer values are defined, since array indexing expands to pointer addition.
- All other mixed-domain expressions, including comparison of pointers from different domains, draw an error.
Remember that the above rules are implicitly applied to the operands of binary expressions. It is always possible to explicitly cast either operand in order to arrange an evaluation not supported by the rules.
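The common-domain rules can be summarized as a small decision function. The following C sketch is an illustration only: the types `operand_t`, `binop_t`, and `common_t`, the constant `LITDOM`, and the function `common_domain` are hypothetical, and domains are reduced to integer ids. It models only the choice of domain, not the type checking or the usual conversions that follow.

```c
#include <stdbool.h>

enum { LITDOM = 0 };   /* hypothetical id standing for the literal domain */

typedef enum { OP_ADD, OP_SUB, OP_OTHER } binop_t;
typedef enum { USE_LEFT, USE_RIGHT, USE_LITERAL, DOMAIN_ERROR } common_t;

typedef struct {
    int  domain;   /* hypothetical domain id */
    bool is_ptr;   /* true if the operand has a pointer type */
} operand_t;

/* Decide which domain both operands of a binary operator are converted to. */
static common_t common_domain(binop_t op, operand_t l, operand_t r)
{
    if (l.domain == r.domain)
        return USE_LEFT;                 /* same domain: nothing to convert */
    if (l.domain == LITDOM)
        return USE_RIGHT;                /* literal operand follows the other operand */
    if (r.domain == LITDOM)
        return USE_LEFT;
    if (!l.is_ptr && !r.is_ptr)
        return USE_LITERAL;              /* two non-literal domains, no pointers */
    if ((op == OP_ADD || op == OP_SUB) && l.is_ptr != r.is_ptr)
        return l.is_ptr ? USE_LEFT : USE_RIGHT;   /* follow the pointer operand */
    return DOMAIN_ERROR;                 /* e.g. subtracting or comparing foreign pointers */
}
```

An explicit cast of one operand, as the text notes, corresponds to making the two `domain` fields equal before this function is consulted.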
C Operators
The operators of C are defined over Cinquecento cvalues and have isomorphic semantics. The operator set includes arithmetic, bitwise, relational, and logical operators, as well as operators for accessing values of C's aggregate data types, and pointer dereference. All of the operators observe the precedence and associativity rules of C.
The binary arithmetic and bitwise operators (+, -, *, /, %, &, |, ^, <<, >> ) operate on cvalues of integer type. Each yields a cvalue of value and type equivalent to the result in C of the same operator on operands of equivalent type and value. Overflow and underflow semantics on arithmetic are preserved, and the C type promotion and conversion rules are applied.
Similarly, the unary operators (+, -, !, ~) operate on cvalues of integer type and yield a cvalue of type and value equivalent to the result in C of the same operator on an operand of equivalent type and value.
The relational operators (<, <=, >, >=, ==, !=) compare cvalues. The result of each operator is a boolean: 0 or 1, represented as a cvalue of type int in the literal domain.
The logical operators (&& and ||) compute logical and and logical or over integers, treated as boolean values, as in C. Evaluation is short circuited, as in C.
Equivalence and other relations
Equivalence is determined by the functions equal, eqv, and eq.
- If u and v are of different types, they are not equal;
- If u is nil and v is nil, they are equal;
- If u and v are both address spaces, cids, domains, file descriptors, name spaces, procedures, record descriptors, or tables, they are equal if and only if they are the same object (pointer equality);
- If u and v are both cvalues, they are equal if and only if they are equalcval;
- If u and v are both lists or vectors, they are equal if and only if they have the same length and all of their corresponding elements are equal;
- If u and v are both pairs, they are equal if and only if both their cars and their cdrs are equal;
- If u and v are both ranges, they are equal if and only if their offsets and lengths are both equalcval;
- If u and v are both records, they are equal if and only if they have the same record descriptor and all of their corresponding fields are equal;
- If u and v are both strings, they are equal if and only if they have the same length and contain the same sequence of bytes;
- If u and v are both ctypes, they are equal if and only if they are equalctype.
- If u is nil and v is nil, they are eq;
- If u and v are both cvalues, they are eq if and only if they are eqvcval;
- Otherwise u and v are eq if and only if they are the same object (pointer equality).
Each equivalence function is associated with a hash function. Each hash function has the property that any two values that are equivalent by the associated equivalence function have the same hash value. This property of hash functions is used in the implementation of tables.
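The hash property stated above (equivalent values hash identically) can be illustrated for the string case, where equality means same length and same bytes. This C sketch is an assumption-laden illustration, not L1's implementation: `str_t`, `str_equal`, and `str_hash` are hypothetical names, and the hash is 32-bit FNV-1a, chosen here only because it depends on nothing but the byte sequence.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical string value: a byte sequence with an explicit length. */
typedef struct { const unsigned char *bytes; size_t len; } str_t;

/* Equal if and only if same length and same sequence of bytes. */
static int str_equal(str_t u, str_t v)
{
    return u.len == v.len && memcmp(u.bytes, v.bytes, u.len) == 0;
}

/* 32-bit FNV-1a: depends only on the bytes, so equal strings hash alike. */
static uint32_t str_hash(str_t s)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < s.len; i++) {
        h ^= s.bytes[i];
        h *= 16777619u;
    }
    return h;
}
```

A table keyed by such values can use the hash to pick a bucket and the equivalence function to confirm a match; the converse does not hold, since unequal values may share a hash.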
The equivalence operators == and != are defined over all types of Cinquecento values. If the operands are both cvalues, the result is determined by cvalcmp. If the operands are both strings, the result is determined by strcmp. If one operand is a string and the other is a char* or unsigned char* cvalue, then the cvalue is promoted to a string by calling stringof and compared to the string with strcmp. Operands of any other type are compared with equal.
The other relational operators (<, <=, >, >=) are defined on cvalues, strings, and combinations of strings and char* or unsigned char* cvalues in the same manner as the == and != operators. It is an error to apply them to operands of any other type.
Pointers
Pointer-typed cvalues represent typed locations in a domain.
We illustrate the use of pointers with a simple domain dom, defined as follows:
ns = @names c32le {
    struct T {
        @0 int id;
        @4 struct T *next;
        @8;
    };
    @0 int x;
    @4 int y;
    @0 struct T t;
};
as = mkzas(1024);
dom = mkdom(ns, as);
As in C, one way to obtain a pointer is through the unary reference operator. For example, the form
p = &dom`x;
binds p to a cvalue that represents the location associated with x in dom.
The domain of a pointer-typed cvalue provides the address space in which pointer operators on the cvalue are applied. The operator * dereferences pointer-typed cvalues. Precisely as a C programmer would expect, * accesses the contents of the location in the associated domain, using the pointed-to type to determine the size and encoding of the accessed range. For example, given the above binding of p, the two expressions:
*p
dom`x
would yield equivalent cvalues.
Pointer arithmetic is C pointer arithmetic. The following sequence of expressions:
p = &dom`x;
p = p+1;
leaves p bound to a pointer cvalue that refers to the location following x in dom. As in C, the amount by which the value of a pointer is incremented is determined by the size of the underlying type of the pointer.
Pointers to aggregate types also work as expected. The following sequence of expressions:
p = &dom`t;
p->next = p+1;
assigns the next field of the struct T at location t in dom to point to the beginning of the next logical struct T. Although there is no symbol corresponding to this location, we can, as in C, use pointers to impute types to otherwise untyped locations.
Cinquecento maintains the C pointer/array relationship. The effect of the following sequence of expressions is equivalent to that of the preceding one:
p = &dom`t;
p[0].next = p+1;
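For comparison, the same two idioms in plain C behave identically, which is the property Cinquecento preserves. The sketch below is ordinary C over host memory rather than a domain; the helper names `link_with_arrow`, `link_with_index`, and `stride` are hypothetical.

```c
#include <stddef.h>

struct T { int id; struct T *next; };

/* Link element 0 to the following struct T using the arrow operator ... */
static void link_with_arrow(struct T *p) { p->next = p + 1; }

/* ... and using array indexing, which C defines in terms of pointer addition. */
static void link_with_index(struct T *p) { p[0].next = p + 1; }

/* Byte distance covered by p + 1: the size of the pointed-to type. */
static size_t stride(struct T *p)
{
    return (size_t)((char *)(p + 1) - (char *)p);
}
```

Applied to the first element of a two-element array, both link functions leave `next` pointing at the second element, and `stride` returns `sizeof(struct T)`.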
The adherence to C pointer semantics extends to disregard for pointer safety. The following sequence of expressions:
p = &dom`x;
p = p+256;
binds p to a pointer to a location that does not exist in the address space of dom (since we used mkzas to create a fixed 1024-byte address space).
An attempt to dereference a pointer to an unmapped location draws an error. To test whether a range is mapped without risk of error, call ismapped.
The object model
Three types in Cinquecento (name spaces, address spaces, and domains) have object-oriented structure. This section describes the object model.
Every object comprises a set of named functions, called its methods, and a name.
The primary operation on objects is method invocation. Given an object o, the method named f is invoked (called) with arguments arg, ... by the following syntax:
o.f(arg, ...)
A method is an ordinary Cinquecento function, but method invocation is not a completely ordinary function call. In method invocation, one additional argument, the object upon which the method is invoked, is prepended to the argument list arg, .... This argument is conventionally named this in method definitions.
By default it is an error to invoke a method named f on an object that does not define f. However, if the object defines a method named dispatch, then such unresolved invocations are transformed into the following invocation of dispatch:
o.dispatch("f", arg, ...)
The value of a method invocation is the value returned by the resulting function call.
An expression of the form
o.f
evaluates to a function that, when called on arguments arg, ..., is equivalent to the method invocation
o.f(arg, ...)
The name of an object is a string that is typically used by programs to label an object. The system provides functions to access and update object names, but does not interpret names nor care whether they are used. Two objects may have the same name. The default value of an object's name is nil.
Name spaces and address spaces are constructed from tables of methods:
Certain methods are required to be defined by name spaces and address spaces, either as explicit entries in their method tables or as names recognized by a dispatch method. These methods are discussed below in the sections on name spaces and address spaces. Name spaces and address spaces may define any number of additional methods.
A domain object is constructed from a name space and address space object.
Any method defined by either the name space or address space used to construct a domain may be invoked on the domain. Names are resolved in the following order:
- Name space method table;
- Address space method table;
- Name space dispatch method, if defined;
- Address space dispatch method, if defined.
If both the name space and the address space define a dispatch method, only the name space dispatch will ever be used to resolve method invocations through the domain.
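The resolution order can be sketched in C as a lookup over two method tables followed by two optional dispatch fallbacks. Everything named here is hypothetical (`object_t`, `entry_t`, `find`, `domain_invoke`, and the sample methods), and methods are simplified to take a single int and omit the implicit `this` argument; the sketch models only the order in which names are resolved.

```c
#include <stddef.h>
#include <string.h>

typedef int (*method_fn)(int arg);
typedef int (*dispatch_fn)(const char *name, int arg);

typedef struct { const char *name; method_fn fn; } entry_t;

/* Hypothetical object: a method table plus an optional dispatch method. */
typedef struct { const entry_t *methods; size_t n; dispatch_fn dispatch; } object_t;

/* Search an object's method table; NULL if the name is not defined there. */
static method_fn find(const object_t *o, const char *name)
{
    for (size_t i = 0; i < o->n; i++)
        if (strcmp(o->methods[i].name, name) == 0)
            return o->methods[i].fn;
    return NULL;
}

/* Resolve and call `name` through a domain built from ns and as.
   Returns 0 and stores the result, or -1 if the name is unresolved. */
static int domain_invoke(const object_t *ns, const object_t *as,
                         const char *name, int arg, int *result)
{
    method_fn f;
    if ((f = find(ns, name)) != NULL || (f = find(as, name)) != NULL) {
        *result = f(arg);                      /* steps 1 and 2: method tables */
        return 0;
    }
    if (ns->dispatch != NULL) {                /* step 3: name space dispatch */
        *result = ns->dispatch(name, arg);
        return 0;
    }
    if (as->dispatch != NULL) {                /* step 4: address space dispatch */
        *result = as->dispatch(name, arg);
        return 0;
    }
    return -1;
}

/* Sample methods used to observe which one was chosen. */
static int ns_method(int a) { return a + 1; }
static int as_method(int a) { return a + 2; }
static int ns_dispatch(const char *n, int a) { (void)n; return a + 100; }
static int as_dispatch(const char *n, int a) { (void)n; return a + 200; }
```

With both dispatch methods present, an unknown name always reaches `ns_dispatch`, matching the observation that the address space dispatch is then never used.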
The name space and address space used to construct a domain dom can be accessed by the following syntax:
dom.ns /* evaluates to the name space of dom */
dom.as /* evaluates to the address space of dom */
The object model described here was designed to implement the domain abstraction, not for general-purpose object-oriented programming. Plans for future development of Cinquecento include support for a better object system.
Symbols, Fields, Parameters
A C symbol (a named, typed location) is represented in Cinquecento as a triple comprising name (of type string), type (ctype), and optional attribute table (table).
The attribute table maps names (strings) to values. The only attribute consulted by the system is the one named "offset". Its value is a cvalue representing the offset of the first byte of the object named by the symbol from the start of the address space in which it resides. Attribute tables otherwise offer a convenient way for programs to associate arbitrary data with a symbol.
L1 implements symbols using vectors, rather than, say, a record or a dedicated primitive type. This representation is likely to change. To avoid breaking in the future, programs should use the following functions to create and access symbols, and not directly access the vector. In particular, the effect of modifying a symbol vector is undefined. (In what follows, symbol denotes the representation of symbols.)
Two additional C abstractions, the fields of an aggregate ctype (struct or union) and the parameters of a function, are structurally analogous to a symbol. Because of the similarity, Cinquecento uses its symbol representation for these two abstractions.
A difference between fields and symbols is that the offset attribute of a field represents the offset of the first byte of the field from the start of the aggregate in which it resides. Parameters are different from symbols in that there is no notion of an offset, and some parameters ("abstract declarators") omit an identifier.
The following functions, based on the symbol functions, operate on fields and parameters (denoted field and param below).
Name spaces
Name spaces are objects that catalog information about a set of C types and symbols. The primitive constructor for a name space is mkns. Higher-level constructors are the syntax @names and the function mknsraw. The predicate isns tests whether a value is a name space.
Every name space implements the following five methods:
The enumtype and looktype methods of every name space must provide definitions for the following types:
char | unsigned char | _Bool | _Complex |
short | unsigned short | float | double _Complex |
int | unsigned int | double | long double _Complex |
long | unsigned long | long double | |
long long | unsigned long long | void* | |
The definition for void* establishes the pointer representation for the name space. The other definitions establish mandatory C base types.
In addition, these methods may define any number of additional tagged aggregate types, tagged enumeration types, and typedefs.
A composite type (a pointer to, an array of, or a function of an already defined type) need not be explicitly defined by enumtype and looktype. Other than the special case of void*, the system will never consult either method for a composite type definition.
Several functions operate on name spaces:
Root Name Spaces
Cinquecento defines several top-level root name spaces that correspond to common C compiler choices. They vary in the size and encoding of integers and pointers.
Root name space | sizeof(void*) | sizeof(long) | Encoding |
---|---|---|---|
c32le | 4 | 4 | little endian |
c32be | 4 | 4 | big endian |
c64le | 4 | 8 | little endian |
c64be | 4 | 8 | big endian |
clp64le | 8 | 8 | little endian |
clp64be | 8 | 8 | big endian |
In addition to the types defined by all name spaces, every root name space defines the following convenience typedefs:
int8 uint8 uintptr int16 uint16 int32 uint32 int64 uint64
One function operates on root name spaces.
@names
The @names form is used to construct a name space from a set of type and symbol declarations.
The syntax is:
@names parent { decl ... }
The result is a fresh name space that contains the types and symbols defined in parent as well as those declared by the decl ... forms in the body. If a decl declaration collides with (defines the same name as) a symbol or type in parent, the new decl definition shadows the parent definition.
A decl comes in one of four declaration forms: aggregate, symbol, enumeration, or typedef. Generally, the syntax of each form is an extension of the corresponding declaration form in C.
Aggregate declarations declare a struct or union type. Here is a basic example:
ns = @names c32le {
    struct foo {
        @0x0 int field1;
        @0x4 char *field2;
        @0x8;
    };
};
This form constructs (and binds to the variable ns) a new name space derived from the root name space c32le. The body of the @names form contains a single declaration of the aggregate type struct foo. This declaration is syntactically like a C aggregate declaration, except:
- each field declaration begins with an attribute expression of the form @expr that specifies the offset of the field relative to the beginning of the aggregate;
- the declaration ends with a final attribute expression (not associated with any field) that specifies the size of the aggregate.
Here, the declaration says that struct foo has two fields. The field named field1 begins at byte offset 0 from the start of the aggregate, and has type int. The field named field2 begins at byte offset 4 from the start of the aggregate, and has type char*. The third and final line of the aggregate declares the overall size of a struct foo to be 8 bytes.
The value of this @names form is a new name space that contains the types and symbols of c32le plus the newly declared type struct foo. The definitions of the types int and char* given to the fields of struct foo are drawn from the parent name space c32le.
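The offsets and size written in such a declaration are the same quantities that C exposes through `offsetof` and `sizeof`. As a rough illustration of where the numbers come from, the following C sketch formats an aggregate declaration for `struct foo` using the layout chosen by whatever compiler builds it. The function `emit_foo_decl` is hypothetical and is not part of Cinquecento or of any accompanying tool; on a typical 32-bit little-endian x86 target it should reproduce the offsets 0x0, 0x4, and 0x8 shown above, while a 64-bit target will generally report different values.

```c
#include <stddef.h>
#include <stdio.h>

struct foo { int field1; char *field2; };

/* Format an aggregate declaration for struct foo into buf, using the host
   compiler's layout. Returns the snprintf result (length that was needed). */
static int emit_foo_decl(char *buf, size_t n)
{
    return snprintf(buf, n,
        "struct foo {\n"
        "    @0x%zx int field1;\n"
        "    @0x%zx char *field2;\n"
        "    @0x%zx;\n"
        "};\n",
        offsetof(struct foo, field1),   /* byte offset of field1 */
        offsetof(struct foo, field2),   /* byte offset of field2 */
        sizeof(struct foo));            /* overall size of the aggregate */
}
```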
An attribute expression is, in general, either an expression that evaluates to a cvalue representing the offset, as in the above example, or an expression that evaluates to a table. In the latter case, the table must include an entry whose key is the string "offset" and whose value is a cvalue specifying the offset. The table may include any number of additional key/value pairs, none of which will be consulted by the Cinquecento evaluator, but rather may be used by programs to associate additional information with the aggregate or individual fields.
Internally, the Cinquecento evaluator maintains a separate attribute table for every field as well as for the entire aggregate. If the attribute expression is a cvalue, this value is implicitly promoted to an attribute table by creating a fresh table of one element that maps the string "offset" to the specified cvalue. The following form, for instance, yields a name space that is equivalent to the one given above, except that an additional (silly) attribute flavor is associated with both field2 and the overall aggregate.
ns = @names c32le {
    struct foo {
        @0x0 int field1;
        @[ "offset" : 0x4, "flavor" : "bald" ] char *field2;
        @[ "offset" : 0x8, "flavor" : "all-weather" ];
    };
};
Bitfields involve a special attribute expression. Normally the offset attribute of a field is interpreted to be a count of bytes from the beginning of the aggregate to the beginning of the field. If, however, the attribute expression begins with @@ instead of @, then the offset is interpreted as a count of bits from the beginning of the aggregate. In this case, an additional bitfield width, specified like a C bitfield width, is expected, and the type of the resulting field is wrapped in a bitfield ctype.
For example:
ns = @names c32le {
    struct foo {
        @0x0 int field1;
        @@0x4*8 int bitfield : 3;
        @0x8;
    };
};
The bitfield attribute expression must be a cvalue expression. An explicit table expression is not permitted.
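To make the bit-offset arithmetic concrete, here is a C sketch that extracts a field given a bit offset and a width, as in the declaration above (bit offset 0x4*8 = 32, width 3). The function `get_bits` is hypothetical, and the bit numbering is an assumption: bit 0 is taken to be the least significant bit of byte 0, which is natural for a little-endian layout. The manual text above does not state which numbering Cinquecento uses, so treat this purely as an illustration of counting in bits rather than bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Extract `width` bits (1..32) starting `bitoff` bits from the start of the
   aggregate. ASSUMPTION: bits are numbered LSB-first within each byte. */
static uint32_t get_bits(const uint8_t *bytes, size_t bitoff, unsigned width)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < width; i++) {
        size_t bit = bitoff + i;
        uint32_t b = (bytes[bit / 8] >> (bit % 8)) & 1u;  /* pick one bit */
        v |= b << i;                                      /* place it in the result */
    }
    return v;
}
```

For an 8-byte aggregate image `buf`, `get_bits(buf, 0x4 * 8, 3)` reads the low three bits of the byte at offset 4 under this numbering.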
An aggregate may declare anonymous fields of aggregate type. These interior aggregates may be declared within the containing aggregate, or refer to an aggregate declared elsewhere. Either way, their fields become in effect the fields of the containing aggregate. For example, this sequence of expressions:
ns = @names c32le {
    struct foo {
        @0x0 int field1;
        @0x4 struct {
            @0x0 int subfield1;
            @0x4 int subfield2;
            @0x8;
        };
        @0xc struct bar;
        @0x10;
    };
    struct bar {
        @0x0 int subfield3;
        @0x4;
    };
};
dom = mkdom(ns, litdom.as);
p = (struct foo*){dom}0;
printf("%d ", &p->field1);
printf("%d ", &p->subfield1);
printf("%d ", &p->subfield2);
printf("%d\n", &p->subfield3);
yields
0 4 8 12
The result of creating an aggregate with multiple fields of the same name is undefined.
The attribute expression that declares the size of the aggregate is required in every non-empty aggregate declaration. It always appears as the final declaration. In the special case of a zero-sized aggregate containing no fields, the final attribute is permitted but not required:
ns = @names c32le { struct empty { }; };
Explicit offset attributes offer far greater control over the layout of an aggregate type than is possible in C. This flexibility is designed to allow Cinquecento to be independent of variations in C compiler aggregate layout policies, but it also admits some unnatural constructions. Multiple overlapping fields, unaligned fields, fields that extend past the aggregate, fields that begin before (given negative offset values) the aggregate, and arbitrary gaps in between fields, are all permitted. In particular, given this flexibility, the Cinquecento evaluator makes no distinction between a struct aggregate and a union aggregate, other than type name.
Symbol declarations associate a name and type with a location. The syntax for a symbol declaration is like the syntax of a C variable declaration, except that each symbol declaration begins with an attribute expression. For example:
ns = @names c32le { @0x1000 int x; };
The attribute expression of a symbol gives the offset in bytes from the beginning of the address space (address 0) to the first byte of the object named by the symbol. As with attribute expressions of aggregates, the attribute expression for a symbol may be either a cvalue giving the offset, or a table containing a key "offset" paired with an offset value; regardless, the evaluator maintains an attribute table for each symbol.
Here, the declaration associates a symbol named x of type int with the location beginning at offset 0x1000.
In C, an initializer may follow a variable declaration (int x = 5;), but not in Cinquecento. Name spaces project symbolic interpretations over the raw bytes of address spaces; they do not populate them with values.
Enumeration declarations declare an enum type. The syntax for an enumeration declaration is the same as enumeration declaration syntax in C.
ns = @names c32le {
    enum Rkind {
        Rundef=0,
        Ru08le,
        Ru16le,
        Rnrep,
    };
};
The implicit initialization rules of C are followed in Cinquecento. Unless explicitly initialized, the first constant of an enum is initialized to 0. Any subsequent uninitialized constant is set to the result of incrementing the previous constant. The initializer expression may be any expression that yields a cvalue.
The constants of an enumeration declaration may be referenced in a subsequent enumeration constant expression, as well as in the element size expression of an array declaration. For example:
ns = @names c32le {
    enum Rkind {
        Rundef,
        Ru08le,
        Ru16le,
        Rnrep,
    };
    enum Qkind {
        Qundef = Rnrep,
        Q1,
        Q2,
        Qnq,
    };
    struct foo {
        @0x0 int rs[Rnrep];
        @0x4;
    };
    @0x100 int RQ[Rnrep+Qnq];
};
(It is an oversight that enumeration constants may not be referenced in attribute expressions; this will be addressed soon.)
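Since the initialization rules are those of C, the values of the constants in the preceding example can be checked with a C compiler. The sketch below restates the two enumerations in plain C; `rq_length` is a hypothetical helper that reports the element count of an array sized like the RQ symbol.

```c
#include <stddef.h>

/* First constant defaults to 0; each later one is the previous plus one. */
enum Rkind { Rundef, Ru08le, Ru16le, Rnrep };

/* An initializer may refer to a constant of an earlier enumeration. */
enum Qkind { Qundef = Rnrep, Q1, Q2, Qnq };

/* Array length computed from enumeration constants. */
int RQ[Rnrep + Qnq];

_Static_assert(Rnrep == 3, "Rundef, Ru08le, Ru16le take 0, 1, 2");
_Static_assert(Qnq == 6, "Qundef starts at 3, so Qnq is 6");

static size_t rq_length(void) { return sizeof RQ / sizeof RQ[0]; }
```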
Typedef declarations give names to arbitrary types. Their syntax follows that of C.
ns = @names c32le {
    typedef enum Rkind {
        Rundef,
        Ru08le,
        Ru16le,
        Rnrep,
    } Rkind;
    struct foo {
        @0x0 Rkind r;
        @0x4 foo_t next;
        @0x8;
    };
    typedef struct foo* foo_t;
};
As illustrated in this example, a typedef name may be referenced before its declaration, unlike C.
A name can be associated with the resulting name space object with setname.
A common use of @names is to include a file of externally generated type and symbol declarations into the @names body:
ns = @names c32le { @include "mydecls.names" };
For instance, the program dwarf2cqct (available on the web) generates declarations suitable for inclusion in the body of a @names form from the DWARF debugging information contained in an ELF executable.
Unfortunately, there is no straightforward way to make the filename in these forms (or in any use of @include) a variable to be evaluated at run time. The idiomatic workaround is to use eval:
@define atnames(parent, file)
{
    @local s, f;
    s = sprintfa("@lambda(pns) { @names pns { @include \"%s\" }; };", file);
    f = eval(s);
    return f(parent);
}
ns = atnames(c32le, "/tmp/mydecls.names");
The @names form is a high-level syntactic form that is expanded (at compile time) into a series of ctype and symbol definitions that are entered into a pair of tables that map type and symbol names to their definitions. The following function is the low-level constructor that produces a name space from these tables. It is not needed by most programs.
Address spaces
Address spaces are objects that represent sparse byte-addressed storage. The primitive constructor for an address space is mkas. The predicate isas tests whether a value is an address space.
Every address space implements the following four methods:
Several functions create new address spaces with specific forms of backing storage.
Several functions operate on address spaces.
Ctypes
Ctypes represent types that can be named or defined in the type system of C. The predicate isctype tests whether a value is a ctype.
Ctypes are subdivided into 12 variants:
void base pointer function array struct union enum typedef
bitfield enumconst
undefined
The variants in the first row correspond directly to the types that appear in C programs.
The variants in the second row are used to represent, respectively, the types of bitfields of structure and union types, and the constants of enumeration types.
The undefined variant is used to represent types whose names but not definitions occur in a name space.
Ctypes are one of the more complicated features of Cinquecento. Details follow.
Type syntax
The syntax of C includes type syntax used to name (reference) and declare C types. Cinquecento accepts a slightly modified form of C type syntax.
In C, use of type syntax is limited to certain contexts: declarations of variables and types, and sizeof and cast operations. In Cinquecento, type syntax is likewise limited to certain contexts: as an operand of the forms @containerof, sizeof, @typename, @typeof; as the first operand to the cast operator ( ); and in symbol and type definitions appearing within the body of the @names form.
Depending on the context, one of three forms of type syntax is allowed:
- Type names refer to types declared elsewhere. They may appear in symbol and type definitions within the body of the @names form, and as the operand to the @typename form. Their syntax is C type name syntax:
int
unsigned long
void
struct foo
union bar
enum baz
foo_t
int**
foo_t* (*) (struct foo *fp)
- Qualified type names are type names that include an additional reference to a domain or name space in which the definition of the type is to be resolved. They may appear as an operand of the forms @containerof, sizeof, and @typeof, and as the first operand to the cast operator ( ). Their syntax is an extended type name syntax that includes an optional identifier and backtick before the type specifier:
ns`int
ns`unsigned long
ns`void
struct ns`foo
union ns`bar
enum ns`baz
ns`foo_t
ns`int**
ns`foo_t* (*) (struct foo *fp)
Details about this syntax are given in the following section.
- Type declarations associate a type name with a definition of the type. They may appear only in a @names form. Their syntax is like C type declaration syntax, but extended to allow type and variable declarations to include explicit specification of location, layout and encoding of the defined object. Example:
ns = @names c32le {
    struct node {
        @0x0 int x;
        @0x4 node_t *next;
        @0x8;
    };
    typedef struct node node_t;
    @0x800000 node_t *head;
};
Details about this syntax are given in the discussion of @names.
Qualified type names
A qualified type name may appear in a subset of the type name contexts: in @containerof, sizeof, @typeof forms, and as the first operand to the cast operator ( ). The meaning of a qualified type name depends on the context, and so is described separately for each context, but the syntax is always the same. For base type names, the qualifier precedes the name:
dom`int
litdom`unsigned long
ns`void
For tagged type names, the qualifier precedes the tag:
struct dom`foo
union ns`bar
enum dom`baz
For names declared with typedef, the qualifier precedes the name:
dom`foo_t
Every other type name is an array, pointer, or function type based on one of the above three forms. Syntactically, the qualifier is part of the type being modified:
dom`int[100]
struct dom`foo*
dom`foo_t* (*)(int arg)
Qualifiers themselves are variable identifiers typically bound to a domain or name space value. (It would be nice if they could be arbitrary expressions, but the parser does not support that.)
Ctypes
In the context of Cinquecento ctypes, a type name is the abstract structure denoted by type name syntax in C. For example, the syntax int* denotes the type name pointer-to-int. Such names are first-class ctype values in Cinquecento:
; t = @typename(int*);
int *
; isctype(t);
1
; printf("%t\n", t);
int *
; isptr(t);
1
; printf("%t\n", subtype(t));
int
This example shows the use of @typename to construct a ctype from the C syntax int*. The %t format verb returns a printed representation of a ctype in C syntax. (This verb is used by the L1 repl to print ctype values.) Associated with each ctype variant is a predicate (e.g., isptr) that tests whether a value is an instance of the variant, and a set of functions (e.g., subtype) to deconstruct and inspect the type.
The ctype constructor functions, one for each ctype variant, provide an alternative to the special syntax of @typename. Continuing the example:
; u = mkctype_ptr(mkctype_int());
int *
; t == u;
1
A type name determines the abstract structure of a type, but not the concrete encoding of a value of the type in an address space. Here, the name int* alone could very well describe a 64-bit pointer to a 64-bit integer encoded in big-endian order, or a 32-bit pointer to a 32-bit integer encoded in little-endian order, or some other variation. Such ctypes are said to be undefined:
; sizeof(t);
error: attempt to determine size of undefined type
; sizeof(subtype(t));
error: attempt to determine size of undefined type
Type definitions are ctypes that extend the abstract structure of type names with concrete information about encoding and substructure. They are created through the use of optional arguments to the ctype constructors:
; u = mkctype_ptr(mkctype_int(), cqct`Ru64le);
int *
; sizeof(u);
8
The optional second argument to mkctype_ptr gives the representation of the type. The value cqct`Ru64le is one of several constants defined in the built-in cqct name space. In this example, we have specified that the pointer value is represented as an unsigned 64-bit little-endian integer, a specification sufficient to read or write the value of such a pointer. Representation constants are defined for common integer, pointer, and float encodings (see mkctype_base).
Although type names and type definitions may be related, they are not equivalent ctypes:
; t == u;
0
The function typename maps a type definition to its name, essentially stripping it of all defining information.
; t == typename(u);
1
In the previous definition of u, we created a type definition for the pointer type, but left its subtype undefined:
; u = mkctype_ptr(mkctype_int(), cqct`Ru64le);
int *
; sizeof(u);
8
; sizeof(subtype(u));
error: attempt to determine size of undefined type
This is not a particularly useful type definition. While it provides sufficient information to access the pointer value, it provides no information about how to dereference this pointer, because the type does not define the concrete encoding of the pointed-to int value. We can fix this by adding a representation argument to mkctype_int:
; u = mkctype_ptr(mkctype_int(cqct`Rs64le), cqct`Ru64le);
int *
; sizeof(u);
8
; sizeof(subtype(u));
8
A ctype is fully defined if it is a type definition whose constituent subtypes, if any, are also fully defined. Of all the preceding definitions of u, only the final one is fully defined.
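The recursive shape of this definition can be sketched in C. This is only a model under stated assumptions: struct ctype_model, its has_rep flag, and fully_defined are hypothetical names, not part of L1, and the model covers only types that carry a representation and at most one subtype (such as base and pointer types).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a ctype: whether it has a concrete
   representation, plus an optional subtype. */
struct ctype_model {
    bool has_rep;                       /* false models cqct`Rundef */
    const struct ctype_model *subtype;  /* NULL if there is no subtype */
};

/* A type is fully defined if it is itself a type definition and
   its constituent subtype, if any, is also fully defined. */
bool fully_defined(const struct ctype_model *t)
{
    assert(t != NULL);
    if (!t->has_rep)
        return false;
    if (t->subtype == NULL)
        return true;
    return fully_defined(t->subtype);
}
```

In this model, a pointer with a representation but an undefined int subtype is not fully defined, which mirrors the intermediate definition of u above.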
A type name may be used to look up a type definition in a name space.
Fortunately, most Cinquecento programs do not directly work with these details. Name spaces provide definitions of types, and most types come along for the ride when symbols are referenced.
- void;
- base types with the same representation;
- typedefs with equal identifiers;
- structs, unions, or enums with equal tags;
- pointer or undefined types with equalctype subtypes;
- array types with equal element counts and equalctype subtypes;
- function types with equalctype return types, parameter lists of the same length, and parameters with corresponding equalctype types;
- bitfield types with equal widths and equalctype container types; or
- enumeration constant types with equalctype subtypes.
The following table summarizes the functions that manipulate each ctype variant. The predicates determine the variant of a ctype. Each predicate takes one argument, a ctype, and returns 1 if the predicate is true for that ctype, 0 otherwise. With the exception of issu, isstruct, and isunion, the predicates are mutually exclusive; issu returns true if its argument is a structure or union aggregate, isstruct returns true if its argument is a structure, and isunion returns true if its argument is a union. The accessors return attributes of a ctype instance in each category; the constructors construct new instances of each category. Descriptions of the accessors and constructors follow the table.
Type category | Predicate | Accessors | Constructors |
---|---|---|---|
Aggregate | isstruct issu isunion | fields suetag susize | mkctype_struct mkctype_union |
Array | isarray | arraynelm subtype | mkctype_array |
Base C type | isbase | basebase baseid baserep | mkctype_base mkctype_bool mkctype_char mkctype_short mkctype_int mkctype_long mkctype_vlong mkctype_uchar mkctype_ushort mkctype_uint mkctype_ulong mkctype_uvlong mkctype_float mkctype_double mkctype_ldouble mkctype_complex mkctype_doublex mkctype_ldoublex |
Bitfield | isbitfield | bitfieldcontainer bitfieldpos bitfieldwidth | mkctype_bitfield |
Enumeration | isenum | enumconsts suetag subtype | mkctype_enum |
Enumeration constant | isenumconst | subtype | mkctype_const |
Function | isfunc | params rettype | mkctype_fn |
Pointer | isptr | subtype | mkctype_ptr |
Typedef | istypedef | typedefid typedeftype | mkctype_typedef |
Undefined | isundeftype | subtype | mkctype_undef |
Void | isvoid | n/a | mkctype_void |
The following functions are the ctype accessors.
The following functions are the ctype constructors.
Returns a new ctype representing the base C type corresponding to base, which may be any one of the following base type specifiers defined in the cqct name space:
Vbool | _Bool |
Vchar | char |
Vshort | short |
Vint | int |
Vlong | long |
Vvlong | long long |
Vuchar | unsigned char |
Vushort | unsigned short |
Vuint | unsigned int |
Vulong | unsigned long |
Vuvlong | unsigned long long |
Vfloat | float |
Vdouble | double |
Vldouble | long double |
Vcomplex | _Complex |
Vdoublex | double _Complex |
Vldoublex | long double _Complex |
Rep may be any one of the following representation specifiers defined in the cqct name space:
Rundef | undefined representation |
Ru08le | 8-bit unsigned little endian |
Ru16le | 16-bit unsigned little endian |
Ru32le | 32-bit unsigned little endian |
Ru64le | 64-bit unsigned little endian |
Rs08le | 8-bit signed little endian |
Rs16le | 16-bit signed little endian |
Rs32le | 32-bit signed little endian |
Rs64le | 64-bit signed little endian |
Ru08be | 8-bit unsigned big endian |
Ru16be | 16-bit unsigned big endian |
Ru32be | 32-bit unsigned big endian |
Ru64be | 64-bit unsigned big endian |
Rs08be | 8-bit signed big endian |
Rs16be | 16-bit signed big endian |
Rs32be | 32-bit signed big endian |
Rs64be | 64-bit signed big endian |
Rf32 | 32-bit floating point |
Rf64 | 64-bit floating point |
Rf96 | 96-bit floating point |
Rf128 | 128-bit floating point |
Rx64 | 64-bit complex floating point |
Rx128 | 128-bit complex floating point |
Rx192 | 192-bit complex floating point |
Rep defaults to Rundef.
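To make the integer representation names concrete, the following C sketch decodes a byte sequence according to a width and byte order, as an Ru32le or Rs16be representation would require. The function names are hypothetical, and the sketch assumes that the signed representations are two's complement, which the table does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned integer stored in nbytes bytes (1 to 8) at buf.
   big_endian selects the "be" encodings; otherwise the "le" ones. */
uint64_t decode_unsigned(const unsigned char *buf, size_t nbytes, int big_endian)
{
    uint64_t v = 0;
    size_t i;
    assert(buf != NULL && nbytes >= 1 && nbytes <= 8);
    for (i = 0; i < nbytes; i++) {
        /* visit bytes from most significant to least significant */
        size_t idx = big_endian ? i : nbytes - 1 - i;
        v = (v << 8) | buf[idx];
    }
    return v;
}

/* Signed variant: sign-extend from the top bit of the encoded width
   (assumes two's complement). */
int64_t decode_signed(const unsigned char *buf, size_t nbytes, int big_endian)
{
    uint64_t v = decode_unsigned(buf, nbytes, big_endian);
    if (nbytes < 8 && (v & ((uint64_t)1 << (8 * nbytes - 1))))
        v |= ~(uint64_t)0 << (8 * nbytes);
    return (int64_t)v;
}
```

The same four bytes therefore yield different values under Ru32le and Ru32be, which is why a type name alone is not enough to read a value from an address space.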
Records
Records allow the creation of new aggregate types with named fields. A record descriptor is a Cinquecento value that defines a particular record type; it associates a record type name with the definition of the record type. A record is an instance of a record type. It has an associated record descriptor and bindings for the fields defined by the record descriptor.
For example, the expression:
@record myrec { field1, field2, field3 };
defines a new record type, named myrec. The record type has three fields, named field1, field2, and field3. The value of the @record expression is the resulting record descriptor.
In addition, evaluation of the @record expression implicitly defines two functions, named myrec and ismyrec, in the innermost level of lexical scope.
The function ismyrec is a predicate for instances of myrec. Like the type predicates, it accepts a single argument and returns 1 if the argument is an instance of myrec, and 0 otherwise.
The function myrec is the constructor for instances of the myrec type. For example:
x = myrec("abc", 1, @lambda(x){ printf("hi\n"); });
binds x to a new instance of the myrec type.
The function myrec expects either zero arguments, in which case the fields of the new instance are bound to nil, or exactly as many arguments as there are fields in the type, in which case the arguments are bound to the fields in the order they were given in the @record definition.
The dot operator provides access to the fields of a record instance:
printf("%a", x.field1);  // print x's field1
x.field1 = "bcd";        // update x's field1
x.field2 += 5;           // update x's field2
++x.field2;              // update x's field2
x.field3("blah");        // call the value of x's field3
The following routines operate on records and record descriptors.
Record descriptors have an associated format procedure that is implicitly called when the %a conversion of the formatted I/O library routines is applied to an instance of the record type. The format procedure takes a single argument, the record instance, and returns a string representing the formatted record.
The following routines access the format procedure associated with a record descriptor.
Several functions are used by the implementation of the record system.
We support general matching on records following a design similar to matching on lists. As an example:
; @record myrec { field1, field2 };
<rd myrec>
; x = myrec(1,2);
<myrec 1031590bc>
; switch (x) { @match myrec(a,2): printf("a = %d\n", a); break; }
a = 1
Record patterns match against fields in the order they were defined, and all fields must be present in the pattern. If fields are missing then the pattern fails to match. Alternatively you can match against particular fields by naming those fields; not all fields need to be listed when using this form, and they may be listed in any order. Continuing the previous example:
; switch (x) { @match myrec(field2=2): printf("ok\n"); break; }
ok
; switch (x) { @match myrec(field2=2,field1=y): printf("y=%d\n", y); break; }
y=1
Be warned that an expression that creates a record resembles a function call. If you mistakenly include a function call in a @match, Cinquecento will interpret it as a record pattern and fail to match it.
Container Types
Cinquecento supports three built-in container types: lists, vectors, and tables.
Several functions operate on any type of container.
Containers can be indexed with the C array reference operator ([]). The statement
x = C[i];
reads the ith element from container C. Likewise,
C[i] = x;
assigns the ith element to the value bound to x, and the statements
C[i] += 1;
C[i]++;
both increment the ith element.
Unlike its use on C arrays, the [] operator on Cinquecento containers is not commutative. In an expression C[i], C must be a container and i a valid index for the container. For tables, a valid index is a key.
This and other syntax are supported by two generic functions on containers:
Lists
Lists are variable-length ordered collections, possibly empty. Lists are mutable, allowing items to be added or deleted from any position, and contiguously indexed from position 0. The predicate islist tests whether a value is a list.
Lists can be deconstructed by pattern matching. For lists, legal patterns p are one of the following:
- The empty list []
- A constant-length list where each list element is itself a pattern. Examples of patterns having this form include [1,2]; [x, 2]; [1, _]; [_,x]; and [1,[x, 2]]. Respectively, the first pattern only matches the constant list [1,2]; the second pattern matches any two-element list whose second element is 2, binding x to the first element; the third pattern matches any two-element list whose first element is 1 (and it ignores the second element); the fourth pattern matches any two-element list, binding the second element to x; and the last pattern matches a two-element list whose first element is 1 and whose second element is itself a two-element list, in this case with the second element being 2, binding the first element to variable x.
- A constant-length list where each list element is itself a pattern, as above, except that the last element is a variable followed by an ellipsis (...). In this case, the final variable matches against the tail of the list (which could be empty). Thus, to match against a list of at least length 3 and print the part of the list beyond the first three elements, you would write:
; x = [1,2,3,4,5];
[ 1, 2, 3, 4, 5 ]
; switch (x) { @match [_,_,_,t...] : printf("match: t = %a\n",t); }
match: t = [ 4, 5 ]
Vectors
Vectors are fixed-length ordered collections, possibly empty. Vectors are mutable, allowing items to be changed at any position, and contiguously indexed from position 0. The predicate isvector tests whether a value is a vector.
Tables
Tables store key/value pairs. Keys and values may be any type of Cinquecento value other than nil. The predicate istable tests whether a value is a table.
Every table has a default value. The default value is set by the table constructors mktab, mktabqv, and mktabq, and determines the result of calls to tablook on undefined keys.
Searches tab for a pair whose key is key and returns the associated value.
If tab has no such pair, then the effect and return value of tablook is determined by the default value of the table in the following way.
If the default value is not nil and not a function, then a new pair is inserted into the table whose key is key and whose value is the default value, and the default value is returned.
If the default value is a function, then a new pair is inserted into the table whose key is key and whose value is the result of calling the function on zero arguments, and this result is returned.
Otherwise (the default value is nil), no modification to the table is performed and nil is returned.
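The three cases above can be modeled in C as follows. This is a sketch of the described behavior only, not the L1 implementation: the table is reduced to a small array of int keys and values, and struct table_model, tablook_model, and fresh_counter are hypothetical names. A return value of 0 stands in for a nil result.

```c
#include <assert.h>
#include <stddef.h>

enum defkind { DEF_NIL, DEF_VALUE, DEF_FUNC };

#define MAXPAIRS 16

/* Hypothetical table model with int keys and values. */
struct table_model {
    int keys[MAXPAIRS];
    int vals[MAXPAIRS];
    size_t n;             /* number of pairs stored */
    enum defkind kind;    /* what sort of default value the table has */
    int defval;           /* used when kind == DEF_VALUE */
    int (*deffn)(void);   /* used when kind == DEF_FUNC */
};

/* Example default function: returns 100, 101, ... on successive calls. */
int fresh_counter(void)
{
    static int next = 100;
    return next++;
}

/* Returns 1 and stores the value in *out, or returns 0 to model nil. */
int tablook_model(struct table_model *t, int key, int *out)
{
    size_t i;
    assert(t != NULL && out != NULL);
    for (i = 0; i < t->n; i++) {
        if (t->keys[i] == key) {
            *out = t->vals[i];
            return 1;
        }
    }
    if (t->kind == DEF_NIL)
        return 0;                       /* table is not modified */
    assert(t->n < MAXPAIRS);
    /* a function default is called on zero arguments; its result is stored */
    *out = (t->kind == DEF_FUNC) ? t->deffn() : t->defval;
    t->keys[t->n] = key;
    t->vals[t->n] = *out;
    t->n++;
    return 1;
}
```

Note that in the non-nil cases a lookup is also an insertion, so a second lookup of the same key finds the stored pair and does not call the default function again.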
We can also match against tables, using patterns of the form [ k:p ] where k must be a constant (containing no variables) and p is an arbitrary pattern. Such a pattern matches if the table being matched against contains the key k which maps to a value that matches against pattern p, in which case any variables in p are bound in the adjacent code. For example:
; x = [1:"hello", 3:4];
<table>
; switch (x) { @match [1:v] : printf("match: v = %a\n",v); }
match: v = "hello"
Table patterns can have multiple key-value pairs in them, similarly to lists. Since table patterns do not require exact matches (i.e., a pattern matches if the table contains the specified keys but there could be other key-value pairs in the table) there is no need for the ... pattern for tables. Keys are not permitted to be pattern variables; this would be too inefficient and ambiguous.
Characters
ASCII characters are represented as cvalues of type char or unsigned char.
The following functions operate on characters.
Strings
The Cinquecento string type represents an array of bytes. Cinquecento strings have fixed length (possibly zero), are indexed from position 0, and are not null terminated. Strings are mutable, but do not share storage with other strings. The maximum string length is 2^64-1. The predicate isstring tests whether a value is a string.
The C relational operators (==, !=, <, >, <=, >=) are defined over strings. They compare the contents of the strings. There is no way to test whether two string instances are the same object.
The binary C operator + concatenates strings.
The following functions comprise the basic string operations. Note that while some of these functions share names with C standard library functions, they are equivalent to these functions only when their string operands do not contain NUL bytes.
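The caveat about NUL bytes can be illustrated in C. The sketch below compares two byte strings by explicit length, in the way a length-carrying string type must, and contrasts it with the C library's strcmp. The name bytes_compare is hypothetical, and the ordering rule (bytewise, with a proper prefix sorting first) is an assumption; the text above says only that the contents are compared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare two byte strings that carry explicit lengths.
   Returns a negative, zero, or positive value. */
int bytes_compare(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int c;
    assert(a != NULL && b != NULL);
    c = memcmp(a, b, n);        /* compares all n bytes, NULs included */
    if (c != 0)
        return c;
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;
}
```

For the three-byte strings "a\0b" and "a\0c", bytes_compare reports a difference, while strcmp stops at the first NUL byte and reports equality.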
Pairs
A pair is an object that holds references to two other Cinquecento values, called its car and cdr. It is akin to the fundamental cons cell data structure of Lisp-based languages.
Both pairs and the list type can be used to build lists. The primary difference is that a list built from the list type is itself mutable, i.e., it can be updated in place by imperative operations such as append and delete. Lists based on pairs, in contrast, are typically manipulated by functional operations in the style of Lisp lists.
The conventional value for the empty list in Cinquecento is nil. It is an error to call container and list functions such as isempty, length, and listref on a list built from pairs and nil.
Note that the list constructor syntax [...] is used to build instances of the list type, not pair-based lists.
The predicate ispair tests whether a value is a pair.
The functions that operate on pairs are:
A weak pair is a pair whose car is a weak pointer, a pointer that is not followed by the collector. An object that is reachable only through weak pointers will be eventually reclaimed by the collector, at which time its references will be replaced with the value nil. A weak pair is otherwise indistinguishable from a pair. The predicate isweakpair tests whether a value is a weak pair.
I/O
Cinquecento supports I/O backed by regular files, network connections, or user-defined functions. Such I/O resources are represented by an instance of the file descriptor type. The predicate isfd tests whether a value is a file descriptor.
When the storage for a regular file or network connection descriptor is collected, the system implicitly closes the underlying system file resource. (This is not true for descriptors backed by user-defined functions.)
Filesystem Functions
This section documents functions that operate on the file system.
Formatted Output
Cinquecento provides several functions for C-style formatted output to various destinations.
Each function expects a format string argument. The format string may include conversion specifications that are similar (but not identical) to those handled by C stdio library functions. Each conversion specification begins with the % character, optionally followed by zero or more modifier characters, optionally followed by a decimal width specifier, optionally followed by the . character and a decimal precision specifier, finally followed by a single verb character. The accepted modifier characters are +, -, 0, #, and a single space. The modifier characters and the width and precision specifiers have the same effects they have in C. In addition, the modifiers l, L, and h are accepted but have no effect; unlike the C stdio formatter, Cinquecento can determine the size of an integer operand from its type.
The conversions are processed in order; each operates in turn on the remaining values passed to the format function.
The recognized verb characters are:
a | Formats any type of value in an implementation-defined format. |
c | Formats a character value as an ASCII character (if printable) or in octal. |
d,o,x,X,u | Formats an integer or pointer value in signed decimal, octal, lowercase hex, uppercase hex, or unsigned decimal. |
e | Formats a value of an enumerated type as the name of a constant having the same value. If more than one constant has the same value, one constant is picked in an undefined manner. If no constant has the value, the value is formatted as if the d verb had been used. |
p | Formats a pointer value in lowercase hex. |
s,b,B | Formats strings. If the value is not a string, then the result of calling stringof on the value is used. s copies the string value, stopping at the first null byte. b is like s, except the entire string is printed, including null bytes. B is like b, except unprintable characters are formatted as octal literals and " and \ are escaped. |
t | Formats a ctype. The value may be in one of two forms: (1) a ctype, in which case it is formatted as a conventional C type specification, or (2) a vector, representing a C symbol, of the form [ T, ID, ... ], where T is a ctype and ID is a string or nil, in which case the symbol is formatted as a conventional (possibly abstract) C declaration. |
y | Formats an integer or pointer value as "symbol+offset". Symbol is the name of the symbol returned by a call to the name space lookaddr operation on the value. Offset is the difference between the value and the location of the symbol returned by lookaddr. If the difference is zero, the offset is suppressed. |
The formatted output functions are:
Quotation, Compilation, and Evaluation
L1 provides access to some stages of Cinquecento expression evaluation. The details of these interfaces are likely to change.
If e is a literal Cinquecento expression, then the expressions
eval("e")
compile(parse("e"))()
compile(@quote { e })()
Storage Management
Storage in the L1 implementation of Cinquecento is managed by a generational stop-the-world copying collector. Its design draws on published descriptions of storage management in Chez Scheme [CSUG, BiBOP, Guardians].
User functions related to storage management are:
At the time it is passed to the finalizer, the value v is unreachable but otherwise has no special status. In particular, the finalizer may reintroduce references to the value, e.g. by storing it in a hash table. The finalizer will not be called again for such a reincarnated value; a new call to finalize must be arranged.
Only one finalizer can be associated with a value at a time; finalize replaces the previously registered finalizer, if any. To disassociate a finalizer from a value v, call finalize(v, nil).
When shutting down the last VM (via cqctfreevm), the Cinquecento system performs a final collection and calls finalizers for unreachable values. However, finalizers for objects that are still reachable at shutdown may not be called.
Instrumentation
The function statistics provides access to internal instrumentation of the L1 runtime.
Returns a table containing instrumentation data for the current L1 execution. Each key/value pair of the table is a separate instrumentation report. Keys are cids, identifying the form of instrumentation. The type of the corresponding value depends on the instrumentation. Instrumentation reports include:
exetime | Elapsed wall clock time in microseconds spent in L1 execution, including calls to foreign functions, system calls, collection and other storage management. |
collectime | The portion of exetime, in microseconds, spent in the L1 collector. |
System Functions
This section documents various system-level functions.
"L1 version commit date"
Executes a command in a new process with I/O connected to the calling process. The arguments arg ... form the elements of the argument vector used to create (via execve) the new process image. There is no way to affect its environment. An error is raised if the process cannot be created.
By default, popen returns a list of three file descriptors connected (via unix domain sockets in stream mode) to the standard input, standard output, and standard error, respectively, of the new process. Flags is a bitmask of flags used to specify alternative plumbing. (We show internal names for these flags, but these are not defined in any library.) If bit zero is set (PopenNoErr), standard error is redirected to /dev/null, and is not represented in the returned descriptor list. If bit one is set (PopenStderr), standard error is not redirected, but rather is left connected to the standard error of the calling process. Likewise, if bit two is set (PopenStdout), standard output is left connected to the standard output of the calling process. Finally, if bit three is set (PopenFullDuplex), standard input and output of the new process are connected to the same file descriptor.
The definition of popen is likely to change in the future.
Compression and Hashing
This section documents functions related to compression and hashing.
Standard Libraries
The L1 implementation of Cinquecento comes with several standard libraries stored in the lib/ subdirectory. Since they are included in l1's load path, these libraries are loaded via the command:
In this document, the APIs for many of the provided libraries are specified.
L1 Executable
The L1 executable is a command-line program that provides an interactive read-eval-print-loop interface to Cinquecento. It is built by default when the L1 source distribution is compiled.
The load path
The system load path is a list of paths that is searched in order for any input file that is included with the form:
The load path may be inspected and changed from within Cinquecento with the functions loadpath and setloadpath.
The default load path consists of one path: L1PATH/lib, where L1PATH is an absolute path to the directory containing the L1 executable. L1 attempts to set L1PATH to the directory in which the actual L1 executable resides, even if it was invoked through a symbolic link.
Two L1 command-line options affect the load path. First, each occurrence of a command line option of the form
prepends path to the load path. Each occurrence is processed in the order in which it appears. Second, if the -s option is given, then the default load path is not included in the load path; instead, the load path consists only of paths specified with the -l option.
Two additional mechanisms override the use of L1PATH/lib as the default path:
- When L1 is built, if the preprocessor variable LIBDIR is defined for the compilation of the l1 driver (main.c), then the default load path is initialized to its value, which must be a string containing a colon-separated list of directories.
- When L1 is executed, if the environment variable L1LIBPATH is defined, then its value, which must be a colon-separated list of directories, is appended to the path specified by LIBDIR, if any.
If neither LIBDIR (at compile time) nor L1LIBPATH (at run time) is defined, then the default load path is set to the list containing the single directory L1PATH/lib, as described above.
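The rules in this section can be combined into one C sketch that computes the resulting load path as a colon-separated string. This is a reading of the text above, not the code in main.c: build_loadpath and path_append are hypothetical names, and the arguments stand in for the -l options in command-line order, the -s flag, LIBDIR, L1LIBPATH, and L1PATH.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append dir to the colon-separated list held in buf (capacity cap). */
static void path_append(char *buf, size_t cap, const char *dir)
{
    size_t len = strlen(buf);
    assert(len + strlen(dir) + 2 <= cap);
    snprintf(buf + len, cap - len, "%s%s", len ? ":" : "", dir);
}

/* lopts holds the -l arguments in the order given; libdir and
   l1libpath are NULL when LIBDIR or L1LIBPATH is not defined. */
void build_loadpath(char *buf, size_t cap,
                    const char **lopts, size_t nlopts, int sflag,
                    const char *libdir, const char *l1libpath,
                    const char *l1path)
{
    size_t i;
    assert(buf != NULL && cap > 0 && l1path != NULL);
    buf[0] = '\0';
    /* each -l prepends, so the last -l given ends up first */
    for (i = nlopts; i > 0; i--)
        path_append(buf, cap, lopts[i - 1]);
    if (sflag)
        return;                         /* -s: omit the default load path */
    if (libdir == NULL && l1libpath == NULL) {
        char def[256];
        snprintf(def, sizeof def, "%s/lib", l1path);
        path_append(buf, cap, def);     /* default: L1PATH/lib */
        return;
    }
    if (libdir != NULL)
        path_append(buf, cap, libdir);
    if (l1libpath != NULL)
        path_append(buf, cap, l1libpath);   /* appended after LIBDIR */
}
```

Under this reading, the options -l /a -l /b with no overrides and an L1 executable in /opt/l1 give the search order /b, /a, /opt/l1/lib.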
The prelude
Before the first evaluation of a user expression, the L1 evaluator loads the prelude by implicitly evaluating the form
The library distributed with L1 includes the version of prelude.cqct that the evaluator expects to load with this evaluation. This file defines an increasing portion of the default top-level environment of the Cinquecento evaluator, including many functions, name spaces, and other top-level variables described in this manual.
The prelude can be suppressed by passing the -s option to L1.
C Interface
The following functions comprise the C interface to the L1 implementation of Cinquecento. The interface is defined in cqct.h. For an example of how to use the interface, see the file main.c in the L1 implementation, which implements the L1 read-eval-print loop. The functions are described below in roughly the order in which a typical program would call them.
The interface operates on values of several opaque types.
Toplevel | Toplevel environment |
Val | Cinquecento value |
VM | Evaluator for compiled Cinquecento expressions |
Xfd | File descriptor representation |
A Val is a pointer to a structure allocated and managed by the L1 storage manager. Its fields should not be accessed. Since the L1 storage manager is based on a copying garbage collector, the value of a Val (i.e., the address of the referenced structure) may change across calls to functions that invoke the L1 evaluator, including cqctcompile, cqctcallfn, cqctcallthunk, and cqctcalleval. The function cqctgcpersist can be called to lock the location in memory of an individual Val, but this operation creates storage and run-time overhead for the storage manager.
A Val can represent any type of Cinquecento value. The Qkind type defines enumeration constants for each valid type:
typedef enum {
    Qundef = 0,  /* the undefined value (private) */
    Qnil,        /* the nil value */
    Qas,         /* address space */
    Qbox,        /* box (private) */
    Qcl,         /* closure */
    Qcode,       /* code object (private) */
    Qcval,       /* cvalue */
    Qdom,        /* domain */
    Qexpr,       /* syntax */
    Qfd,         /* file descriptor */
    Qlist,       /* list */
    Qns,         /* name space */
    Qpair,       /* pair (unused) */
    Qrange,      /* range */
    Qrd,         /* record descriptor */
    Qrec,        /* record */
    Qstr,        /* string */
    Qtab,        /* table */
    Qvec,        /* vector */
    Qxtn,        /* ctype */
    Qnkind       /* invalid (# of kinds) */
} Qkind;
Use the Vkind macro to determine the type of a Val:
An Xfd is a structure whose fields define an interface to an open file descriptor. The Cinquecento initializer, cqctinit, uses its Xfd parameter to override the default destination of standard I/O.
typedef struct Xfd Xfd;
struct Xfd {
    uint64_t (*read)(Xfd*, char*, uint64_t);
    uint64_t (*write)(Xfd*, char*, uint64_t);
    void (*close)(Xfd*);
    int fd;
};
Each function is passed the xfd from which it was referenced. The fd is a convenience for the user; the system does not use its value. The other arguments to read and write are a pointer to the data buffer and the number of bytes to be transferred; these functions should return the number of bytes actually transferred or (uint64_t)-1 on error. The function close is called when the system decides to close the file descriptor. Any of the functions in an Xfd may be NULL; if the system needs to call a NULL function, it becomes a no-op.
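As an illustration of this contract, the following C sketch fills in an Xfd whose write function captures output in a fixed memory buffer. The struct is reproduced from the text so that the sketch stands alone; a real program would take it from cqct.h. The capture buffer and the names cap_write, cap_close, cap_length, cap_contents, and make_capture_xfd are hypothetical, and how the resulting Xfd is passed to cqctinit is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reproduced from the text; normally provided by cqct.h. */
typedef struct Xfd Xfd;
struct Xfd {
    uint64_t (*read)(Xfd*, char*, uint64_t);
    uint64_t (*write)(Xfd*, char*, uint64_t);
    void (*close)(Xfd*);
    int fd;
};

#define CAPSIZE 256
static char capbuf[CAPSIZE];   /* hypothetical capture buffer */
static uint64_t caplen;        /* bytes captured so far */

/* Copy n bytes into the buffer; report (uint64_t)-1 if they do not fit. */
uint64_t cap_write(Xfd *xfd, char *data, uint64_t n)
{
    (void)xfd;
    assert(data != NULL || n == 0);
    if (n > CAPSIZE - caplen)
        return (uint64_t)-1;
    memcpy(capbuf + caplen, data, (size_t)n);
    caplen += n;
    return n;
}

/* Discard the captured bytes when the system closes the descriptor. */
void cap_close(Xfd *xfd)
{
    (void)xfd;
    caplen = 0;
}

uint64_t cap_length(void) { return caplen; }
const char *cap_contents(void) { return capbuf; }

/* read is left NULL; per the text, a NULL function becomes a no-op. */
Xfd make_capture_xfd(void)
{
    Xfd x;
    x.read = NULL;
    x.write = cap_write;
    x.close = cap_close;
    x.fd = -1;   /* unused by the system; a convenience for the user */
    return x;
}
```

A host program could use a descriptor like this to collect evaluator output for later inspection instead of sending it to the process's standard output.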
typedef enum Cbase {
    Vundef=0,
    Vchar,
    Vshort,
    Vint,
    Vlong,
    Vvlong,
    Vuchar,
    Vushort,
    Vuint,
    Vulong,
    Vuvlong,
    Vfloat,
    Vdouble,
    Vlongdouble,
    Vcomplex,
    Vdoublex,
    Vlongdoublex,
    Vnbase,
    Vptr = Vnbase,
    Vvoid,
    Vnallbase,
} Cbase;
Bibliography
[SICP] | Harold Abelson and Gerald Jay Sussman with Julie Sussman. Structure and Interpretation of Computer Programs (2nd Edition). MIT Press, 1996. |
[CSUG] | R. Kent Dybvig. The Chez Scheme Version 8 User's Guide. Cadence Research Systems, 2009. |
[TSPL] | R. Kent Dybvig. The Scheme Programming Language, 4th edition. MIT Press, 2009. |
[BiBOP] | R. Kent Dybvig, David Eby, and Carl Bruggeman. Don't stop the BiBOP: Flexible and efficient storage management for dynamically typed languages. Indiana University technical report #400, March 1994. |
[Guardians] | R. Kent Dybvig, David Eby, and Carl Bruggeman. Guardians in a generation-based collector. ACM SIGPLAN 1993 Conference on Programming Language Design and Implementation, 207-216, June 1993. |
[Little] | Daniel P. Friedman and Matthias Felleisen. The Little Schemer (4th Edition). MIT Press, 1995. |
[H&S] | Samuel P. Harbison III and Guy L. Steele Jr. C: A Reference Manual (5th Edition). Prentice Hall, 2002. |
[K&R] | Brian W. Kernighan and Dennis M. Ritchie. The C Programming Language (2nd Edition). Prentice Hall, 1988. |
Syntax Index
C Interface Index
cqctcallthunk
cqctcompile
cqctcstrnval
cqctcstrnvalshared
cqctcstrval
cqctcstrvalshared
cqctenvbind
cqctenvlook
cqcteval
cqctfaulthook
cqctfini
cqctfreevm
cqctgcpersist
cqctgcprotect
cqctgcenable
cqctgcdisable
cqctgcunpersist
cqctgcunprotect
cqctinit
cqctint8val
cqctint16val
cqctint32val
cqctint64val
cqctinterrupt
cqctlength
cqctlistappend
cqctlistref
cqctlistset
cqctlistvals
cqctmkfd
cqctmklist
cqctmkrange
cqctmkvec
cqctmkvm
cqctrangebeg
cqctrangelen
cqctsprintval
cqctuint8val
cqctuint16val
cqctuint32val
cqctuint64val
cqctvalcbase
cqctvalcstr
cqctvalcstrlen
cqctvalcstrshared
cqctvalint8
cqctvalint16
cqctvalint32
cqctvalint64
cqctvaluint8
cqctvaluint16
cqctvaluint32
cqctvaluint64
cqctvecref
cqctvecset
cqctvecvals
Function Index
access
append
apply
applyk
arraynelm
backtrace
basebase
baseid
baserep
bitfieldcontainer
bitfieldpos
bitfieldwidth
bsearch
callcc
car
cdr
chdir
cid2str
close
cntrget
cntrput
compact
compile
concat
cons
copy
count
cvalcmp
cwd
delete
delq
domof
enumconsts
enumsym (ns method)
enumtype (ns method)
environ
eq
equal
equalctype
equalcval
eqv
eqvctype
eqvcval
errno
error
eval
evalk
exit
fault
fdname
fdopen
fieldattr
fieldid
fieldoff
fields
fieldtype
finalize
foreach
fork
fprintf
fread
gc
gclock
gcstats
gcunlock
get (as method)
getbytes
getenv
getpid
getpeername
getsockname
gettimeofday
hash
hashctype
hashcval
hashqv
hashqv
hashqvctype
hashqvcval
head
index
inflate
inflatezlib
ioctl
isalnum
isalpha
isarray
isas
isascii
isbase
isbitfield
isblank
iscid
iscntrl
isctype
iscvalue
isdigit
isdom
isempty
isenum
isenumconst
isfd
isfunc
isgraph
islist
islower
ismapped (function)
ismapped (as method)
ismember
isnil
isns
isodigit
ispair
isprint
isprocedure
isptr
ispunct
isrange
isrd
isrec
isrinr
isspace
isstring
isstruct
issu
istable
istypedef
isundeftype
isunion
isupper
isvector
isvoid
isweakpair
isxdigit
l1version
length
list
listdel
listins
listref
listset
lookaddr (ns method)
lookfield
looksym (ns method)
looksym (function)
looktype (ns method)
looktype (function)
malloc
map (function)
map (as method)
mapfile
memcpy
memset
mkcid
mkctype_array
mkctype_base
mkctype_bitfield
mkctype_bool
mkctype_char
mkctype_complex
mkctype_const
mkctype_double
mkctype_doublex
mkctype_enum
mkctype_float
mkctype_fn
mkctype_int
mkctype_ldouble
mkctype_ldoublex
mkctype_long
mkctype_ptr
mkctype_short
mkctype_struct
mkctype_typedef
mkctype_uchar
mkctype_uint
mkctype_ulong
mkctype_undef
mkctype_union
mkctype_ushort
mkctype_uvlong
mkctype_vlong
mkctype_void
mkdir
mkdom
mkfd
mkfield
mkguardian
mklist
mkmas
mkmasx
mknas
mkns
mknsraw
mkparam
mkrange
mkrd
mksas
mkstr
mksym
mktab
mktabq
mktabqv
mkvec
mkzas
myrootns
nameof
nsof
nsptr
open
paramid
params
paramtype
parse
pop
popen
postgc
put (as method)
printf
push
putbytes
rangebeg
rangelen
rand
randseed
rdis
rdfields
rdfmt
rdgettab
rdmk
rdname
rdof
rdsetfmt
rdsettab
read
resettop
rettype
reverse
seek
select
setcar
setcdr
setenv
setloadpath
setname
settypedeftype
sha1
slice
sockpair
sort
split
sprintfa
statistics
stringof
strcmp
strlen
strput
strref
strstr
strton
suattr
substr
subtype
suekind
suetag
susize
symattr
symid
symoff
symtype
tabdelete
tabenum
tabinsert
tabkeys
tablook
tabvals
tail
tcpaccept
tcplisten
tcpopen
tolower
toupper
typedefid
typedeftype
typename
uname
unlink
vecref
vecset
vector
waitpid
weakcons
write