Cinquecento Debug Library

The Cinquecento Debug Library

Introduction

This manual describes how to use the Cinquecento debug library to examine program executables and program executions.

The debug library is a client of sctl, a server for controlling program executions and for accessing symbols and other debugging information from program binaries. To use this library you must have the sctl binary installed in your path; source code and documentation for sctl are available at cqctworld.org.

Keep in mind that the syntax and abstractions of the Cinquecento language are intended to support a variety of approaches to programmatic debugging. The debug library represents just one approach — an effective but rather traditional one.

This manual is split into two parts. The first part is a user guide, providing a conceptual overview of how the library is used and illustrations of elementary operation. The second part is a reference manual for the functions and methods provided by the library.

While this manual focuses on debugging program executions, the debug library also includes functions for constructing standalone name spaces served by sctl. Readers seeking that functionality should skip ahead.

User guide

Overview

The debug library is designed to target a spectrum of program executions — from one instance of a single-threaded program, to one or more simultaneously active instances of one or more multi-threaded programs.

This flexibility stems from a Cinquecento object, called a ctl, that represents a thread of execution. Generally there is one ctl for each thread in each program execution being debugged. The debug library maps operations on each ctl to messages that it exchanges with the sctl server. This mapping is performed by a sctlmux, an object that multiplexes a set of ctls to a remote sctl. A typical Cinquecento program that uses the debug library creates one sctlmux to manage any number of ctls.

Ctls are created by launching a new program execution or attaching to an existing one; they are also created when an execution associated with an existing ctl forks a new process or creates a new thread. Ctls are destroyed when the associated thread or process is terminated.

Ideally, we think of a ctl as a sort of slider that can be positioned at any instant along the interval of time in which the corresponding program executes. Wherever it is positioned, the ctl reflects the state of the target execution at that point in time. This state is decomposed into three major logical parts: the context representing the state of the CPU registers, the address space representing the raw contents of memory, and the name space map recording the binaries (executable and libraries) that are mapped into the address space.

In reality, the debug library does not support a seekable model of execution. Instead, each ctl logically advances in lockstep with the corresponding execution. Snapshots provide a mechanism to make a copy of a ctl frozen in time, preserving the past while execution continues.

Each ctl is an interface for setting traps on future execution events, including execution breakpoints, library loading and unloading, system calls, signals, and termination. Associated with each trap is a user-defined handler function that is called when the event occurs. The handler is passed the ctl that corresponds to the thread and instant of time where the event occurred. Execution of the thread is suspended for the duration of the handler function call. During this time, the handler may examine the state of the execution and set or clear traps on future events, but it may not block awaiting a future execution event. Snapshots may be captured to communicate program state to future handler calls.

Each ctl also provides ingredients for constructing Cinquecento domains, yielding symbolic interfaces to the state of the program during calls to event handlers. The address space of the ctl, provided by the mem method, is an image of the contents of memory; modifications affect future execution. The name space map, provided by the nsmap method, provides a separate name space for each binary mapped into the program at the current instant of execution time. Any of these name spaces can be combined with the address space to form a domain. Such domains are valid only at the instant of execution time at which they are created; they should not be referenced across calls to handlers. A commonly used domain is the executable domain, which overlays the name space of the program executable over the address space. Ctls provide a convenience method, named exe, for constructing this domain.

The context, provided by the ctx method, is a pointer into a specially constructed domain that stores a copy of the registers at the current instant of execution time. The setctx method provides a way to change register values.

Although most of the interfaces of the ctl are designed to be independent of the architecture and operating system of the target execution, some dependencies are unavoidable. The dialect of a ctl collects specifications of these platform-specific dependencies into a single record. It includes structure type definitions for the context and signal event information, and enumeration type definitions for the set of valid system calls and signal numbers. The library implicitly uses the dialect for various purposes, but it is available for general use through the ctl dialect method.

Like many event-based systems, the execution of a Cinquecento program that uses the debug library is controlled by an event loop. This event loop is encapsulated in the sctlmux. The sctlmux receives notification of events from sctl servers and dispatches them to their corresponding handlers. It manages the execution state of each ctl, ensuring that the corresponding execution is paused during a call to an event handler, and resumed when the handler returns.

Most programs that use the debug library share a common structure:

Initialize a sctlmux.
Launch or attach to one or more programs, yielding a corresponding ctl for each thread of execution; the executions are paused.
Define and install handlers for various events on the ctls.
Start the sctlmux event loop, resuming the paused executions.

The sctlmux event loop terminates when all ctls have exited.
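
The dispatch pattern described above can be pictured with a toy model in C. This is only a sketch of the general event-loop idea, not the sctlmux implementation: the Event and Handler types, the event kinds, and the functions countbrk and run are all names invented for this illustration, and events are read from an array rather than received from an sctl server.

```c
#include <assert.h>
#include <stddef.h>

/* Event kinds in this toy model (hypothetical, not the library's set). */
enum { Ebrk, Esyscall, Eexit, Nkind };

typedef struct Event Event;
struct Event {
	int kind;	/* one of the kinds above */
	int thread;	/* index of the thread that raised the event */
};

typedef void Handler(const Event *ev);

static Handler *handlers[Nkind];	/* at most one handler per kind */
static int ncalls;			/* number of handler invocations */

/* Example handler: just count how often it is called. */
static void
countbrk(const Event *ev)
{
	(void)ev;
	ncalls++;
}

/*
 * Dispatch events in order until all nthread threads have exited.
 * A thread is considered paused while its handler runs and resumed
 * when the handler returns. Returns the number of events consumed.
 */
static size_t
run(const Event *ev, size_t nev, int nthread)
{
	size_t i;
	int live;

	live = nthread;
	for(i = 0; i < nev && live > 0; i++){
		assert(ev[i].kind >= 0 && ev[i].kind < Nkind);
		if(handlers[ev[i].kind])
			handlers[ev[i].kind](&ev[i]);
		if(ev[i].kind == Eexit)
			live--;
	}
	return i;
}
```

With one thread, only countbrk installed for Ebrk, and the event sequence brk, syscall, brk, exit, brk, run consumes four events and calls the handler twice; the trailing brk is never dispatched because the loop ends once the last thread has exited.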

Getting started

In the following sections, we illustrate the main features of the library through a series of examples. This requires that we have some sample programs to debug. These samples, as well as the Cinquecento examples, are included in the L1 distribution of Cinquecento; see the README in the demo directory for directions on building and running these examples yourself.

Every Cinquecento program that uses the debug library must arrange to load the library by ensuring that @include <debug.cqct> appears somewhere in the program source.

Here is list, our first sample program:

#include <stdio.h>
#include <stdlib.h>

typedef struct Node Node;
struct Node {
	int v;
	Node *next;
};

static Node*
insert(Node *h, int v)
{
	Node *p, *q;

	p = malloc(sizeof(Node));
	p->v = v;
	p->next = 0;
	if(!h)
		return p;
	q = h;
	while(q->next)
		q = q->next;
	q->next = p;
	return h;
}

static void
dowork(int n)
{
	int i;
	Node *head;
	head = 0;
	for(i = 0; i < n; i++)
		head = insert(head, i);
}

int
main(int argc, char *argv[])
{
	int m;
	m = 5;
	if(argc > 1)
		m = atoi(argv[1]);
	dowork(m);
	return 0;
}

This program features a pointer-based linked list data type called Node, a function called insert that appends new elements to an existing list, a function called dowork that builds a list of consecutive values, and a main function that begins the work. It accepts an optional argument that specifies the number of elements in the final list; the default is five. The program has no output.
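
Because list prints nothing, it can be handy to have a way to render a list when experimenting with the sample outside the debugger. The following C sketch is not part of the sample program: fmtlist is a hypothetical helper, named here for illustration, that formats a list in the same "list = 0 1 2" style used by the transcripts in this guide. The Node type and insert function are repeated from the sample so that the fragment stands alone.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Node Node;
struct Node {
	int v;
	Node *next;
};

/* Same append-at-tail logic as insert in the list sample. */
static Node*
insert(Node *h, int v)
{
	Node *p, *q;

	p = malloc(sizeof(Node));
	assert(p != 0);
	p->v = v;
	p->next = 0;
	if(!h)
		return p;
	q = h;
	while(q->next)
		q = q->next;
	q->next = p;
	return h;
}

/*
 * Hypothetical helper: render the list at h into buf as
 * "list = v0 v1 ...". The text is truncated if buf is too
 * small. Returns buf.
 */
static char*
fmtlist(Node *h, char *buf, size_t len)
{
	size_t off;

	assert(buf != 0 && len > 0);
	snprintf(buf, len, "list =");
	for(; h; h = h->next){
		off = strlen(buf);	/* always < len, so len-off >= 1 */
		snprintf(buf+off, len-off, " %d", h->v);
	}
	return buf;
}
```

Calling fmtlist on an empty list yields "list =", and after inserting 0, 1, and 2 it yields "list = 0 1 2".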

Here is a Cinquecento function that launches this program and waits for it to exit:

@define getstarted()
{
	@local mux;
	mux = mksctlmux();
	ctllaunch(mux, ["./list"]);
	mux.run();
}

The first line of getstarted (after the @local declaration) creates a new sctlmux object, binding it to the variable mux. Mux represents a connection to a fresh sctl server that is launched by mksctlmux on the same host as the Cinquecento program.

The second line of getstarted launches our sample program. Ctllaunch directs the sctl associated with mux to create a new process. The second argument specifies the command this new process should execute, in a manner similar to the list of strings passed to execve on Unix systems. In this case, we are launching the list binary from the current working directory, passing it no arguments.

Ctllaunch returns a ctl representing the new program execution. We conventionally bind this value to a variable named ctl, but since this example does not do anything with this value, we have left it out.

Ctllaunch leaves the new program execution in a stopped state, paused just before execution of its first instruction. Many programs perform a series of initialization functions, such as initializing the dynamic loader, loading and initializing library dependencies, and so on, before reaching the first instruction of their main function. Ctllaunch pauses the new execution truly at the first instruction, before any of this initialization.

The third line of getstarted starts the execution by starting the sctlmux event loop. The run method tells the mux to resume the program launched by ctllaunch and to block until it has exited. If we had launched more than one program with additional calls to ctllaunch, these would all be resumed simultaneously, and the call to run would block until they all exited. If we had registered any handlers for execution events, as we will do in all of the following examples, the sctlmux would dispatch these events in the dynamic extent of the call to run.

; getstarted();
;

Many debugging sessions focus on the code and data of the program executable (rather than its libraries). The executable domain is a domain whose address space is the process memory and whose name space represents the symbols and types of the executable. The exe method of a ctl object returns a copy of the executable domain for the current state of the associated execution.

To reduce boilerplate code in subsequent examples, we will use the following function to launch our sample programs:

@define start(cmd)
{
	@local mux, ctl, exe;
	mux = mksctlmux();
	ctl = ctllaunch(mux, cmd);
	exe = ctl.exe();
	return [mux, ctl, exe];
}

This function:

receives the argument list to pass to ctllaunch as an argument,
binds the ctl returned from ctllaunch to a variable,
obtains the executable domain from the ctl, and
returns three values: the sctlmux, the ctl, and the executable domain.

Setting breakpoints

Traps are set with the trap method of the ctl object. The first argument is a cid that identifies the kind of trap to set. Depending on the kind, additional arguments may follow to specify parameters of the trap. The final argument is a handler function to be called when the event occurs.

The identifier for breakpoints is brk. There is one argument, the address at which to set the breakpoint. Here the function brkentry sets breakpoint traps on the entry points of three functions in list.

@define brkentry()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);
	ctl.trap('brk, &exe`main,
		 @lambda(ctl) { printf("enter main\n"); });
	ctl.trap('brk, &exe`dowork,
		 @lambda(ctl) { printf("enter dowork\n"); });
	ctl.trap('brk, &exe`insert,
		 @lambda(ctl) { printf("enter insert\n"); });
	mux.run();
}

The expression &exe`main and the others like it use the executable domain to obtain the address of functions in the executable. The handlers are called as execution reaches these addresses.

; brkentry();
enter main
enter dowork
enter insert
enter insert
enter insert
enter insert
enter insert
;

As illustrated in the output, brkentry returns only after the call to the run method of mux, which in turn returns when the target program execution terminates.

The function that handles the breakpoint receives as its argument the ctl representing the execution in which the event occurred. Handlers can use this argument to access the state of the execution at the moment of the event.

Accessing registers

The current values of the registers are available in a value called a context, which is returned by the ctx method of a ctl.

A context is a pointer to a value of type Ctx, a typedef for a structure type that defines fields for each machine register, plus generic aliases for the program counter (pc), stack pointer (sp), and frame pointer (fp) registers.

@define ctx1()
{
	@local ctl, ctx;
	[_,ctl,_] = start(["./list"]);
	ctx = ctl.ctx();
	printf("pc = %p, sp = %p, fp = %p\n",
	       ctx->pc, ctx->sp, ctx->fp);
}

; ctx1();
pc = 2aaaaaaabaf0, sp = 7fffffffe9e0, fp = 0
;

For each ctl, the debugging library automatically selects a definition of Ctx suitable for the target machine, along with other machine and system dependencies, such as the mapping from system call number to name. All of these dependencies are captured in the dialect of the ctl, which is returned by its dialect method.

The dialect record includes a convenience function called dumpctx that prints the names and values of commonly referenced machine-specific fields of a Ctx. Here is its use on a 64-bit x86 target.

@define ctx2()
{
	@local ctl, ctx;
	[_,ctl,_] = start(["./list"]);
	ctx = ctl.ctx();
	ctl.dialect().dumpctx(ctx);
}

; ctx2();
	  rip  00002aaaaaaabaf0
	  rsp  00007fffffffe9e0
	  rbp  0000000000000000
	  rax  0000000000000000
	  rbx  0000000000000000
	  rcx  0000000000000000
	  rdx  0000000000000000
	  rsi  0000000000000000
	  rdi  0000000000000000
	   r8  0000000000000000
	   r9  0000000000000000
	  r10  0000000000000000
	  r11  0000000000000200
	  r12  0000000000000000
	  r13  0000000000000000
	  r14  0000000000000000
	  r15  0000000000000000
	flags  0000000000000200
;

Ctx is just an ordinary ctype. Its full definition can be revealed with the printtype library function.

@define ctx3()
{
	@local ctl, ctx;
	[_,ctl,_] = start(["./list"]);
	ctx = ctl.ctx();
	printtype(typedeftype(@typeof(*ctx)));
}

; ctx3();
struct Ctx {
	/*@0x0*/	uint64 r15;
	/*@0x8*/	uint64 r14;
	/*@0x10*/	uint64 r13;
	/*@0x18*/	uint64 r12;
	/*@0x20*/	uint64 rbp;
	/*@0x20*/	uint64 fp;
	/*@0x28*/	uint64 rbx;
	/*@0x30*/	uint64 r11;
	/*@0x38*/	uint64 r10;
	/*@0x40*/	uint64 r9;
	/*@0x48*/	uint64 r8;
	/*@0x50*/	uint64 rax;
	/*@0x58*/	uint64 rcx;
	/*@0x60*/	uint64 rdx;
	/*@0x68*/	uint64 rsi;
	/*@0x70*/	uint64 rdi;
	/*@0x78*/	uint64 orig_rax;
	/*@0x80*/	uint64 rip;
	/*@0x80*/	uint64 pc;
	/*@0x88*/	uint64 cs;
	/*@0x90*/	uint64 eflags;
	/*@0x98*/	uint64 rsp;
	/*@0x98*/	uint64 sp;
	/*@0xa0*/	uint64 ss;
	/*@0xa8*/	uint64 fs_base;
	/*@0xb0*/	uint64 gs_base;
	/*@0xb8*/	uint64 ds;
	/*@0xc0*/	uint64 es;
	/*@0xc8*/	uint64 fs;
	/*@0xd0*/	uint64 gs;
	/*@0xd8;*/
};
;
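
The listing above places fp at the same offset as rbp, pc at the same offset as rip, and sp at the same offset as rsp, which is how the generic aliases are realized. Plain C structures cannot overlap fields this way, but the layout can be approximated with C11 anonymous unions. The following is a sketch for illustration only: it mirrors the offsets printed above for this one 64-bit x86 target, and setfp is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C approximation of the Ctx layout printed by ctx3. */
typedef struct Ctx Ctx;
struct Ctx {
	uint64_t r15, r14, r13, r12;
	union { uint64_t rbp; uint64_t fp; };	/* offset 0x20 */
	uint64_t rbx, r11, r10, r9, r8;
	uint64_t rax, rcx, rdx, rsi, rdi;
	uint64_t orig_rax;
	union { uint64_t rip; uint64_t pc; };	/* offset 0x80 */
	uint64_t cs, eflags;
	union { uint64_t rsp; uint64_t sp; };	/* offset 0x98 */
	uint64_t ss, fs_base, gs_base;
	uint64_t ds, es, fs, gs;
};

/*
 * Hypothetical helper: store through the generic alias fp and
 * read the value back through the machine name rbp.
 */
static uint64_t
setfp(Ctx *c, uint64_t v)
{
	assert(c != 0);
	c->fp = v;
	return c->rbp;
}
```

In this model a store through fp is visible through rbp because both names denote the same eight bytes, and sizeof(Ctx) is 0xd8, matching the final offset in the listing.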

Each context returned by ctx is stored in a freshly allocated address space large enough to hold a copy of the register values. Updates to these values, made for example by assigning to the fields of the context structure, are not automatically written back to the registers in the target execution. Use the setctx method to propagate an updated context to the target execution.

Accessing function arguments

The debugging library does not know the calling convention of target program functions. One primitive way to access function arguments is to directly apply knowledge of the convention. We illustrate a less primitive approach later.

A common calling convention on 64-bit x86 Unix systems is to place the first six integer arguments into registers rdi, rsi, rdx, rcx, r8, and r9. Here, assuming this convention, we print the two arguments passed to each call to insert.

@define args1()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);

	@defloc handler(ctl)
	{
		@local ctx;
		ctx = ctl.ctx();
		printf("enter insert(0x%p, %d)\n", ctx->rdi, ctx->rsi);
	}

	ctl.trap('brk, &exe`insert, handler);
	mux.run();
}

; args1();
enter insert(0x0, 0)
enter insert(0x602010, 1)
enter insert(0x602010, 2)
enter insert(0x602010, 3)
enter insert(0x602010, 4)
;

The first argument to insert is a pointer to the head of the list being constructed by dowork; the second is the value to be inserted.

Examining data structures

We can dereference the first argument of insert to show the contents of the list.

@define args2()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);

	@defloc handler(ctl)
	{
		@local ctx, exe, h;

		ctx = ctl.ctx();
		exe = ctl.exe();
		printf("enter insert(0x%p, %d)\n", ctx->rdi, ctx->rsi);
		h = (Node*){exe}ctx->rdi;
		showlist(h);
	}

	ctl.trap('brk, &exe`insert, handler);
	mux.run();
}

; args2();
enter insert(0x0, 0)
	list =
enter insert(0x602010, 1)
	list = 0
enter insert(0x602010, 2)
	list = 0 1
enter insert(0x602010, 3)
	list = 0 1 2
enter insert(0x602010, 4)
	list = 0 1 2 3
;

The new function showlist prints a list given a pointer to the first element. But the pointer we passed to this function is not simply the value of rdi from the context. Instead, we apply two cast operations to this value before passing it to showlist:

		h = (Node*){exe}ctx->rdi;
		showlist(h);

Reading right-to-left, the first is a use of the extended cast operator to cast the pointer value into the exe domain. Recall that every context is stored in a freshly allocated address space, never the address space of the target execution. We cannot meaningfully dereference rdi in this address space. Casting it to the executable domain allows us to dereference the value in the address space of the target execution.

The second cast is a use of the ordinary C cast operator to change the type of the rdi value to Node*. As we saw in ctx3, the type of the register in a context field is an integer type, not a pointer type. We need a pointer type in order to dereference. In addition, casting to Node* gives the pointer the right type for examining the fields of the structure.

The order of these cast operations is significant. The definition of the type Node* that we wanted is the one from the executable domain. When performing a type cast, Cinquecento determines the definition of the type from the domain of the value. Had we switched the order of the cast operations, we would have unsuccessfully attempted to look up the definition of Node* from the domain of the context value.

The above example has what might seem like a redundant construction of the executable domain. We already computed the executable domain with the start call; why not use it in the handler? Each domain represents a specific instant in execution time. The executable domain returned from start represents the executable domain at the start of the execution. The handler function, in contrast, is called each time the insert function is called — each a different instant of execution time.

Trapping function return

Trapping the return of a function call is trickier than trapping the call. We can tell that a call to a function has occurred by trapping execution of the instruction at its entry point, which generally happens to be the address of the function given by the symbol table. However, a given function can potentially return from any one of multiple return instructions in its code. Symbol tables typically do not expose any of these locations. How can we tell that a call has returned?

Our approach is to place a single trap at the address where the call will eventually return, i.e., the return address in the caller. We cannot determine this address statically; it is a dynamic value that depends on the caller. A convenient point at which we can determine this value is at the entry to each call, where we can apply our knowledge of the calling convention to read the return address set by the caller.

Most x86 programs use the call instruction to transfer control to the entry point of a function. This instruction pushes the return address onto the stack in conjunction with the control transfer. Since the stack pointer points to the most recently pushed value, we can read the return address by dereferencing the stack pointer. We can then trap the return event by placing a breakpoint on the return address. This should be a one-shot breakpoint that we clear after it fires, since the next return event may be from a different call site that we can discover only by repeating this process.
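
The claim that the return address is a dynamic value that depends on the caller can be observed from inside a C program. The sketch below assumes GCC or Clang: it relies on the compiler builtin __builtin_return_address and on the noinline attribute, neither of which is standard C. The function names retaddr, sitea, and siteb, and the variables seena and seenb, are invented for this illustration.

```c
#include <assert.h>

static void *seena, *seenb;

/* Report the address to which this call will return. */
__attribute__((noinline)) static void*
retaddr(void)
{
	return __builtin_return_address(0);
}

/* First call site: records the return address seen by retaddr. */
__attribute__((noinline)) static void*
sitea(void)
{
	seena = retaddr();
	assert(seena != 0);
	return seena;
}

/* Second call site: same callee, different return address. */
__attribute__((noinline)) static void*
siteb(void)
{
	seenb = retaddr();
	assert(seenb != 0);
	return seenb;
}
```

Repeated calls to sitea report the same address, while sitea and siteb report different ones, even though the callee is the same function. This is why a return breakpoint has to be recomputed at each function entry.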

Here we extend args2 to show how the list changes across each call to insert.

@define brkreturn()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);

	@defloc handler(ctl)
	{
		@local ctx, exe, sp, id;
		ctx = ctl.ctx();
		exe = ctl.exe();
		printf("enter insert(0x%p, %d)\n", ctx->rdi, ctx->rsi);
		showlist((Node*){exe}ctx->rdi);
		sp = (void**){exe}ctx->sp;
		id = ctl.trap('brk, *sp,
			      @lambda(rctl){
				      @local ctx, exe;
				      ctl.clear(id);
				      ctx = rctl.ctx();
				      exe = rctl.exe();
				      printf("leave insert\n");
				      showlist((Node*){exe}ctx->rax);
			      });
	}

	ctl.trap('brk, &exe`insert, handler);
	mux.run();
}

; brkreturn();
enter insert(0x0, 0)
	list =
leave insert
	list = 0
enter insert(0x602010, 1)
	list = 0
leave insert
	list = 0 1
enter insert(0x602010, 2)
	list = 0 1
leave insert
	list = 0 1 2
enter insert(0x602010, 3)
	list = 0 1 2
leave insert
	list = 0 1 2 3
enter insert(0x602010, 4)
	list = 0 1 2 3
leave insert
	list = 0 1 2 3 4
;

This example demonstrates several techniques which in combination trap the return of each call to insert. We set a trap on the entry to insert. The handler reads the return address by casting the stack pointer into the executable domain and dereferencing it. Since this dereference should yield an address, the stack pointer is cast to void**. We then set a breakpoint on the return address. To make this breakpoint a one-shot breakpoint, we capture the trap identifier returned by the trap method. In the handler for the return trap, we clear the trap so that it does not fire again without a corresponding new call to the function. Finally, we print the return value of the function, again by applying our knowledge of the calling convention, specifically that integer return values are placed in rax.

Unfortunately, this approach is not robust. It can be defeated in at least two scenarios:

Recursion. If multiple calls to the function occur from the same call site before any of them return, then the above technique will accumulate a one-shot return address trap for each of the calls, and trigger all of them upon the first return.
Non-standard calling convention. The technique assumes that we know where the return address is stored upon function entry, and that we know where the return value is stored upon function return.

The debug library makes no attempt to address these limitations.
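
The recursion scenario can be made concrete with a small simulation in C. This is a toy model of a breakpoint table, not the library's implementation: the Trap type and the functions onentry, onreach, and nested are hypothetical names, and the model assumes, as described above, that every trap set on an address fires when execution reaches that address.

```c
#include <assert.h>
#include <stddef.h>

enum { MAXTRAP = 16 };

/* One-shot trap on a return address. */
typedef struct Trap Trap;
struct Trap {
	unsigned long addr;
	int live;
};

static Trap traps[MAXTRAP];
static int ntrap;

/* Model of the entry handler: set a one-shot trap on the return address. */
static void
onentry(unsigned long retaddr)
{
	assert(ntrap < MAXTRAP);
	traps[ntrap].addr = retaddr;
	traps[ntrap].live = 1;
	ntrap++;
}

/*
 * Model of execution reaching addr: every live trap on addr
 * fires and clears itself. Returns the number that fired.
 */
static int
onreach(unsigned long addr)
{
	int i, fired;

	fired = 0;
	for(i = 0; i < ntrap; i++)
		if(traps[i].live && traps[i].addr == addr){
			traps[i].live = 0;
			fired++;
		}
	return fired;
}

/*
 * Simulate depth nested calls from a single call site, then
 * their returns. fired[k] receives the number of return
 * handlers that run at the k-th return.
 */
static void
nested(int depth, unsigned long site, int *fired)
{
	int i;

	ntrap = 0;
	for(i = 0; i < depth; i++)
		onentry(site);
	for(i = 0; i < depth; i++)
		fired[i] = onreach(site);
}
```

For three nested calls from one call site, nested fills fired with 3, 0, 0: all three return handlers run at the first return, and the two outer returns go unobserved.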

Helpers for call and return

The preceding examples illustrate a tedious and primitive approach to the frequently performed task of trapping function call and return events.

The debug library provides three convenience functions that simplify function call and return event programming. These functions have three advantages over the previous approach:

Arguments and return values are made directly available to handler functions. They have the correct type and, for pointer values, may be dereferenced.
The user is not required to remember how the calling convention works.
For return events, they handle the setting and clearing of the one-shot breakpoint.

These functions take advantage of parameter and return value type information recorded in the name space in which the function being trapped is described. If this type information is not available, these functions will not expose the arguments or return value of the call event. Similarly, these functions assume that the function observes the commonly used calling convention for the platform; they will misbehave if it does not.

The function brk sets a trap for call events. Its arguments are the ctl to instrument, the address of a function, and a handler function. When the call event occurs, the arguments passed to the handler function are the ctl, followed by a copy of each of the arguments that were passed to the trapped function, if any. Each argument has the same type as the corresponding parameter in the function prototype, and each argument is cast into the domain of the execution, allowing it to be directly dereferenced. The return value of the handler is ignored.

@define brkexample()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);

	brk(ctl, &exe`insert,
	    @lambda(ctl, list, val){
		    printf("enter insert(0x%p, %d)\n", list, val);
		    showlist(list);
	    });

	mux.run();
}

; brkexample();
enter insert(0x0, 0)
	list =
enter insert(0x602010, 1)
	list = 0
enter insert(0x602010, 2)
	list = 0 1
enter insert(0x602010, 3)
	list = 0 1 2
enter insert(0x602010, 4)
	list = 0 1 2 3
;

The function rbrk sets a trap for return events. Its arguments are the ctl to instrument, the address of the function, and a handler function. When the return event occurs, the arguments passed to the handler function are the ctl, followed by a copy of the return value of the trapped function, if any. The return value of the handler is ignored.

@define rbrkexample()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);

	rbrk(ctl, &exe`insert,
	     @lambda(ctl, rv){
		     printf("leave insert\n");
		     showlist(rv);
	     });

	mux.run();
}

; rbrkexample();
leave insert
	list = 0
leave insert
	list = 0 1
leave insert
	list = 0 1 2
leave insert
	list = 0 1 2 3
leave insert
	list = 0 1 2 3 4
;

The function brbrk combines the functionality of brk and rbrk. Its arguments are the ctl to instrument, the address of the function, and a handler function for the call event. This handler is called in the same manner as the brk handler, except that it may return a new handler that, when the trapped function call returns, will be called in the same manner as the rbrk handler. If the call event handler instead returns nil, then the return event is ignored.

@define brbrkexample()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);

	brbrk(ctl, &exe`insert,
	      @lambda(ctl, list, val){
		      printf("enter insert(0x%p, %d)\n", list, val);
		      showlist(list);
		      return @lambda(ctl, rv){
			      printf("leave insert\n");
			      showlist(rv);
		      };
	      });

	mux.run();
}

; brbrkexample();
enter insert(0x0, 0)
	list =
leave insert
	list = 0
enter insert(0x602010, 1)
	list = 0
leave insert
	list = 0 1
enter insert(0x602010, 2)
	list = 0 1
leave insert
	list = 0 1 2
enter insert(0x602010, 3)
	list = 0 1 2
leave insert
	list = 0 1 2 3
enter insert(0x602010, 4)
	list = 0 1 2 3
leave insert
	list = 0 1 2 3 4
;

Unwinding the stack

The unwind method of a ctl unwinds the call stack. It returns a list of contexts, one for each stack frame, each representing the state of the registers at the time of the corresponding call. This state includes the return address, stack pointer, and frame pointer of the corresponding call. If sufficient debugging information is available (such as DWARF unwind tables), the contexts may also reflect the other saved registers suitable for recovering the values of local variables stored in registers.

The convenience function fmtframe converts a call frame context into a textual representation of the call site, including the program counter value, the name of and offset within the caller, and the file name and line number of the call site in source code.

@define unwind()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);

	@defloc handler(ctl)
	{
		@local ctxs;
		printf("enter insert\n");
		ctxs = ctl.unwind();
		foreach(@lambda(ctx){ printf("%s\n", fmtframe(ctl, ctx)); },
			ctxs);
		printf("\n");
	}

	ctl.trap('brk, &exe`insert, handler);
	mux.run();
}

; unwind();
enter insert
0000000000400564	insert                        	/u/vic/src/l1/demo/list.c:12
0000000000400604	dowork+0x2c                   	/u/vic/src/l1/demo/list.c:34
0000000000400652	main+0x3b                     	/u/vic/src/l1/demo/list.c:44
00002aaaaacebc4c	__libc_start_main+0xfc        
00000000004004a8	_start+0x28                   

enter insert
0000000000400564	insert                        	/u/vic/src/l1/demo/list.c:12
0000000000400604	dowork+0x2c                   	/u/vic/src/l1/demo/list.c:34
0000000000400652	main+0x3b                     	/u/vic/src/l1/demo/list.c:44
00002aaaaacebc4c	__libc_start_main+0xfc        
00000000004004a8	_start+0x28                   

enter insert
0000000000400564	insert                        	/u/vic/src/l1/demo/list.c:12
0000000000400604	dowork+0x2c                   	/u/vic/src/l1/demo/list.c:34
0000000000400652	main+0x3b                     	/u/vic/src/l1/demo/list.c:44
00002aaaaacebc4c	__libc_start_main+0xfc        
00000000004004a8	_start+0x28                   

enter insert
0000000000400564	insert                        	/u/vic/src/l1/demo/list.c:12
0000000000400604	dowork+0x2c                   	/u/vic/src/l1/demo/list.c:34
0000000000400652	main+0x3b                     	/u/vic/src/l1/demo/list.c:44
00002aaaaacebc4c	__libc_start_main+0xfc        
00000000004004a8	_start+0x28                   

enter insert
0000000000400564	insert                        	/u/vic/src/l1/demo/list.c:12
0000000000400604	dowork+0x2c                   	/u/vic/src/l1/demo/list.c:34
0000000000400652	main+0x3b                     	/u/vic/src/l1/demo/list.c:44
00002aaaaacebc4c	__libc_start_main+0xfc        
00000000004004a8	_start+0x28                   

;

Accessing local variables

Every valid context corresponds to a point in program execution in which a certain set of local variables are defined. Given a valid context, the ctl method ldom makes available the names, types, location, and values of its associated local variables as a dynamically constructed local domain. The local domain is like any other domain, except that its name space is limited to local variables.

The contexts returned by unwind are each suitable arguments for ldom. However, ldom depends on the presence of sufficient debugging information in the binary, both to ensure that the context includes saved register values and to supply information about local variable names, types, and locations.

Here we trap calls to the entry of insert, use unwind information to obtain a context and local domain for its caller dowork, and display the local variables in the dowork context.

@define locals()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);

	@defloc handler(ctl)
	{
		@local ctx, ctxs, ldom;
		ctx = ctl.ctx();
		printf("enter insert(0x%p, %d)\n", ctx->rdi, ctx->rsi);
		ctxs = ctl.unwind();
		ctx = ctxs[1]; /* context for "dowork" */
		ldom = ctl.ldom(ctx);
		printf("\tcalled from dowork: i=%d, head=0x%p, n=%d\n",
		       ldom`i, ldom`head, ldom`n);
	}

	ctl.trap('brk, &exe`insert, handler);
	mux.run();
}

; locals();
enter insert(0x0, 0)
	called from dowork: i=0, head=0x0, n=5
enter insert(0x602010, 1)
	called from dowork: i=1, head=0x602010, n=5
enter insert(0x602010, 2)
	called from dowork: i=2, head=0x602010, n=5
enter insert(0x602010, 3)
	called from dowork: i=3, head=0x602010, n=5
enter insert(0x602010, 4)
	called from dowork: i=4, head=0x602010, n=5
;

Trapping system calls

System call entry and exit events can be trapped with the syscall trap kind.

The dialect includes an enum type named Syscall that defines an enumeration constant for each valid system call number on the target execution system.

The syscalls example traps each system call event in the execution of list, and prints information about each system call entry. It uses Syscall to obtain a name for each system call.

@define syscalls()
{
	@local mux, ctl, enter;
	[mux,ctl,_] = start(["./list"]);

	@defloc handler(ctl)
	{
		@local ctx;
		ctx = ctl.ctx();
		
		if(enter)
			printf("enter syscall %e\n",
			       {ctl.dialect().Syscall}ctx->orig_rax);
		enter = !enter;
	}

	enter = 1;
	ctl.trap('syscall, handler);
	mux.run();
}

; syscalls();
enter syscall brk
enter syscall access
enter syscall mmap
enter syscall access
enter syscall open
enter syscall fstat
enter syscall mmap
enter syscall close
enter syscall access
enter syscall open
enter syscall read
enter syscall fstat
enter syscall mmap
enter syscall mprotect
enter syscall mmap
enter syscall mmap
enter syscall close
enter syscall mmap
enter syscall mmap
enter syscall arch_prctl
enter syscall mprotect
enter syscall mprotect
enter syscall mprotect
enter syscall munmap
enter syscall brk
enter syscall brk
enter syscall exit_group
;

The library provides no assistance for distinguishing system call entry from exit. If the trap is set at the beginning of execution, or from within a trap handler for another user-level event such as a breakpoint, then the next system call event will always be a system call entry.

The library also provides no assistance for retrieving the arguments to the system call. Users are expected to know the system call calling convention.
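
As an illustration of what that knowledge looks like in practice, here is a sketch for x86-64 Linux, where the first three system call arguments are passed in rdi, rsi, and rdx and the result is returned in rax. It assumes that the Ctx structure of the dialect exposes rdx and rax alongside the rdi, rsi, and orig_rax fields used in the examples of this guide; the function name syscallargs is ours.

@define syscallargs()
{
	@local mux, ctl, enter;
	[mux,ctl,_] = start(["./list"]);

	@defloc handler(ctl)
	{
		@local ctx;
		ctx = ctl.ctx();

		if(enter)
			/* on entry, the argument registers hold the arguments */
			printf("enter %e(0x%p, 0x%p, 0x%p)\n",
			       {ctl.dialect().Syscall}ctx->orig_rax,
			       ctx->rdi, ctx->rsi, ctx->rdx);
		else
			/* on exit, rax holds the result (assumed field name) */
			printf("\treturned 0x%p\n", ctx->rax);
		enter = !enter;
	}

	enter = 1;
	ctl.trap('syscall, handler);
	mux.run();
}

Under the same convention, arguments four through six would be found in r10, r8, and r9.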

Trapping fork, exec, and exit

The Unix system calls fork, exec, and exit can be trapped using the ordinary system call trap. However, the library also provides separate traps for these relatively complex events.

We use a new sample program, forkexec, to illustrate these traps.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int
main(int argc, char *argv[])
{

	if(argc < 2){
		printf("usage: %s command [arguments]\n", argv[0]);
		exit(1);
	}
	
	switch(fork()){
	case -1:
		printf("fork: %s", strerror(errno));
		exit(1);
	case 0:
		execv(argv[1], argv+1);
		printf("exec: %s", strerror(errno));
		exit(0);
	default:
		wait(0);
		exit(0);
	}
}

The command-line arguments to forkexec are the name and arguments of a program to run. Forkexec forks, runs the requested program in the child process, and waits for the child to exit. We will use it to launch the list program.

The trapfork example illustrates how to trap fork events. The trap kind is fork. The trap handler fires whenever a fork event occurs. The handler arguments are the ctl corresponding to the execution that forked, and a new ctl, conventionally named spwn, corresponding to the spawned execution.

@define trapfork()
{
	@local mux, ctl;
	[mux,ctl,_] = start(["./forkexec", "./list"]);

	@defloc onfork(ctl, spwn)
	{
		printf("ctl %d has forked ctl %d\n", ctl.id(), spwn.id());
	}

	ctl.trap('fork, onfork);
	mux.run();
}

; trapfork();
ctl 17807 has forked ctl 17814
;

Next, the example trapforkexec extends trapfork with a trap on exec events.

@define trapforkexec()
{
	@local mux, ctl;
	[mux,ctl,_] = start(["./forkexec", "./list"]);

	@defloc	onfork(ctl, spwn)
	{
		printf("ctl %d has forked ctl %d\n", ctl.id(), spwn.id());
	}

	@defloc onexec(ctl)
	{
		@local exe;
		exe = ctl.exe();
		printf("ctl %d has exec'ed %s\n", ctl.id(),
		       ctl.nsmap().nspath(exe.ns));
	}

	ctl.trap('fork, onfork);
	ctl.trap('exec, onexec);

	mux.run();
}

; trapforkexec();
ctl 17836 has forked ctl 17843
ctl 17843 has exec'ed /u/vic/src/l1/demo/list
;

All traps set on a ctl are automatically propagated to its children. Here we see that the exec trap was set on the original forkexec execution, but then fired on the ctl corresponding to its forked child.

To obtain the name of the newly executed program, the onexec handler queries the nsmap for the file name corresponding to the name space of the executable domain.

Since the exec system call replaces the code of an execution, all breakpoint traps for the corresponding ctl are automatically cleared following a successful call to exec. All other traps remain set.
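
A handler that needs breakpoints in the newly executed program can therefore set them from the exec trap handler, at which point the executable domain reflects the new program. The following sketch does this for insert in list; the names brkafterexec and oninsert are ours.

@define brkafterexec()
{
	@local mux, ctl;
	[mux,ctl,_] = start(["./forkexec", "./list"]);

	@defloc oninsert(ctl)
	{
		@local ctx;
		ctx = ctl.ctx();
		printf("ctl %d: enter insert(0x%p, %d)\n",
		       ctl.id(), ctx->rdi, ctx->rsi);
	}

	@defloc onexec(ctl)
	{
		@local exe;
		/* after exec, the executable domain describes the new program */
		exe = ctl.exe();
		ctl.trap('brk, &exe`insert, oninsert);
	}

	ctl.trap('exec, onexec);
	mux.run();
}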

Finally, the trapforkexecexit example extends trapforkexec to illustrate the trapping of exit events. As an added twist, we modify our launch arguments to execute one extra level of forkexec.

@define trapforkexecexit()
{
	@local mux, ctl;
	[mux,ctl,_] = start(["./forkexec", "./forkexec", "./list"]);

	@defloc	onfork(ctl, spwn)
	{
		printf("ctl %d has forked ctl %d\n", ctl.id(), spwn.id());
	}

	@defloc onexec(ctl)
	{
		@local exe;
		exe = ctl.exe();
		printf("ctl %d has exec'ed %s\n", ctl.id(),
		       ctl.nsmap().nspath(exe.ns));
	}

	@defloc onexit(ctl, status)
	{
		printf("ctl %d has exited with status %d\n", ctl.id(), status);
		showstack(ctl);
	}

	ctl.trap('fork, onfork);
	ctl.trap('exec, onexec);
	ctl.trap('exit, onexit);

	mux.run();
}

; trapforkexecexit();
ctl 17865 has forked ctl 17872
ctl 17872 has exec'ed /u/vic/src/l1/demo/forkexec
ctl 17872 has forked ctl 17885
ctl 17885 has exec'ed /u/vic/src/l1/demo/list
ctl 17885 has exited with status 0
00002aaaaad77a28	_exit+0x38                    
00002aaaaad06214	exit+0x94                     
00002aaaaacebc53	__libc_start_main+0x103       
00000000004004a8	_start+0x28                   

ctl 17872 has exited with status 0
00002aaaaad77a28	_exit+0x38                    
00002aaaaad06214	exit+0x94                     
00000000004007d3	main+0xdf                     	/u/vic/src/l1/demo/forkexec.c:28
00002aaaaacebc4c	__libc_start_main+0xfc        
0000000000400638	_start+0x28                   

ctl 17865 has exited with status 0
00002aaaaad77a28	_exit+0x38                    
00002aaaaad06214	exit+0x94                     
00000000004007d3	main+0xdf                     	/u/vic/src/l1/demo/forkexec.c:28
00002aaaaacebc4c	__libc_start_main+0xfc        
0000000000400638	_start+0x28                   

;

The handler for exit events receives the ctl that is exiting and the operating system's integer exit status code. We use showstack to highlight the various exit points of forkexec and list. The fact that we can show these stacks at the time of exit highlights an important property of exit traps: the handlers have the ability to access the address space and registers of the terminating process. This access is indeed terminal: once the handler returns, the associated execution is torn down and the ctl becomes invalid.

Trapping signals

The signal trap kind is used to trap signal events.

The trapsig example demonstrates usage. It launches a modified version of list, called listbug, that introduces a pointer bug in the insert function. When the bug is triggered, the execution receives a segmentation fault.

@define trapsig()
{
	@local mux, ctl;
	[mux,ctl,_] = start(["./listbug"]);

	@defloc handler(ctl, info)
	{
		@local ns;
		ns = ctl.dialect().ns;
		printf("ctl %d received signal %d\n",
		       ctl.id(), info->si_signo);
		if(info->si_signo == ns`SIGSEGV)
			printf("faulting address is 0x%p\n", info->si_addr);
		showstack(ctl);
	}

	ctl.trap('signal, handler);
	mux.run();
}

; trapsig();
ctl 17907 received signal 11
faulting address is 0x8
00000000004005e0	insert+0x7c                   	/u/vic/src/l1/demo/listbug.c:25
0000000000400616	dowork+0x2c                   	/u/vic/src/l1/demo/listbug.c:36
0000000000400664	main+0x3b                     	/u/vic/src/l1/demo/listbug.c:46
00002aaaaacebc4c	__libc_start_main+0xfc        
00000000004004a8	_start+0x28                   

;

The handler is passed a pointer to a structure of type Siginfo that stores information about the signal event. Like the Ctx structure, the dialect of the ctl determines the definition of Siginfo. On Unix platforms, the si_signo field gives the integer signal number. The meaning of the other fields of Siginfo depends on the value of si_signo; see a Unix manual for details.

The dialect also includes an enum type named Signal that defines an enumeration constant for each valid signal number on the target execution system. The example uses the ns field of the dialect to obtain the name space that defines Signal, and then compares the signal number against SIGSEGV, the enumeration constant for the segmentation fault signal.
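
By analogy with the Syscall cast in the syscalls example, the Signal type can also be used to print a signal by name. This sketch assumes that casting a signal number to Signal and formatting it with %e behaves as it does for Syscall; the function name trapsigname is ours.

@define trapsigname()
{
	@local mux, ctl;
	[mux,ctl,_] = start(["./listbug"]);

	@defloc handler(ctl, info)
	{
		/* cast the signal number to the dialect's Signal enum */
		printf("ctl %d received signal %e\n",
		       ctl.id(), {ctl.dialect().Signal}info->si_signo);
	}

	ctl.trap('signal, handler);
	mux.run();
}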

Using snapshots

Snapshots are obtained with the ctl snap method. The return value is a new ctl representing a snapshot of the state of the ctl. Resuming the original ctl has no effect on the snapshot; the snapshot remains valid for as long as there are references to it.

Here we collect snapshots of the state of a list execution at the entry to each call to insert. Then, when list exits, we use the snapshots to print the state at these call sites.

@define snaps()
{
	@local mux, ctl, exe;
	@local snaps, i;

	[mux,ctl,exe] = start(["./list"]);

	snaps = [];

	i = 0;
	@defloc oninsert(ctl)
	{
		append(snaps, ctl.snap());
		printf("took snap #%d\n", ++i);
	}

	@defloc onexit(ctl, status)
	{
		@local i;
		printf("list has exited\n");
		i = 0;
		foreach(@lambda(ctl){
				@local ctx, exe;
				ctx = ctl.ctx();
				exe = ctl.exe();
				printf("snap #%d: insert(0x%p, %d)\n",
				       ++i,
				       ctx->rdi, ctx->rsi);
				showlist((Node*){exe}ctx->rdi);
			}, snaps);
	}

	ctl.trap('brk, &exe`insert, oninsert);
	ctl.trap('exit, onexit);
	mux.run();
}

; snaps();
took snap #1
took snap #2
took snap #3
took snap #4
took snap #5
list has exited
snap #1: insert(0x0, 0)
	list =
snap #2: insert(0x602010, 1)
	list = 0
snap #3: insert(0x602010, 2)
	list = 0 1
snap #4: insert(0x602010, 3)
	list = 0 1 2
snap #5: insert(0x602010, 4)
	list = 0 1 2 3
;

Function and method reference

Name spaces served by sctl

The debug library supports the construction of standalone name spaces from program binaries, without depending on an associated execution. This functionality does not involve ctls or sctlmuxes.

This functionality uses the sctl protocol to access name space information from a remote sctl service. Abstractly, the client provides sctl the path name of a program binary, and sctl returns a name space; we say these name spaces are "served by sctl". In contrast to name spaces constructed with the @names syntax and "names files", a sctl server is free to use any approach to supply name space information, such as reading it directly from the binary, without requiring the Cinquecento programmer to know or understand the details.

Programs only interested in this name space functionality may prefer to @include <sctlns.cqct>; this avoids the overhead and potential top-level naming conflicts of loading the full debug library. Loading <debug.cqct> implicitly loads <sctlns.cqct>.

There are two high-level functions for obtaining name spaces served by sctl. Atnames launches a fresh, local sctl server each time it is called; snamesns contacts a remote snames server that must be started separately.

atnames(path)

path: string

name space or nil

Returns a name space for the binary named path. The name space is served by a freshly launched sctl server on the calling machine. The sctl binary must be in the path of the calling process. Returns nil if the name space does not exist.

snamesns(path)

path: string

name space or nil

Returns the name space associated with the name path on the snames server, or nil if the name space does not exist.

The address of the snames server contacted by snamesns is specified by setsnamesaddr. Stale connections can be restarted by calling snamesreconnect.

setsnamesaddr(addr)

addr: string

nil

Addr is a network address of the form "host:port", where host is an IP address or hostname and port is a port number or service name. Sets the address of the snames server contacted by the library to addr. The default is "localhost:30000".

snamesreconnect()

nil

Advises the library to reset its connection to the snames server.

The debug library implicitly calls atnames to construct name spaces for ctls. It can be forced to use a different constructor by re-defining atnames to a different function. For example, to force name spaces to be constructed using a snames server, evaluate this line before creating any ctls:

atnames = snamesns;

(There should be a better way to do this essential configuration task.)

Name spaces served by sctl are like ordinary Cinquecento name spaces, except that their interface is extended to support four additional methods. These methods provide access to call stack unwinding, source code location, and local variable information. Some of the name space filters in the Cinquecento library, including mknsoff, support this extended interface.

unwind1(this,pc)

this: invoking object

pc: cvalue

list of unwind rules or nil

Returns target-dependent stack frame unwinding rules corresponding to the program counter value pc. Each unwind rule is specified in an uwrulerec that is based on the encoding of the unwind rules specified by DWARF (see sctlunwind1). Returns nil if no unwind rules are defined.

looksrc(this,pc)

this: invoking object

pc: cvalue

source record or nil

Returns source code location information corresponding to the program counter value pc. See sctllooksrc for the definition of a source record. Returns nil if no source code location is defined for pc.

lookpc(this,file,line)

this: invoking object

file: string

line: cvalue

cvalue or nil

Returns a program counter address corresponding to the source location file:line. Returns nil if there is no matching source location.

enumloc(this,pc)

this: invoking object

pc: cvalue

list of local variable records or nil

Returns a list of local variable records describing local variables in scope at program counter address pc. See sctlenumloc for the definition of a local variable record. Returns nil if no local variable information is available.
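
As a sketch of how these methods combine, the following function obtains a name space with atnames, maps a source location to a program counter with lookpc, and maps it back with looksrc. The function name srcroundtrip is ours, and the file and line fields are those of the source record defined under sctllooksrc.

@define srcroundtrip(path, file, line)
{
	@local ns, pc, src;

	ns = atnames(path);
	if(ns == nil){
		printf("no name space for %s\n", path);
		return nil;
	}

	/* source location to program counter */
	pc = ns.lookpc(file, line);
	if(pc == nil){
		printf("no code for %s:%d\n", file, line);
		return nil;
	}

	/* program counter back to source location */
	src = ns.looksrc(pc);
	if(src != nil)
		printf("0x%p is %s:%d\n", pc, src.file, src.line);
	return pc;
}

A call might look like srcroundtrip("./listbug", "listbug.c", 25), though whether sctl expects a bare file name or a full path is not specified here.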

The sctl protocol transactions underlying atnames and snamesns are encapsulated in the function mksctlns. This function is intended to support the creation of new high-level interfaces like atnames or snamesns. A more low-level interface to the sctl protocol is described later.

mksctlns(fd,path)

fd: file descriptor

path: string

name space or nil

Returns a name space for the binary named path. The name space is served by the sctl server backed by fd. The version negotiation with the sctl server must be performed before calling this function (see sctlversion). Returns nil if the name space does not exist.

Sctlmux

The debug library follows an event-based programming model, managed by an instance of the sctlmux object. Associated with each sctlmux is a set of bound ctls under its control. Operations on bound ctls are mapped to transactions with the sctl that controls the target execution. The sctl in turn sends messages announcing events that have occurred in the target execution, which are dispatched to event handlers in the associated ctl. The sctlmux multiplexes these activities across multiple ctls.

mksctlmux()

sctlmux

Returns a new sctlmux backed by a sctl server freshly launched on the calling machine.

A sctlmux is a record that exports the following methods:

run()

nil

Resumes execution of all ctls bound to the sctlmux and enters the sctlmux event loop. The event loop runs until all ctls bound to the sctlmux, including those dynamically added as the loop runs, have exited. The event loop can be aborted by calling the abort method.

abort()

nil

Aborts the sctlmux event loop.

An execution is bound to a sctlmux in one of three ways: launching a new execution, attaching to an existing (unbound) execution, or trapping process or thread creation (fork and clone) events in an already bound execution. Here we describe the launch and attach mechanisms; fork and clone events are discussed under the ctl trap method.

ctllaunch(mux,cmd)

mux: sctlmux

cmd: list of string

ctl or nil

Launches a new program execution, returning a ctl representing the execution. Cmd is the list of arguments used to run the program; on Unix-based targets these strings are passed to exec, with cmd[0] becoming argv[0], and so on.

There is currently no way to specify a different environment to the launched program, or to redirect its I/O.

ctlattach(mux,pid)

mux: sctlmux

pid: cvalue

ctl or nil

Attaches to an existing program execution, returning a ctl representing the execution. Pid is the process identifier of the execution.
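
Because a sctlmux can manage any number of ctls, several executions can be launched before the event loop is entered. The following sketch launches two programs on one sctlmux and reports when each execution exits; the function name launchtwo is ours.

@define launchtwo()
{
	@local mux, a, b;

	mux = mksctlmux();
	a = ctllaunch(mux, ["./list"]);
	b = ctllaunch(mux, ["./forkexec", "./list"]);
	if(a == nil || b == nil){
		printf("launch failed\n");
		return nil;
	}

	@defloc onexit(ctl, status)
	{
		printf("ctl %d has exited with status %d\n", ctl.id(), status);
	}

	/* traps are set per ctl; children of b inherit its traps */
	a.trap('exit, onexit);
	b.trap('exit, onexit);

	/* runs until every bound ctl has exited */
	mux.run();
}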

Ctls

A ctl is an object that represents a thread of execution. Each ctl is a record that exports the following methods:

id()

cvalue

Returns a numeric identifier for the ctl.

exe()

domain

Returns a domain representing the executable of the ctl.

ctx()

context

Returns the current context of the ctl. The context is represented as a pointer into a domain, disjoint from the address space of the ctl, that contains a copy of the remote register set. The pointer points to a structure of type Ctx, which is defined in the dialect record returned by the dialect method of the ctl.

setctx(ctx)

ctx: context

nil

Updates the current context of the ctl to ctx. The context should be a modified instance of a context returned by the ctx method.

dialect()

dialect record

Returns the dialect of the ctl. The definition of the dialect record includes at least the following fields:

@record sctldialectrec {
	name,
	ns,
	Ctx,
	Siginfo,
	Syscall,
	Signal,
	dumpctx
};

Name is a string specifying the name of the dialect. Ns is the name space in which all dialect types and enumerations are defined. Ctx is the structure type definition (ctype) for the context type, and Siginfo is the structure type definition (ctype) for the signal information type. Syscall is an enumeration type that defines names for each valid system call number, and Signal is an enumeration type that defines names for each valid signal number. Dumpctx is a function that prints the fields of a context.

cast(addr)

addr: cvalue

cvalue

Returns addr cast into a domain backed by the address space of the ctl. If addr seems to be associated with a particular name space (as determined by ns), addr is cast into a domain formed from that name space and the address space of the ctl. Otherwise addr is cast into a domain formed from the name space of the domain of addr and the address space of the ctl.

detach()

nil

Releases control of the target execution. The execution resumes, but future operations on the ctl are invalid.

It is currently unsafe to call this method from within a trap handler.

kill()

nil

Like detach, but terminates the target execution instead of resuming it.

It is currently unsafe to call this method from within a trap handler.

stat()

stat record

Returns a record describing status information about the remote target. This method is currently undefined and unimplemented.

mem()

address space

Returns an address space backed by the target process memory.

nsmap()

nsmap

Returns an nsmap describing the current mappings in the target address space.

ns([idx])

idx: cvalue or string

name space or nil

Queries the current mappings in the target address space (i.e., its nsmap), returning a name space corresponding to one of the mappings. If idx is unspecified, a name space corresponding to the executable is returned. If idx is a cvalue, it is treated as an address in the target memory, and the name space "containing" that address is returned (as in byaddr). If idx is a string, it is treated as a string pattern; the name space whose path name matches the pattern is returned (as in byname). If no name space matches, nil is returned.

dom([idx])

idx: cvalue or string

domain or nil

The dom method is like the ns method, except that a freshly constructed domain is returned instead of a name space. The name space is the name space returned by the ns method on argument idx, and the address space is the address space returned by the mem method (i.e., the current address space of the target).

snap()

ctl

Returns a ctl representing a snapshot of the target process. Regardless of when the original ctl terminates, the snapshot remains valid until there are no remaining references to it.

ldom([ctx])

ctx: context

domain or nil

Returns a local domain representing the local variables (including function parameters) defined in the context ctx of the target process. If ctx is unspecified, the current context is used. The name space of the local domain includes definitions for all known local variables. Some inadvertent ability to update the values of local variables may exist, but this is currently undefined. Returns nil if a local domain cannot be constructed for ctx.

unwind()

list of contexts

Unwinds the current call stack of the ctl, returning a list of contexts corresponding to each call frame. The current call frame appears first in the list, followed by its caller, and so on.

looksrc(addr)

addr: cvalue

source record

Returns a source record specifying the location in source code corresponding to addr.

lookpc(file,line[,idx])

file: string

line: cvalue

idx: cvalue or string

cvalue

Returns an address (program counter value) corresponding to the source location file:line in the name space determined by passing idx to the ns method.
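
For example, lookpc makes it possible to set a breakpoint by source location rather than by symbol. The following sketch assumes that lookpc yields nil when no code matches the location, as the name space method of the same name does; the function name brkline is ours.

@define brkline(file, line)
{
	@local mux, ctl, pc;
	[mux,ctl,_] = start(["./listbug"]);

	/* idx is omitted, so the name space of the executable is searched */
	pc = ctl.lookpc(file, line);
	if(pc == nil){
		printf("no code for %s:%d\n", file, line);
		return nil;
	}

	@defloc handler(ctl)
	{
		/* report where the execution stopped */
		printf("%s\n", fmtframe(ctl, ctl.ctx()));
	}

	ctl.trap('brk, pc, handler);
	mux.run();
}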

trap(kind[,arg],fn)

kind: cid

arg: varies

fn: procedure

cvalue

Sets a trap on an event of various forms, specified by the kind and event-specific additional argument arg. Fn is a handler to be called in response to the event. Multiple handlers may be registered for the same event; they are called in registration order. Regardless of the kind of event, the first argument passed to the handler is always the ctl representing the target on which the event occurred. Additional arguments may be passed depending on the kind of event. The return value of each call to a handler is ignored.

The return value of trap is a unique identifier for the trap. Passing the identifier to a call to clear deregisters the trap, preserving the order of any remaining traps.

The available traps are listed below. Where required, the additional argument to trap is named after the symbol that identifies the trap kind.

brk addr, snap addr: Set a breakpoint or snappoint. The handler is called when the program counter reaches addr (a cvalue). Snap differs from brk in that a snapshot of the trapping ctl is created and passed to the handler in place of the trapping ctl itself. On Unix systems, brk and snap traps are automatically cleared whenever the target successfully returns from an exec system call.
exit: Set a trap on termination. The handler is called when the target has terminated but before the execution context has been reclaimed by the host system. No further execution will occur; the ctl becomes invalid when the handler returns.
load, unload: Set a trap on library load or unload. The handler is called when the target has updated its address space by loading or unloading a library. Along with the ctl, the handler is passed two additional arguments: the path name and load address of the affected library. The nsmap of the ctl is updated prior to calling the handler.
exec: Set a trap on new program execution. The handler is called when the target has successfully returned from the exec system call, just before the execution of the first instruction in the new program image.
syscall: Set a trap on system call entry and exit. The handler is called whenever the target enters or returns from a system call. It is up to the handler to figure out which.
signal: Set a trap on signals. The handler is called whenever a signal is about to be posted to the target. Along with the ctl, the handler is passed a pointer to a Siginfo structure describing the signal event.
clone: Set a trap on process creation. The handler is called when the target has spawned a new process. Along with the ctl that spawned the new process, the handler is passed a new ctl representing the new process.
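
The snap kind folds the breakpoint and snapshot steps of the snaps example in the user guide into a single trap. In this sketch the handler receives the snapshot directly and only has to record it; the function name snappoints is ours.

@define snappoints()
{
	@local mux, ctl, exe, snaps;
	[mux,ctl,exe] = start(["./list"]);

	snaps = [];

	@defloc oninsert(snap)
	{
		/* snap is a snapshot ctl, not the running ctl */
		append(snaps, snap);
	}

	@defloc onexit(ctl, status)
	{
		printf("collected %d snapshots\n", length(snaps));
	}

	ctl.trap('snap, &exe`insert, oninsert);
	ctl.trap('exit, onexit);
	mux.run();
}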

clear(id)

nil

Clears the trap with identifier id.
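
A common use of clear is a one-shot trap. This sketch saves the identifier returned by trap and clears the trap the first time the handler runs. It assumes that clear may safely be called from within a handler, which this manual does not state; the function name oneshot is ours.

@define oneshot()
{
	@local mux, ctl, exe, id;
	[mux,ctl,exe] = start(["./list"]);

	@defloc handler(ctl)
	{
		printf("first call to insert\n");
		/* deregister this trap so that later calls are not reported */
		ctl.clear(id);
	}

	/* id is assigned before the event loop runs the handler */
	id = ctl.trap('brk, &exe`insert, handler);
	mux.run();
}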

traps()

list of trap records

Traps returns a list of trap records describing all currently registered traps on the ctl.

@record traprec {
	tid,      /* user-visible trap identifier */
	sid,	  /* sctl trap identifier (not for user) */
	type,	  /* type of trap */
	args,	  /* trap configuration arguments */
	fn	  /* trap handler */
};

The system registers traps for its own internal state management on each ctl. These traps are revealed by traps, but they should not be tampered with.
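
For instance, the registered traps can be listed with a short loop over the trap records. This sketch assumes the %a format directive of printf, which prints an arbitrary Cinquecento value; the function name showtraps is ours.

@define showtraps(ctl)
{
	foreach(@lambda(t){
			/* tid is the identifier accepted by clear */
			printf("trap %d: %a\n", t.tid, t.type);
		}, ctl.traps());
}

The listing can be expected to include the internally registered traps alongside those set by the user.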

Trap handler tools

This section describes a collection of tools for implementing trap handlers.

The following three functions simplify the specification of trap handlers for function call and return events. They make use of name space type information to make function parameters and return values directly available to handlers. They also encapsulate a mechanism for accessing argument and return values. (Currently they assume the AMD64 System V ABI calling convention.)

brk(ctl,addr,handler)

ctl: ctl

addr: addr

handler: procedure

cvalue

Sets a breakpoint trap on the entry of the function at address addr in the execution associated with ctl. The type of addr should be the actual type of the trapped function. When the trap fires, handler is called. Handler must be prepared to accept several arguments, depending on the type of the trapped function. The first argument is the ctl on which the trap occurred. The remaining arguments are in one-to-one correspondence with the parameters declared in the type of the trapped function. The arguments are read-only copies of the arguments passed to the trapped call, cast into the target domain. Brk returns the trap identifier.

rbrk(ctl,addr,handler)

ctl: ctl

addr: addr

handler: procedure

cvalue

Sets a breakpoint trap on the return from calls to the function at address addr in the execution associated with ctl. The type of addr should be the actual type of the trapped function. When the trap fires, handler is called. Handler must be prepared to accept one or two arguments, depending on the type of the trapped function. The first argument is the ctl on which the trap occurred. The second argument is a read-only copy of the return value of the function, cast into the target domain. There is no second argument if the function has no return value. The return value of handler is ignored. Rbrk returns the trap identifier.

brbrk(ctl,addr,handler)

ctl: ctl

addr: addr

handler: procedure

cvalue

This function combines the functionality of brk and rbrk, allowing traps on both the call and return from the function at address addr in the execution associated with ctl. As with brk and rbrk, the type of addr should be the actual type of the trapped function. When a call to the function occurs, handler is called in the same manner as in brk, with one difference: handler may return a handler procedure to be called when the current function call returns. This handler is called in the same manner as the handler in rbrk. No return trap is set if handler returns nil. Brbrk returns the trap identifier associated with the call trap.
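
As a sketch, here is brbrk applied to insert from the list program. It assumes that insert is declared with two parameters, matching the two argument registers printed in the user guide examples, and that it returns a pointer; the names callret, oncall, and onret are ours.

@define callret()
{
	@local mux, ctl, exe;
	[mux,ctl,exe] = start(["./list"]);

	@defloc onret(ctl, rv)
	{
		/* rv is a copy of the return value of insert */
		printf("\treturned 0x%p\n", rv);
	}

	@defloc oncall(ctl, head, v)
	{
		printf("insert(0x%p, %d)\n", head, v);
		/* returning a procedure requests a trap on the matching return */
		return onret;
	}

	brbrk(ctl, &exe`insert, oncall);
	mux.run();
}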

The following functions format commonly printed execution state.

fmtldom(ctl,ctx)

ctl: ctl

ctx: context

string

Returns a multi-line string displaying the location, type, and name of each local variable defined at context ctx in ctl.

fmtframe(ctl,ctx)

ctl: ctl

ctx: context

string

Returns a string displaying the program counter, function, and source line information for the function call site associated with context ctx in ctl.

dumpstacklocs(ctl)

ctl: ctl

nil

Prints to stdout the current call stack of ctl, including source code location and local variable information.

Low-level sctl

For low-level name space programming, the debug library provides a set of Cinquecento functions for performing synchronous name space transactions with a sctl server.

The first argument to each of these functions is a file descriptor that must be connected to a sctl server. Generally the remaining arguments are Cinquecento data to be encoded into the fields of the corresponding sctl request. Most functions return a data structure representing the decoded payload of the corresponding reply.

These procedures assume that the sctl server answers each request synchronously and that it never generates Aevent messages. This behavior is consistent with a sctl server that is only being used to serve name space data.

Two of the procedures, sctlenumtype and sctllooktype, require an additional name space argument. This name space is used by the underlying type definition decoder (decodetdef) to resolve two aspects of type definitions that cannot be determined from the sctl type definition reply:

the mapping from enumeration representation to a corresponding base type, passed to mkctype_enum;
the representation of the pointer type, passed to mkctype_ptr.

For this argument, it is sufficient to pass a root name space that is compatible with the base type definitions of the target name space. We consider the presence of this argument to be a bug in the interface.

The functions are as follows:

sctlping(fd,cnt)

fd: file descriptor

cnt: cvalue

nil

Performs a ping transaction with the server, sending a ping payload of cnt zero bytes. An error is raised if the transaction fails.

sctlversion(fd)

fd: file descriptor

string

Performs a version transaction with the server. The offered version is

"sctl-2012:x86-linux-2012,x86-win-2012"

The negotiated version string is returned. An error is raised if the version cannot be negotiated.

sctlnames(fd,path)

fd: file descriptor

path: string

cvalue or nil

Performs a names transaction, returning a name space identifier for the name space corresponding to the binary named path. Returns nil if the name space does not exist.

sctllooktype(fd,nsid,typename,ns)

fd: file descriptor

nsid: cvalue

typename: ctype

ns: name space

ctype or nil

Performs a looktype transaction on the name space corresponding to nsid, returning a definition of the type named typename. The resulting definition is incomplete. Returns nil if there is no matching type definition.

sctlenumtype(fd,nsid,ns)

fd: file descriptor

nsid: cvalue

ns: name space

table

Performs an enumtype transaction on the name space corresponding to nsid, returning a table mapping type names to type definitions. The type definitions are incomplete.

sctllooksym(fd,nsid,id)

fd: file descriptor

nsid: cvalue

id: cid

symbol or nil

Performs a looksym transaction on the name space corresponding to nsid, obtaining a definition for the symbol id. The result is a Cinquecento symbol [reference]. Its attribute table defines "offset" to the value returned in the looksym reply, "flags" to the flags, and "size" to the size. The symbol type is incomplete. Returns nil if there is no matching symbol.

sctlenumsym(fd,nsid)

fd: file descriptor

nsid: cvalue

table

Performs an enumsym transaction on the name space corresponding to nsid, returning a table mapping symbol identifiers to symbol definitions. The symbol definitions are Cinquecento symbols.

sctllookaddr(fd,nsid,addr)

fd: file descriptor

nsid: cvalue

addr: cvalue

symbol or nil

Performs a lookaddr transaction on the name space corresponding to nsid, returning a symbol definition to which addr maps. The symbol definition is a Cinquecento symbol. Returns nil if there is no matching symbol.

sctlunwind1(fd,nsid,addr)

fd: file descriptor

nsid: cvalue

addr: cvalue

list of unwind rules or nil

Performs an unwind1 transaction on the name space corresponding to nsid, returning target-dependent stack frame unwinding rules corresponding to the program counter value addr. Each unwind rule is specified in an uwrulerec that is based on the encoding of the unwind rules specified by DWARF:

@record uwrulerec {
	kind,		/* rule kind */
	r,		/* register operand */
	n		/* number operand */
};

Returns nil if no unwind rules are defined.

sctllooksrc(fd,nsid,addr)

fd: file descriptor

nsid: cvalue

addr: cvalue

source record or nil

Performs a looksrc transaction on the name space corresponding to nsid, returning a source record corresponding to the program counter addr. The source record is defined as follows:

@record srcrec {
	file,		/* file name (string) */
	line,		/* line (cvalue) */
	col,		/* column (cvalue) */
};

Returns nil if no source location matches.

sctllookpc(fd,nsid,file,line)

fd: file descriptor

nsid: cvalue

file: string

line: cvalue

cvalue or nil

Performs a lookpc transaction on the name space corresponding to nsid, returning a program counter address corresponding to the source location file:line. Returns nil if there is no matching source location.

sctlenumloc(fd,nsid,addr)

fd: file descriptor

nsid: cvalue

addr: cvalue

list of local variable records or nil

Performs an enumloc transaction on the name space corresponding to nsid, returning a list of local variable records. Local variable records are defined as follows:

@record locrec {
	id,		/* name (cid) */
	sz,		/* size (cvalue) (FIXME: always zero!) */
	ltype,		/* kind (parameter or local) (cvalue) */
	loc,		/* location expression (lexprrec) */
	type,		/* type name (ctype) */
};

@record lexprrec {
	kind,		/* kind (cvalue) */
	no,		/* register (cvalue) */
	v,		/* value (cvalue) */
	op1, op2,	/* operands (lexprrec) */
};

Returns nil if no location information is available.

This library also defines several undocumented functions for converting buffers of encoded sctl data to and from Cinquecento data structures.

Name space maps

A name space map (nsmap) is an object that manages name spaces for the set of binaries (libraries and executables) mapped within an address space of a running program.

Think of each mapping as a triple comprising the path name for the mapped binary, the base address of the mapping, and the corresponding Cinquecento name space.

A typical address space consists of several library mappings and exactly one executable mapping. An nsmap can define a distinguished mapping, called the executable, which is intended to represent the executable mapping.

The methods of an nsmap support dynamic addition and removal of mappings, query and update of the executable mapping, name space lookup by address and name, name space enumeration, and nsmap cloning.

Rather than require the nsmap client to construct individual name spaces, each nsmap instance encapsulates a function that maps a path name to a name space. Clients add mappings by specifying the binary path name and base address.

To use nsmaps outside of the debug library, @include <nsmap.cqct> in your program. The library defines one function:

mknsmap(fn)

fn: procedure

nsmap

This is the only constructor for nsmap instances. Fn is a function that encapsulates name space construction. It must accept one argument, a string naming a program binary, and return either a name space for that binary mapped at address zero, or nil if the name space cannot be constructed.

Each nsmap is a record that exports the following methods:

add(base,path)

base: cvalue

path: string

nil

This method constructs a name space for the binary named path mapped at offset base, and then adds the resulting mapping to the nsmap. The name space is constructed by calling the fn argument that was passed to mknsmap when the nsmap was created.

del(base)

base: cvalue

nil

This method removes the name space mapped at offset base from the nsmap. If base corresponds to the executable, then the executable is made undefined.

exe()

name space or nil

This method returns the name space of the executable of the nsmap, or nil if it is undefined.

setexe(base)

base: cvalue

nil

This method sets the executable of the nsmap to be the mapping at offset base. It is an error if there is no such mapping.
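The interaction between add, del, exe, and setexe can be modeled outside Cinquecento. The Python sketch below is only an illustration of the rules stated above, not the library's implementation. The class name NsmapModel is hypothetical, name spaces are stood in for by placeholder strings, add takes a name space directly instead of a path, and ignoring a del on an unmapped base is an assumption that the text above does not settle.

```python
class NsmapModel:
    """Toy model of the executable bookkeeping; not the real nsmap."""

    def __init__(self):
        self.maps = {}        # base -> name space (placeholder strings here)
        self.exe_base = None  # base of the executable mapping, if any

    def add(self, base, ns):
        # The real add(base, path) builds ns from path; the model skips that.
        self.maps[base] = ns

    def delete(self, base):
        # "del" is a Python keyword, hence the rename.
        # Assumption: deleting an unmapped base is silently ignored.
        self.maps.pop(base, None)
        if self.exe_base == base:
            self.exe_base = None   # removing the executable undefines it

    def setexe(self, base):
        if base not in self.maps:
            raise KeyError("no mapping at base %#x" % base)
        self.exe_base = base

    def exe(self):
        if self.exe_base is None:
            return None
        return self.maps[self.exe_base]
```

In this model, calling delete on the base previously passed to setexe makes exe return None again, mirroring the rule that removing the executable mapping leaves the executable undefined.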

byaddr(addr)

addr: cvalue

name space or nil

Intuitively, this method attempts to return the name space in the nsmap whose mapping "contains" the address addr. More precisely, this method returns the name space whose offset is the greatest address less than or equal to addr. It returns nil if no name space matches.
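The lookup rule can be modeled outside Cinquecento. The following Python sketch is not the library's implementation. It assumes the mappings are held as a sorted list of base offsets with a parallel list of name spaces (placeholder strings here); the function name byaddr_model and the sample addresses are hypothetical.

```python
from bisect import bisect_right

def byaddr_model(bases, spaces, addr):
    """Return the name space whose base is the greatest one <= addr, else None.

    bases must be sorted ascending; spaces[i] belongs to bases[i].
    """
    i = bisect_right(bases, addr)   # number of bases that are <= addr
    if i == 0:
        return None                 # addr lies below every mapping
    return spaces[i - 1]

# Hypothetical address space: one executable and two libraries.
bases = [0x400000, 0x7F0000000000, 0x7F0000200000]
spaces = ["ns:prog", "ns:libc", "ns:libm"]
```

As the rule is stated, only base offsets are compared, so in this model an address above the highest base still resolves to the last mapping, whatever the mapping's actual extent.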

byname(pat)

pat: string

name space or nil

This method returns the name space whose path name matches pat. A path name path matches if strstr(path,pat) returns non-nil. The search is performed in ascending order of mapping offset; the first match is returned. Byname returns nil if no name space matches.
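The matching rule for byname can be illustrated with a small Python model. This is a sketch under stated assumptions, not the library's code: mappings are represented as (base, path, name space) tuples with placeholder strings for name spaces, Python's substring test stands in for strstr, and the name byname_model and the sample paths are hypothetical.

```python
def byname_model(mappings, pat):
    """Return the name space of the first mapping, in ascending base order,
    whose path contains pat as a substring; None if nothing matches."""
    for base, path, ns in sorted(mappings, key=lambda m: m[0]):
        if pat in path:             # substring test, standing in for strstr
            return ns
    return None

# Hypothetical mappings, deliberately listed out of base order.
mappings = [
    (0x7F0000200000, "/lib/libm.so.6", "ns:libm"),
    (0x400000, "/usr/bin/prog", "ns:prog"),
    (0x7F0000000000, "/lib/libc.so.6", "ns:libc"),
]
```

Because matching is by substring, a short pattern such as "lib" matches more than one path in this model; the mapping with the lowest base wins, so a more specific pattern is needed to select a later one.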

nsbase(ns)

ns: name space

cvalue or nil

This method returns the base address associated with the name space ns in the nsmap. It returns nil if no mapping matches ns.

nspath(ns)

ns: name space

string or nil

This method returns the path associated with the name space ns in the nsmap. It returns nil if no mapping matches ns.

each(fn)

fn: procedure

nil

This method calls fn for each mapping in the nsmap, in ascending order of mapping offset. Fn is passed the three properties of the mapping: the offset, the path name, and the name space. The return value of fn is ignored.
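The iteration contract of each can be modeled as follows. This Python sketch is an illustration only; the tuple representation of mappings, the placeholder name space strings, and the name each_model are assumptions made for the example.

```python
def each_model(mappings, fn):
    """Call fn(base, path, ns) once per mapping, lowest base first."""
    for base, path, ns in sorted(mappings, key=lambda m: m[0]):
        fn(base, path, ns)   # whatever fn returns is discarded
    return None

# Hypothetical mappings, listed out of base order on purpose.
mappings = [
    (0x7F0000000000, "/lib/libc.so.6", "ns:libc"),
    (0x400000, "/usr/bin/prog", "ns:prog"),
]

# Collect the paths in the order the callback sees them.
seen = []
each_model(mappings, lambda base, path, ns: seen.append(path))
```

Since the callback's return value is ignored, a callback that needs to produce a result has to record it through a side effect, as the list seen does here.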

copy()

nsmap

This method returns a new nsmap that is a copy of the nsmap. Subsequent updates to either nsmap have no effect on the other.

Function and method index

Methods are listed with the name of the defining object type in parentheses.

abort (sctlmux)
add (nsmap)
atnames
brk
brbrk
byaddr (nsmap)
byname (nsmap)
cast (ctl)
clear (nsmap)
copy (nsmap)
ctlattach
ctllaunch
ctx (ctl)
del (nsmap)
detach (ctl)
dialect (ctl)
dom (ctl)
dumpstacklocs
each (nsmap)
enumloc (ns)
exe (ctl)
exe (nsmap)
fmtframe
fmtldom
id (ctl)
kill (ctl)
ldom (ctl)
lookpc (ctl)
lookpc (ns)
looksrc (ctl)
looksrc (ns)
mem (ctl)
mknsmap
mksctlns
ns (ctl)
nsbase (nsmap)
nsmap (ctl)
nspath (nsmap)
rbrk
run (sctlmux)
setctx (nsmap)
setexe (nsmap)
sctlenumloc
sctlenumsym
sctlenumtype
sctllookaddr
sctllookpc
sctllooksrc
sctllooksym
sctllooktype
sctlnames
sctlping
sctlunwind1
sctlversion
setsnamesaddr
snamesns
snamesreconnect
snap (ctl)
stat (ctl)
trap (ctl)
traps (ctl)
unwind (ctl)
unwind1 (ns)