Adaptive fault probing

Static probe sites are inserted at compile time and remain dormant until activated at runtime, typically to generate tracing information on demand. Dynamic probes are added at runtime to adapt the system's behavior, for example to trace different parts of the operating system while diagnosing a problem. DTrace is probably the best-known recent implementation of both static probe sites and dynamic probes. SystemTap offers similar functionality for dynamic probes under Linux. The most important characteristics of static probe sites and dynamic probes are their ability to be inserted anywhere (including in interrupt and even non-maskable interrupt context), their low overhead (a minimal performance hit both when dormant and when activated) and their low disturbance (they do not change the real-time behavior of the system).
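As a rough illustration of a probe site that stays dormant until it is activated, the following C sketch guards the probe with a single atomic handler pointer. All names here (`probe_site`, `probe_fire`, `probe_activate`, `probe_deactivate`, `count_handler`) are hypothetical and are not taken from DTrace, SystemTap or LTTng. Production tracers typically avoid even this load and branch by patching the code at the site, so this should be read as a simplified model under those assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical handler signature: receives the site name and one argument. */
typedef void (*probe_handler_t)(const char *name, long arg);

/* Hypothetical descriptor, one per static probe site compiled into the program. */
struct probe_site {
    const char *name;
    _Atomic(probe_handler_t) handler;   /* NULL means the site is dormant */
};

/* Demo state updated by the demo handler below. */
static long events_seen;

static void count_handler(const char *name, long arg)
{
    (void)name;
    events_seen += arg;
}

/* The probe site itself: when dormant it costs one load and one branch. */
static inline void probe_fire(struct probe_site *site, long arg)
{
    probe_handler_t h =
        atomic_load_explicit(&site->handler, memory_order_acquire);
    if (h != NULL)
        h(site->name, arg);
}

/* Activation at runtime: publish the handler with release ordering. */
static void probe_activate(struct probe_site *site, probe_handler_t h)
{
    assert(h != NULL);
    atomic_store_explicit(&site->handler, h, memory_order_release);
}

/* Deactivation returns the site to its dormant state. */
static void probe_deactivate(struct probe_site *site)
{
    atomic_store_explicit(&site->handler, (probe_handler_t)NULL,
                          memory_order_release);
}
```

A caller would declare one `struct probe_site` per instrumented location, call `probe_fire` there unconditionally, and switch tracing on or off with `probe_activate` and `probe_deactivate` while the program runs.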

The group under the supervision of Michel Dagenais is working on static probe sites, providing an initial implementation for the mainline Linux kernel (the LTTng kernel tracer) and subsequently for user space (the LTTng user-space tracer), used by thousands of developers and installed on millions of computers around the world. The challenge is to simultaneously minimize the number of execution cycles required for a probe, including the effect of the added instructions on the memory cache, while not interfering with the real-time response of the system. Furthermore, probe activation should be possible even when the program may be executed simultaneously by several processors on a multi-core system. To achieve this, atomic operations local to a CPU, per-CPU buffers and data structures, and special techniques for patching the binary code of a running multi-threaded executable may be used.
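The idea of per-CPU buffers can be sketched in portable C as follows. This is only an approximation under stated assumptions: the names (`cpu_buffer`, `trace_write`, `trace_count`), the sizes and the 64-byte cache line are hypothetical, the CPU number is passed in by the caller instead of being queried from the system, and C11 atomics stand in for the cheaper CPU-local atomic operations that a kernel tracer could use. It is not LTTng's actual buffer design.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical number of CPUs */
#define SLOTS_PER_CPU  8   /* hypothetical ring size */

static_assert((SLOTS_PER_CPU & (SLOTS_PER_CPU - 1)) == 0,
              "SLOTS_PER_CPU must be a power of two");

/* One buffer per CPU, aligned so that two CPUs never share a cache line
 * (64 bytes is an assumption about the target hardware). */
struct cpu_buffer {
    alignas(64) atomic_size_t reserved;  /* count of slots reserved so far */
    uint64_t slots[SLOTS_PER_CPU];       /* ring storage, oldest overwritten */
};

static struct cpu_buffer buffers[NR_CPUS_SKETCH];

/* Record one event in the buffer owned by `cpu`.
 * The atomic increment reserves a slot, so a handler that interrupts
 * another handler on the same CPU gets a different slot without any lock.
 * Returns the sequence number of the reserved slot. */
static size_t trace_write(unsigned cpu, uint64_t payload)
{
    struct cpu_buffer *buf = &buffers[cpu % NR_CPUS_SKETCH];
    size_t seq = atomic_fetch_add_explicit(&buf->reserved, 1,
                                           memory_order_relaxed);
    buf->slots[seq % SLOTS_PER_CPU] = payload;
    return seq;
}

/* Number of events ever written on `cpu` (including overwritten ones). */
static size_t trace_count(unsigned cpu)
{
    return atomic_load_explicit(&buffers[cpu % NR_CPUS_SKETCH].reserved,
                                memory_order_relaxed);
}
```

Because each CPU writes only to its own buffer, probes running on different processors never contend for the same memory, which is the property that keeps the cost per probe low on multi-core systems.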

In addition, experimentation is under way on combining dynamic and static probes in a common framework and on alternative methods for inserting dynamic probes. Linux kernel kprobes and normal GDB tracepoints are based on costly trap instructions. Newer techniques, used in GDB fast tracepoints and in optimized kprobes, rely on jump patching and other optimisations for generating the probe handler code.
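The difference between the two insertion methods can be made concrete with the bytes that would be written at the probed address on x86. The sketch below only computes those bytes; it does not patch running code, and the function names are hypothetical. A trap-based probe overwrites a single byte with the breakpoint instruction, while a jump-based probe needs five bytes for a relative jump to the handler, which is why the displaced instructions must then be executed from another location.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X86_INT3       0xCC  /* one-byte breakpoint (trap) instruction */
#define X86_JMP_REL32  0xE9  /* opcode of jmp with a 32-bit displacement */
#define JMP_REL32_LEN  5     /* opcode byte plus four displacement bytes */

/* Trap-based probe: a single byte replaces the start of the instruction. */
static void encode_trap(uint8_t out[1])
{
    assert(out != NULL);
    out[0] = X86_INT3;
}

/* Jump-based probe: encode "jmp target" as it would appear at `site`.
 * The displacement is relative to the end of the 5-byte jump.
 * Returns 0 on success, -1 if the target is out of rel32 range. */
static int encode_jump(uint64_t site, uint64_t target,
                       uint8_t out[JMP_REL32_LEN])
{
    assert(out != NULL);
    int64_t delta = (int64_t)(target - (site + JMP_REL32_LEN));
    if (delta < INT32_MIN || delta > INT32_MAX)
        return -1;

    uint32_t rel = (uint32_t)(int32_t)delta;
    out[0] = X86_JMP_REL32;
    out[1] = (uint8_t)(rel & 0xFF);          /* little-endian displacement */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
    return 0;
}
```

The trap costs an exception entry and exit on every hit, whereas the jump transfers control directly, at the price of a larger patch and a limited reach for the handler address.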