AutomatedCrashReporting

Automated crash reporting

Scope

The idea is to automatically generate a bug report when a (packaged) application crashes. A few applications (Mozilla, OpenOffice) already do this, but it would improve QA and help debugging if this could easily be extended to arbitrary applications.

GNOME has bug-buddy, but it is far too complicated for end users and is limited to GNOME applications; we therefore need a simple frontend, preferably written in python-gtk. In addition, we currently cannot extract a useful stack trace from a crash because our binaries are automatically stripped.

Components

  • HOOK: hooks into applications to intercept signals, and calls INFO (in C)
  • INFO: frontend program that collects data and asks the user for additional information and how to proceed
  • EMIT: sends data to QA host (in Python)
  • DB: Database with stored reports, and a web frontend for processing them
  • Malone integration

HOOK

  • LD_PRELOAD library
    • least intrusive
    • no recompilation of packages necessary
    • LD_PRELOAD might be unset in the user's environment, or its use might be disabled completely for security reasons -> would not work every time
    • applies to _every_ executable, including user-developed ones (which we don't want reports for); however, at crash time these can be filtered out easily
  • patch libc6
    • no recompilation of packages necessary
    • works every time
    • not portable to any other system; if the patch is accepted in Debian, fine, otherwise it has to be kept in sync
    • applies to _every_ executable, also to user developed ones (which we don't want reports for)
  • provide shared library, link it to application
    • most robust
    • allows choosing individually which packages to monitor
    • recompilation of all packages necessary

When HOOK intercepts a critical signal (SIGSEGV, SIGILL, or SIGFPE), it calls INFO and waits until it is finished. It terminates or continues the application according to INFO's return value.
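
The hand-off described above can be sketched as follows. HOOK itself is planned in C, so this Python sketch only illustrates the control flow: start INFO, block until it exits, and map its return value to an action. The exit-code convention (0 means continue, anything else means terminate), the argument order, and the name `run_info` are assumptions made for illustration.

```python
import subprocess
import sys

# Assumed convention (not fixed by the spec): INFO exits with 0 to request
# "continue" and with any other status to request "terminate".
CONTINUE, TERMINATE = "continue", "terminate"


def run_info(info_cmd, pid, signal_name):
    """Run INFO, block until it finishes, and map its exit status to an action.

    info_cmd is the argv prefix of a hypothetical INFO program; the PID and
    the signal name are appended as arguments.
    """
    result = subprocess.run(list(info_cmd) + [str(pid), signal_name])
    return CONTINUE if result.returncode == 0 else TERMINATE


if __name__ == "__main__":
    # Stand-in for INFO: a child Python process that exits with status 1.
    fake_info = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_info(fake_info, 1234, "SIGSEGV"))
```

Because `subprocess.run` does not return until the child exits, the crashed application stays suspended in its handler for as long as the user interacts with INFO.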

INFO

Collects data:

  • signal name
  • stack trace
  • information from /proc/$PID/: cmdline, environ, maps, status
  • package name and version
  • debconf settings?
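
A minimal sketch of the /proc part of this collection, assuming a Linux-style /proc layout. The function name `collect_proc_info` and the `proc_root` parameter (which lets the function be pointed at a test directory) are hypothetical; signal name, stack trace, and package lookup are left out.

```python
from pathlib import Path

# The four /proc entries named in the list above.
PROC_FILES = ("cmdline", "environ", "maps", "status")


def collect_proc_info(pid, proc_root="/proc"):
    """Return a dict mapping each /proc/<pid> file name to its text content.

    Files that are missing or unreadable map to None, because the crashed
    process may disappear or deny access while it is being inspected.
    """
    info = {}
    for name in PROC_FILES:
        path = Path(proc_root) / str(pid) / name
        try:
            raw = path.read_bytes()
        except OSError:
            info[name] = None
            continue
        # cmdline and environ are NUL-separated; make them readable.
        if name in ("cmdline", "environ"):
            sep = b" " if name == "cmdline" else b"\n"
            raw = raw.rstrip(b"\0").replace(b"\0", sep)
        info[name] = raw.decode("utf-8", errors="replace")
    return info
```

Note that environ may contain sensitive values, so a real INFO would probably want to show the collected data to the user before sending it.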

Informs the user that the application has crashed unexpectedly and asks:

  • what they were doing immediately before the crash, to help reproduce the bug
  • whether and how to send the data (SMTP, HTTP, output to file and manual processing)
  • how to proceed (continue or terminate application) -> return value

When DISPLAY is set, this should be a graphical application (Python GTK), otherwise it is a simple command line dialog.
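
The frontend choice could look roughly like this. The labels "gtk" and "cli" and the function name are hypothetical, and treating an empty DISPLAY like an unset one is an assumption.

```python
import os


def choose_frontend(environ=None):
    """Return "gtk" when DISPLAY is set to a non-empty value, else "cli"."""
    env = os.environ if environ is None else environ
    return "gtk" if env.get("DISPLAY") else "cli"
```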

Finally, INFO formats the report and passes it to EMIT, together with the chosen send method.

EMIT

  • one backend for each send method (HTTP, SMTP, file, ...)
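
One possible shape for this, sketched with the Python standard library only: a table that maps each send method to a backend function. All function names, the method keys, the placeholder sender address, and the plain-text report format are assumptions; the spec does not define them.

```python
import smtplib
import urllib.request
from email.message import EmailMessage


def send_file(report, target):
    """Write the report to a local file for manual processing."""
    with open(target, "w", encoding="utf-8") as fh:
        fh.write(report)


def send_http(report, target):
    """POST the report as plain text to the URL in target."""
    req = urllib.request.Request(
        target,
        data=report.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        resp.read()


def send_smtp(report, target, host="localhost"):
    """Mail the report to the address in target via an SMTP relay."""
    msg = EmailMessage()
    msg["Subject"] = "Crash report"
    msg["From"] = "crash-reporter@localhost"  # placeholder sender
    msg["To"] = target
    msg.set_content(report)
    with smtplib.SMTP(host) as smtp:
        smtp.send_message(msg)


# One backend per send method; adding a method means adding one entry.
BACKENDS = {"file": send_file, "http": send_http, "smtp": send_smtp}


def emit(report, method, target):
    """Dispatch the report to the backend registered for method."""
    try:
        backend = BACKENDS[method]
    except KeyError:
        raise ValueError(f"unknown send method: {method!r}") from None
    backend(report, target)
```

For example, `emit(report, "file", "/tmp/crash-report.txt")` would cover the "output to file and manual processing" case without touching the network.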

Debug symbols

  • Debug symbols should not be shipped in the standard debs since they have a considerable size.
  • gdb offers the "symbol-file" command, which allows loading an external symbol file even after a crash.
  • This symbol file should be generated on the buildd (hook into dh_strip) and made available on our server, from where the agent can download it.
  • Downloaded symbols should be cached locally; this also allows preseeding the cache.
  • The existing prototype is already able to extract the package version of a crashed executable, so we can download the right version of debug symbols.
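
The caching idea could be sketched like this. The cache layout (one directory per package, one `.dbg` file per version) and the names `symbol_cache_path` and `get_symbols` are hypothetical, and the actual download is passed in as a callable because the server protocol is not specified here.

```python
from pathlib import Path


def symbol_cache_path(cache_dir, package, version):
    """Return where the symbol file for package/version lives in the cache."""
    return Path(cache_dir) / package / f"{version}.dbg"


def get_symbols(cache_dir, package, version, fetch):
    """Return the path of a cached symbol file, downloading it on a miss.

    fetch(package, version) is a caller-supplied callable that returns the
    symbol file as bytes; how it reaches the server is left open here.
    """
    path = symbol_cache_path(cache_dir, package, version)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch(package, version))
    return path
```

Preseeding then amounts to placing files at the expected cache paths in advance, in which case `fetch` is never called for those versions.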

Proof of concept:

$ gcc -g -Wall -W -O2 crash.c -o crash
$ objcopy --only-keep-debug crash crash.dbg
$ strip crash
$ gdb ./crash
(gdb) run
Program received signal SIGSEGV, Segmentation fault.
0x080483a3 in ?? ()
(gdb) bt
#0  0x080483a3 in ?? ()
#1  0xbffffa18 in ?? ()
#2  0x080483da in ?? ()
[...]
(gdb) symbol-file crash.dbg
Load new symbol table from "crash.dbg"? (y or n) y
Reading symbols from crash.dbg...done.
(gdb) bt
#0  g (x=1, y=1) at crash.c:13
#1  0x080483da in f (x=1) at crash.c:20
#2  0x080483f5 in main () at crash.c:28
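
For an automated agent, the interactive session above would have to run without prompts. A rough sketch, assuming gdb is installed and that its batch mode (which suppresses confirmation questions) is acceptable; the function names are hypothetical, and paths containing spaces are not handled.

```python
import subprocess


def gdb_backtrace_command(executable, symbol_file):
    """Build a gdb argv that replays the session above without prompts."""
    return [
        "gdb", "--batch",                     # exit after the commands, no prompts
        "-ex", "run",                         # run until the crash
        "-ex", f"symbol-file {symbol_file}",  # load the external symbols
        "-ex", "bt",                          # print the symbolic backtrace
        executable,
    ]


def symbolic_backtrace(executable, symbol_file):
    """Run gdb and return everything it printed on stdout."""
    cmd = gdb_backtrace_command(executable, symbol_file)
    return subprocess.run(cmd, capture_output=True, text=True).stdout
```

Calling `symbolic_backtrace("./crash", "crash.dbg")` would correspond to the transcript above, with the backtrace appearing in the returned text.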

Prototypes



UbuntuDownUnder/BOFs/AutomatedCrashReporting (last edited 2008-08-06 16:38:09 by localhost)