ApportImprovements

Differences between revisions 8 and 11 (spanning 3 versions)
Revision 8 as of 2006-11-29 11:21:15
Size: 4892
Editor: 87
Comment: Mono's default exception handler is managed code, drop global variable there
Revision 11 as of 2007-02-28 13:05:00
Size: 4337
Editor: 87
Comment: add test cases
Line 6: Line 6:
 * '''Launchpad entry''': https://blueprints.launchpad.net/distros/ubuntu/+spec/apport-improvements
 * '''Packages affected''': apport, linux-source-2.6.19, python2.4, python2.5, mono
 * '''Launchpad entry''': UbuntuSpec:apport-improvements
 * '''Packages affected''': apport, linux-source-2.6.19, python2.4, python2.5
Line 21: Line 21:
 * F-Spot crashes. apport creates a report which contains the Mono backtrace.
Line 26: Line 25:
 * Add hooks to most common interpreters (Python/Mono) to intercept unhandled exceptions and create an apport report.
 * Add hooks to most common interpreters (Python for now) to intercept unhandled exceptions and create an apport report.
Line 34: Line 33:
 * 2.6.19 upstream does not support arguments for the called program. In order to avoid having to write and process the core dump, we want to use the `%p` and `%s` macros. Thus the kernel should first split the `core_pattern` value at spaces, consider the first field as the program path and the rest as arguments, and, after splitting, perform macro substitution (this will work correctly with `%e` containing spaces). Andi Kleen would welcome seeing this fixed upstream, but does not want to work on it himself.
 * 2.6.19 upstream does not support arguments for the called program. In order to avoid having to write and process the core dump, we will pass the macro values as environment variables. Andi Kleen would welcome seeing this fixed upstream, but does not want to work on it himself.
Line 48: Line 47:
=== Mono ===

There are two cases of crashes here: (1) due to an unhandled exception, and (2) due to a crash-related signal (SIGSEGV/SIGBUS/etc.). In both cases, Mono prints out a backtrace by default, which we want to capture for the apport report.

==== (1) Unhandled exception ====

After printing the stacktrace to stderr (current default behaviour), Mono's default exception handler additionally calls `/usr/share/apport/mono-hook` ''pid'' and pipes the stacktrace to its stdin. `mono-hook` will collect the usual generic data from `/proc` and create a proper crash report in `/var/crash/`.

==== (2) Crash due to signal reception ====

Mono's current signal handler already causes the Mono backtrace to be written to stderr. This needs to be changed to:

 0. write the backtrace to a global variable
 0. print out that variable to stderr to remain compatible with previous behaviour

Then apport's `report_add_gdb_info()` can fish out the value of that global variable from the core dump and add it to the report.
Line 69: Line 51:
= Test cases =

 * Crash of a process that cannot write into its cwd:
 
 {{{
 $ rm /var/crash/*
 $ /usr/lib/notification-daemon/notification-daemon &
 $ kill -SEGV %1
 $ ls /var/crash
 _usr_lib_notification-daemon_notification-daemon.1000.crash
 }}}

 and you'll get an apport window that reports the crash.

 * Catching Python crashes:
  
 {{{
 $ rm /var/crash/*
 $ sudo sed -i '2 s/import/imnport/' /usr/bin/serpentine
 $ /usr/bin/serpentine
 [...]
 $ ls /var/crash/
 _usr_bin_serpentine.1000.crash
 $ sudo sed -i '2 s/imnport/import/' /usr/bin/serpentine
 }}}
 
 and again you will get an apport crash notification.

 Both of these cases (as well as many others) are already checked in apport's own test suite. The self tests can be run with
 
 {{{
 $ /usr/share/apport/testsuite/run-tests
 }}}

Please check the status of this specification in Launchpad before editing it. If it is Approved, contact the Assignee or another knowledgeable person before making changes.

  • Launchpad entry: apport-improvements

  • Packages affected: apport, linux-source-2.6.19, python2.4, python2.5

Summary

We want to extend the range of crashes that apport can process, as well as make the process more efficient.

Rationale

In Edgy we do not get any useful information from crashed processes which cannot write into their cwd (like notification-daemon) or are terminated with SIGABRT due to an exception (like interpreted Python/Mono programs). To accelerate processing, the core dump should not be temporarily written to the disk at all.

Use cases

  • notification-daemon, a user-session daemon which does a chdir('/') at program start, crashes. apport is able to get a proper core dump and extract useful information from it.

  • serpentine crashes with an unhandled Python exception. apport picks this up and creates a report which contains the Python backtrace.

Design

  • Have the kernel pipe the core dump to apport instead of writing it to the disk temporarily. This is faster and also makes core dump creation independent of cwd writability.
  • Add hooks to the most common interpreters (Python for now) to intercept unhandled exceptions and create an apport report.

Implementation

Kernel

In 2.6.19, Andi Kleen committed a new feature for /proc/sys/kernel/core_pattern: it can now start with a pipe ('|'), in which case the remainder is interpreted as a path. That path is executed, and the core dump is piped to its stdin. We will base our solution on this; however, two modifications are still required:

  • 2.6.19 upstream does not support arguments for the called program. In order to avoid having to write and process the core dump, we will pass the macro values as environment variables. Andi Kleen would welcome seeing this fixed upstream, but does not want to work on it himself.
  • We do not want to generally enable core dumps, thus we need to leave the default ulimit -c at 0. If core_pattern is a pipe, the kernel should ignore the current ulimit -c; our kernel maintainers consider this safe, since the kernel does not actually write any file in that case. Instead, the called process can decide on an appropriate limit. This should be discussed with upstream and we should aim for their approval, so that eventually apport (and similar crash interception projects) work on a stock upstream kernel across all distributions.
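
The pipe convention and the environment hand-off described above can be illustrated with a small Python 3 sketch. It is only a model of the behaviour, not kernel or apport code, and the variable names CORE_PID and CORE_SIGNAL are hypothetical, since this specification does not name the environment variables.

```python
def parse_core_pattern(pattern):
    """Classify a core_pattern value as described for 2.6.19.

    A leading '|' means the remainder is the path of a program that
    receives the core dump on stdin; anything else is a file name pattern.
    """
    if pattern.startswith("|"):
        return ("pipe", pattern[1:])
    return ("file", pattern)


def read_crash_context(environ):
    """Collect crash metadata handed over through the environment.

    CORE_PID and CORE_SIGNAL are hypothetical names chosen for this
    sketch; they stand in for the values of the %p and %s macros.
    """
    return {
        "pid": int(environ["CORE_PID"]),
        "signal": int(environ["CORE_SIGNAL"]),
    }
```

A handler started by the kernel would call read_crash_context(os.environ) before consuming the core dump from stdin.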

apport

Apport needs to read the core dump from stdin if the core dump path is '-' (this is already implemented in Feisty). The init script needs to set core_pattern accordingly, to "|/usr/share/apport/apport %p %s -", where the trailing '-' is the core dump path.
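
As a rough illustration of the '-' convention, the following Python 3 sketch reads a core dump either from a file or from stdin. The function names open_core_dump and read_core_dump are hypothetical and are not taken from apport's source.

```python
import io
import sys


def open_core_dump(path, stdin=None):
    """Return a binary stream for the core dump.

    A path of '-' selects stdin (the kernel pipe); any other value is
    opened as a regular file.
    """
    if path == "-":
        stream = stdin if stdin is not None else sys.stdin
        # sys.stdin is a text wrapper; its .buffer attribute is the byte stream.
        return getattr(stream, "buffer", stream)
    return open(path, "rb")


def read_core_dump(path, stdin=None, chunk_size=1 << 20):
    """Read the whole core dump in chunks and return it as bytes."""
    source = open_core_dump(path, stdin)
    chunks = []
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    if path != "-":
        source.close()
    return b"".join(chunks)
```

For example, read_core_dump("-", io.BytesIO(b"core bytes")) returns the bytes from the supplied stream without touching the disk, which is the point of the piped design.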

We will also add a new Python package, apport (shipped in the python-apport deb), which provides a default Python exception handler that creates an apport report (this is already implemented in Feisty).
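
A default exception handler of this kind could look roughly like the Python 3 sketch below. It is not apport's actual implementation: the report file name, the field labels and the function make_exception_hook are assumptions made for illustration; only install() and /var/crash are named in this specification.

```python
import os
import sys
import traceback


def make_exception_hook(report_dir):
    """Build a sys.excepthook replacement that saves the traceback.

    The report layout is a hypothetical, simplified stand-in for a
    real apport report.
    """
    def hook(exc_type, exc_value, exc_tb):
        # Keep the default behaviour: print the traceback to stderr.
        sys.__excepthook__(exc_type, exc_value, exc_tb)
        if issubclass(exc_type, KeyboardInterrupt):
            return None  # a user interrupt is not a crash
        trace = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
        path = os.path.join(report_dir, "python-crash.%d.txt" % os.getpid())
        with open(path, "w") as report:
            report.write("ExecutablePath: %s\n" % sys.argv[0])
            report.write("Traceback:\n%s" % trace)
        return path
    return hook


def install(report_dir="/var/crash"):
    """Register the handler for all otherwise unhandled exceptions."""
    sys.excepthook = make_exception_hook(report_dir)
```

Chaining to sys.__excepthook__ first means the user still sees the usual traceback even if writing the report fails.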

Python 2.4/2.5

site.py should try to import apport.python_hook, and if that succeeds, call the apport exception handler (python_hook.install()). See https://launchpad.net/bugs/70957 for details and patch.
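
The guarded import described above might be sketched as follows in Python 3. This is not the patch from the bug report; the helper name try_install_hook is hypothetical, and only apport.python_hook and install() come from this specification.

```python
import importlib


def try_install_hook(module_name="apport.python_hook"):
    """Install the crash hook if the hook module is available.

    Returns True when install() was called and False when the module
    could not be imported, so that Python starts normally on systems
    without the apport package.
    """
    try:
        hook_module = importlib.import_module(module_name)
    except ImportError:
        return False
    hook_module.install()
    return True
```

Swallowing only ImportError keeps the interpreter start-up unaffected when the package is absent, while errors raised inside install() itself still surface.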

Data preservation and migration

Not required.

Test cases

  • Crash of a process that cannot write into its cwd:
     $ rm /var/crash/*
     $ /usr/lib/notification-daemon/notification-daemon &
     $ kill -SEGV %1
     $ ls /var/crash
     _usr_lib_notification-daemon_notification-daemon.1000.crash
    and you'll get an apport window that reports the crash.
  • Catching Python crashes:
     $ rm /var/crash/*
     $ sudo sed -i '2 s/import/imnport/' /usr/bin/serpentine
     $ /usr/bin/serpentine
     [...]
     $ ls /var/crash/
     _usr_bin_serpentine.1000.crash
     $ sudo sed -i '2 s/imnport/import/' /usr/bin/serpentine
    and again you will get an apport crash notification.
    Both of these cases (as well as many others) are already checked in apport's own test suite. The self tests can be run with
     $ /usr/share/apport/testsuite/run-tests 
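
The report file names in the listings above appear to follow a simple pattern: the executable path with '/' replaced by '_', then the user ID, then ".crash". The Python 3 sketch below encodes that pattern as inferred from the two examples only; it is an assumption, not apport's documented naming rule, and crash_report_name is a hypothetical helper.

```python
def crash_report_name(executable_path, uid):
    """Derive a report file name from an executable path and a user ID.

    The rule is inferred from the example listings in the test cases.
    """
    return "%s.%d.crash" % (executable_path.replace("/", "_"), uid)
```

A test script could use such a helper to check for the expected file in /var/crash after provoking a crash.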


CategorySpec

ApportImprovements (last edited 2008-08-06 16:30:26 by localhost)