ApportRetraces

This page is part of the Bug Squad’s KnowledgeBase - pages with information about how to triage bugs.

This page is based on the MOTU/School/IntrepretingApportRetraces session presented by EmmetHikory on 26/6/2008.

What is Apport

Apport is a useful tool installed on Ubuntu machines that collects information when applications crash, and automatically prompts the user to submit a crash report as a bug to Launchpad. This document is a description of how to interpret, triage, and work with such bug reports.

When a program crashes, Apport notices and prepares a crash report. If so enabled, it prompts the user to submit this crash report to Launchpad as a new bug.

Apport-created Bug Reports

Apport bug reports typically state the type of crash in the title and carry some information in the description, along with basic details about the package version, installed Ubuntu version, and user environment. This is followed by a single comment with a number of attachments containing further information Apport collected from the user's system.

A crash bug report is automatically tagged with need-architecture-retrace, where architecture is the reporter's architecture (for example, need-amd64-retrace), and is marked private (accessible only to the reporter and Apport). The apport-retracer system automatically reviews these bugs, marks them as duplicates where it can identify one, and adds further information (e.g. a retraced stack trace with symbols for stripped binaries). Afterwards it removes the retracing tag and makes the bug accessible to the Ubuntu bug triaging teams.
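As a rough illustration of the tagging convention above, the following Python sketch checks whether a bug's tag list still carries a retrace tag. The function name and the sample tag list are hypothetical; only the need-architecture-retrace pattern comes from the text.

```python
import re

# Assumption: retrace tags follow the need-<architecture>-retrace pattern
# described above (for example, need-amd64-retrace).
RETRACE_TAG = re.compile(r"^need-([a-z0-9]+)-retrace$")

def pending_retrace_architectures(tags):
    """Return the architectures named by any retrace tags still on the bug."""
    found = []
    for tag in tags:
        match = RETRACE_TAG.match(tag)
        if match:
            found.append(match.group(1))
    return found

# Illustrative tag list, not taken from a real bug.
print(pending_retrace_architectures(["example-tag", "need-amd64-retrace"]))
```

An empty result suggests the retracer has already processed the report; a non-empty one suggests the bug is still waiting in the retracing queue.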

If the retracer fails to create a usable stack trace and discovers that the user has some obsolete package versions installed, it marks the bug as invalid with an appropriate explanation.

Apport attached files

Some commonly-seen files attached to apport-generated bug reports are:

  • Dependencies.txt

  • ProcMaps.txt

  • ProcStatus.txt

  • Traceback.txt

The above set of four files is common for Python-crash bugs. Some other reports may also include:

  • CoreDump.gz

  • ProcEnviron.txt

  • Registers.txt

  • Stacktrace.txt

  • ThreadStacktrace.txt

Dependencies.txt

Dependencies.txt shows the entire tree of recursive dependencies for the crashing package. This helps determine whether the reporter has upgraded to the latest versions and, if the bug appears to lie in a dependency, which version of that code to inspect.

ProcMaps.txt

ProcMaps.txt shows the local address space for associated objects (libraries, data) used by the application. It may provide insight in cases of library symbol contention and the like.

ProcStatus.txt

ProcStatus.txt provides information about the process itself, including the pid, access permissions, memory allocations, and capabilities. One useful check is to make sure the memory allocations (the Vm* fields) are not overly large (100s of GB), which may indicate a runaway memory leak.
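The memory check can be done by eye, but a minimal Python sketch is shown here for larger batches of reports. It assumes ProcStatus.txt uses the usual /proc/PID/status layout ("VmSize: 1234 kB"); the function name, the 100 GB threshold, and the sample text are all illustrative rather than taken from a real report.

```python
import re

# Assumption: 100 GB, expressed in kB, as a stand-in for "overly large".
THRESHOLD_KB = 100 * 1024 * 1024

def oversized_vm_fields(status_text, threshold_kb=THRESHOLD_KB):
    """Return {field: size_in_kB} for Vm* fields at or above threshold_kb."""
    flagged = {}
    for line in status_text.splitlines():
        match = re.match(r"(Vm\w+):\s+(\d+)\s+kB", line)
        if match and int(match.group(2)) >= threshold_kb:
            flagged[match.group(1)] = int(match.group(2))
    return flagged

# Invented sample in the /proc/PID/status style.
sample = """Name:\tnicotine
VmPeak:\t  210000000 kB
VmSize:\t  209715200 kB
VmRSS:\t     51200 kB
"""
print(oversized_vm_fields(sample))
```

Any field reported here is only a hint that memory use deserves a closer look, not proof of a leak.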

Traceback.txt

Traceback.txt can be very useful indeed. It explains the call stack at the time of the crash, and will guide the investigation of the code towards determining the problem exactly.

CoreDump.gz

CoreDump.gz is the snapshot of the process memory when it crashed. Most of the time, you won't need to use this.

ProcEnviron.txt

ProcEnviron.txt provides some of the environment variables that were set for the crashed process.

Registers.txt

Registers.txt contains the state of the processor registers at the time of the crash (which is typically only meaningful for very low-level crashes).

Stacktrace.txt

Stacktrace.txt is similar to Traceback.txt, except that it is a C-style stack trace rather than a Python-style traceback.

ThreadStacktrace.txt

ThreadStacktrace.txt is a stacktrace of all the currently running threads in the application, which can be interesting if there is thread contention.

Apport and Bug Triage

When triaging an Apport crash bug, it's important to make sure it's complete. If there isn't a trace, or it is unreadable, the report may need to be retraced. If the bug is old, it may be worth asking the submitter to submit an updated crash report.

For triage, the information provided by Apport itself is often sufficient. If the information in the report is complete (no ?? in the traces; see bug #239842 for an example of this issue), one can likely understand the issue.
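The completeness check can be sketched in Python as follows. This assumes Stacktrace.txt contains gdb-style frame lines (starting with "#0", "#1", and so on); the function name and the sample trace are invented for illustration.

```python
import re

# Matches a gdb-style frame whose function name is the unresolved marker "??".
UNRESOLVED = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?\?\?\s*\(")

def unresolved_frames(stacktrace_text):
    """Return the frame lines whose symbol could not be resolved."""
    return [line for line in stacktrace_text.splitlines()
            if UNRESOLVED.match(line)]

# Invented sample trace; library and function names are placeholders.
sample_trace = """#0  0x00007f3a5c1e4a10 in ?? () from /usr/lib/libexample.so.0
#1  0x00007f3a5c1e4b22 in example_run (ctx=0x0) at example.c:42
#2  0x0000000000401136 in ?? ()
#3  0x0000000000401200 in main ()
"""
print(len(unresolved_frames(sample_trace)), "unresolved frame(s)")
```

A report with no unresolved frames is a reasonable candidate for code investigation; one with many probably needs a retrace first.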

It is good practice to put a short summary of the issue in a comment before setting "Triaged", explaining where or why it crashed, as a result of the code investigation. Often, the code investigation is sufficient to produce a small patch, which makes a welcome attachment.

Analysis of an Apport-created bug

As an example, consider Bug #180363 for package nicotine.

The first thing to note is that the version of nicotine has changed between the time the bug was reported, and our current source. This means that the line numbers will not be reliable. This is very frequently true when investigating Apport crashes.

Often, the best solution is to search the file for the relevant section. So, when reviewing a traceback, start at the top and work downwards. Each step represents another layer of nested function call, leading to the crash.

In this example traceback, the first error is in /usr/bin/nicotine. Looking at the source, there is a Python script at the top level also named nicotine. Generally, this will be the same file. In those cases where the file being sought is not immediately available, a construction like

find . -name "$filename" -print

is useful to locate the file in the source code, where $filename is set to the name of the file being sought.

The traceback says line 152, which is supposed to be "result = checkenv()".

In the newer source, line 152 is "import locale", which is clearly not the issue. Scrolling down from here a bit, one can find "result = checkenv()" at line 191.

Although source code sometimes has multiple lines with roughly the same content, this is very likely to be the correct line because it is still at the top level of the script, rather than inside a specific function (and the traceback reports "in <module>").

At this point, it would be useful to get some idea of the expected result. Scrolling down a bit, it becomes clear that if checkenv() returns no result the program continues, while any result is printed and the program exits.

Next, look for the definition of checkenv(). This is up at line 49, and now a gettext call in checkenv() must be found. gettext is very commonly used through the shorthand _(), which makes the relevant call a little hard to pick out.

One approach is to guess that there are about 40 extra lines added, given the difference between lines 152 and 191 from our previous search, so look for something somewhere between lines 90 and 130 (although this is inexact).
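The arithmetic behind that guess can be written out as a small Python sketch. It assumes that all the changes before the known line are insertions (or all deletions), so a line earlier in the file has moved by somewhere between zero and the full shift. The function name is hypothetical, and 90 is taken as the old line number because it is the start of the range suggested above.

```python
def search_window(old_target, old_anchor, new_anchor):
    """Estimate the line range in the newer file that should hold old_target.

    old_anchor and new_anchor are the old and new line numbers of a statement
    already located in both versions; old_target lies before the anchor.
    """
    shift = new_anchor - old_anchor
    low, high = sorted((old_target, old_target + shift))
    return low, high

# "result = checkenv()" moved from line 152 to line 191, a shift of 39 lines.
print(search_window(90, 152, 191))
```

The window is only a starting point: if lines were both added and removed between versions, the target may fall outside it.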

As an additional check, look further down the traceback. While sometimes the error is in the library (gettext in this example), first search the code.

Reading the traceback, it calls gettext, which calls dgettext, which calls translation, which calls __init__, which calls _parse, which crashes. The final error message tells us that the list index is out of range.
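For long tracebacks, the call chain can be extracted mechanically. The Python sketch below assumes the standard Python traceback layout; the function name is hypothetical, and the sample is a reconstruction whose gettext.py path and every line number other than 152 are invented placeholders, not copied from the bug.

```python
import re

# One frame header per call: File "<path>", line <n>, in <function>
FRAME = re.compile(r'^\s*File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')

def call_chain(traceback_text):
    """Return (path, line, function) for each frame, outermost first."""
    frames = []
    for line in traceback_text.splitlines():
        match = FRAME.match(line)
        if match:
            frames.append((match.group("path"),
                           int(match.group("line")),
                           match.group("func")))
    return frames

# Reconstructed sample; paths and line numbers below 152's frame are invented.
SAMPLE = '''Traceback (most recent call last):
  File "/usr/bin/nicotine", line 152, in <module>
    result = checkenv()
  File "/usr/lib/python2.5/gettext.py", line 10, in gettext
  File "/usr/lib/python2.5/gettext.py", line 20, in dgettext
  File "/usr/lib/python2.5/gettext.py", line 30, in translation
  File "/usr/lib/python2.5/gettext.py", line 40, in __init__
  File "/usr/lib/python2.5/gettext.py", line 50, in _parse
IndexError: list index out of range
'''
print(" -> ".join(func for _, _, func in call_chain(SAMPLE)))
```

The last entry in the chain is the function that raised the error, which is where reading of the library code would start if the application code turns out to be correct.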

This looks like an attempt to translate a plural word that couldn't find the right information in the translation file. Now, reading 40 lines of code to determine which might have a poor plural translation would be an exercise in frustration. Luckily, Launchpad keeps a copy of every source package ever uploaded, so one can look back to see which line is really wanted. Going back to the main bug page and clicking "nicotine" takes one to the nicotine package page. On this page, scroll down through the versions looking for 1.8.2+dfsg-1ubuntu1. Clicking on this headline leads to the summary page for this version, where the source can be examined.

In this case, line 90 is close to the string "You do not have Python Vorbis bindings installed. Others will not be able to see the lengths and the bitrates of Ogg Vorbis files that you share. You can get the from http://www.andrewchatham.com/pyogg/. If you're using Debian, install the python-pyvorbis package."

Returning to the current example source, that string starts at line 111.

The issue is with the Hungarian translation, because the example bug page says "LANG=hu_HU.UTF-8". The nicotine source package has a languages/hu directory, containing nicotine.po, which holds the translations. Here, clearly there is a translation available, from around line 4469.
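The step from the reported locale to the translation file can be sketched in Python. This assumes the languages/LANGUAGE/PACKAGE.po layout described above for nicotine; other packages may organise translations differently, and the function name and the SHELL line in the sample are invented.

```python
def po_path_for(environ_text, package="nicotine"):
    """Map the LANG value in ProcEnviron-style text to a .po file path.

    Assumes the languages/<language>/<package>.po layout; returns None
    when no LANG variable is present.
    """
    for line in environ_text.splitlines():
        key, _, value = line.strip().partition("=")
        if key == "LANG" and value:
            # hu_HU.UTF-8 -> hu_HU -> hu
            language = value.split(".")[0].split("_")[0]
            return "languages/%s/%s.po" % (language, package)
    return None

# Illustrative environment text; only the LANG line comes from the example bug.
print(po_path_for("SHELL=/bin/bash\nLANG=hu_HU.UTF-8\n"))
```

The returned path is only a guess at where to look; the source tree should still be checked for the actual directory name.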

Without a good grasp of Hungarian, about all one can do is to trust that the translation is likely correct (although it may not be). However, in order to check for the crash, it is clear that testing must be done with a Hungarian locale, and tried with the python-pyvorbis package uninstalled.

Uninstalling python-pyvorbis, and executing

LANG=hu_HU.UTF-8 nicotine

is, therefore, a useful way to try to reproduce the crash. If it breaks, the problem can now be described in detail (or, given skills in any of Hungarian, Python, or gettext, perhaps tracked down). Conversely, if it works properly, that fact can be documented, and the submitter can be asked to verify that it works for them, with installing the python-pyvorbis package suggested as a workaround.

This example Apport bug report would greatly benefit from someone adding a comment about the cause of the problem, and testing to see if it can be replicated locally. If anyone knows Hungarian, Python, or gettext, they may be able to provide more insight.

Triaging crash dump bug reports not processed by Apport

Sometimes bugs are submitted with a single crash report file attached, not processed by Apport. In general, it is best to close these as invalid, asking the submitter to open a new bug by either double clicking on the crash file in Nautilus or by executing

apport-cli -c "/var/crash/..."

so that Apport can do its thing.

It is important to explain that you are closing the report only because the above procedure will open a new bug report; you are not rejecting their bug.

In cases where the retracer cannot find the symbols (?? in the trace), it usually means that no ddeb exists for the version the submitter is using. In these cases, verify that ddebs for all dependencies are available for the release the submitter is using, and then ask them to reproduce the crash and open a new bug.

Failed Apport retraces

Apport can get confused if the package doesn't call dh_strip somewhere in debian/rules: manual stripping or failure to strip binaries are the most common causes of failed retraces.



Bugs/ApportRetraces (last edited 2012-01-18 16:15:52 by petermatulis)