The Server Bug Triage Process
This topic aims to help bug triagers follow a consistent process when triaging bugs in the Ubuntu Server packageset.
What is Bug Triage? It's the process of assessing new bugs, resulting in one of three states:
- Triaged/Decided - there is enough information to reproduce and decide the importance of the bug.
- Invalid - the bug is not a bug - it's a configuration, local environment or other unrelated error.
- Expired - insufficient information was provided to confirm the bug within 60 days of requesting said information.
This ensures that bugs are worked on in priority order with sufficient information to be able to fix the original bug report.
What Bug Triage is not:
- Fixing bugs: this happens after triage
- A support function: if people need help they should use IRC, the ubuntu-server mailing list, askubuntu.com or Launchpad Answers.
This topic should be read in conjunction with the rich information provided by the Ubuntu Bug Squad about bug triage.
In order to drive the Triage process the server team break it down into two distinct phases.
Phase 1: Assess the importance of the bug
The objective of the first phase of triage is to gather sufficient information to determine:
- Whether the reported bug is actually a bug.
- What the overall importance of the bug is.
Determining the importance of a bug matters because it allows the team to prioritize effort in the following phase of the triage process.
Bugs with the following status combinations fall into this phase of triage:
- Incomplete/Undecided (with response)
- Confirmed/Undecided (although this state should only happen when Launchpad automatically confirms a bug).
The importance of a bug can be assessed in a number of ways:
Insufficient information provided to set importance
The bug reporter may not have provided sufficient information about the impact of the bug to determine its importance during this phase of triage. If this is the case, ask the reporter for further information about the impact of the bug (you might also want to request information on how to reproduce it if this is not provided - it saves time later). Mark the bug 'Incomplete' and request that the reporter set the status back to 'New' once they have provided the information (this stops the auto-expiry clock).
Duplicate bugs as an indicator of importance
At the moment the New/Undecided queue also receives apport bugs for install/upgrade failures and for program crashes.
These often have minimal information provided by the bug reporter, so looking at bug trends is often useful for this type of bug. If we get a large number of bug reports of this type for a specific version of package X then this may be a good indicator that something is amiss; the importance of the bug can then be assessed based on the number of bug reports.
Work is underway to automatically mark this type of bug as duplicates of each other which will remove some of the manual effort in this area.
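Until that automation exists, the trend check described above can be approximated by hand. The following is a minimal sketch only: the report dictionaries, the `crash_report_trends` name and the threshold value are all hypothetical and are not part of any server team tooling.

```python
from collections import Counter


def crash_report_trends(reports, threshold=5):
    """Count reports per (package, version) and keep those at or above threshold.

    `reports` is assumed to be an iterable of dicts with 'package' and
    'version' keys; this shape is an illustration, not a Launchpad format.
    """
    counts = Counter((r["package"], r["version"]) for r in reports)
    return {key: n for key, n in counts.items() if n >= threshold}


# Hypothetical data: three crash reports against one version of "package-x".
reports = [{"package": "package-x", "version": "1.0"}] * 3
reports.append({"package": "package-y", "version": "2.0"})
print(crash_report_trends(reports, threshold=3))
```

A cluster that crosses the threshold is only a hint that something may be amiss; the importance still needs to be set by a person.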
Number of people affected as an indicator of importance
Looking at the 'Number of people affected by the bug' information may also be helpful - see Bug 970679 for an example of this.
Note that once someone else marks themselves as impacted by a bug it automatically gets confirmed by Launchpad.
This first phase of triage will often pick up localized configuration or environment issues outside the control of the server team. Technically these are not bugs and should be marked 'Invalid' - however offering some advice on how to resolve them is always a nice idea.
Local configuration change is the cause of the bug report
Some configuration changes that users make can cause bugs; some are due to incomplete configuration. Assessing which of these is the case can be tricky.
For example, in Bug 996968 the reporter has configured samba to bind to specific network interfaces; however the upstart configuration has not been updated to support this change. Marking this as 'Invalid' with some suitable advice on how to fix is correct.
If in doubt it's probably best to set an initial importance based on the impact and then proceed to Phase 2 of triage to try to confirm - this may result in the bug being marked as 'Invalid' later in the process.
Using 'unofficial' software
Sometimes users manage to report bugs on packages from other sources - mark these as 'Invalid' and remind the user to only report bugs about packages distributed by Ubuntu.
This section covers other scenarios you may encounter during this phase of triage.
Package not up-to-date
If the bug is reported against a package that is not up-to-date and it looks like the version in -updates (or one waiting in -proposed) might resolve the issue, ask the reporter to update and then try to reproduce the issue. Mark the bug 'Incomplete' and request that the reporter either set the bug back to 'New' if the problem persists or mark it as 'Invalid' if the issue is resolved.
rmadison <pkg> is useful for determining the latest versions of a package:
    rmadison samba
    [...]
    samba | 2:3.6.3-2ubuntu2   | precise          | source, amd64, armel, armhf, i386, powerpc
    samba | 2:3.6.3-2ubuntu2.1 | precise-security | source, amd64, armel, armhf, i386, powerpc
    samba | 2:3.6.3-2ubuntu2.1 | precise-updates  | source, amd64, armel, armhf, i386, powerpc
    samba | 2:3.6.3-2ubuntu2.2 | precise-proposed | source, amd64, armel, armhf, i386, powerpc
    samba | 2:3.6.5-3ubuntu1   | quantal          | source, amd64, armel, armhf, i386, powerpc
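If you check versions often, the pipe-separated rows can be read programmatically. This is a rough sketch that assumes output shaped like the samba example in this topic; `parse_rmadison` and `candidate_versions` are hypothetical helper names, and the code does not compare Debian version strings or handle suites that carry a component suffix.

```python
SAMPLE = """\
samba | 2:3.6.3-2ubuntu2 | precise | source, amd64, armel, armhf, i386, powerpc
samba | 2:3.6.3-2ubuntu2.1 | precise-security | source, amd64, armel, armhf, i386, powerpc
samba | 2:3.6.3-2ubuntu2.1 | precise-updates | source, amd64, armel, armhf, i386, powerpc
samba | 2:3.6.3-2ubuntu2.2 | precise-proposed | source, amd64, armel, armhf, i386, powerpc
samba | 2:3.6.5-3ubuntu1 | quantal | source, amd64, armel, armhf, i386, powerpc
"""


def parse_rmadison(output):
    """Map each suite (for example 'precise-updates') to the version listed for it."""
    versions = {}
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 4:
            continue  # skip blank lines and anything that is not a result row
        _package, version, suite, _architectures = fields
        versions[suite] = version  # simplification: a later row for the same suite wins
    return versions


def candidate_versions(versions, release):
    """Return the versions in -updates and -proposed for a release, if listed."""
    pockets = (f"{release}-updates", f"{release}-proposed")
    return {suite: versions[suite] for suite in pockets if suite in versions}


print(candidate_versions(parse_rmadison(SAMPLE), "precise"))
```

Comparing the result against the version in the bug report is still a manual step here; a string comparison is not a reliable way to order package versions.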
You can see changelogs in Launchpad.
Sometimes bugs get reported against older, out-of-support releases. These should be marked as 'Invalid' with a suitable message to the reporter to upgrade. They can always report the bug again if it still happens after the upgrade.
Missing Apport information
Apport can provide additional information about the environment where the bug has occurred. However, if the bug reporter has provided sufficient information it's probably not worth requesting this information.
Logged against the wrong package
If the bug has been reported against the wrong package, update the details to be for the correct source package. This may or may not move it outside of the server triage process.
Marking a bug as 'Incomplete' starts the timer for auto expiry of the bug. If the reporter does not respond within 60 days then the bug will be expired. Note that this timer gets reset if the bug is updated in any way. However, if the reporter responds but does not set the status back to 'New' then the bug may still auto-expire, which is why reviewing the 'Incomplete/Undecided (with response)' queue is still important.
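To make the 60-day window concrete, here is a small date calculation. It is a sketch under a stated assumption: that the clock runs from the last update to the bug and that expiry happens once 60 full days have passed. The exact moment at which Launchpad expires a bug is not described in this topic, and the function names are hypothetical.

```python
from datetime import date, timedelta

EXPIRY_DAYS = 60  # the expiry window given in this topic


def expiry_date(last_updated):
    """Earliest date on which an 'Incomplete' bug could expire (assumed rule)."""
    return last_updated + timedelta(days=EXPIRY_DAYS)


def is_expired(last_updated, today):
    """True once the assumed 60-day window since the last update has passed."""
    return today >= expiry_date(last_updated)


print(expiry_date(date(2012, 5, 1)))  # 2012-06-30
```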
Moving to Phase 2
Assuming that the bug has enough information to set an 'Importance' then do so; this moves the bug into Phase 2 of the triage process.
Phase 2: Confirming the bug
OK, so we now think we understand the importance of the bug, but we need to gather more information about how to reproduce the issue and try to confirm that the bug exists.
Bugs with the following status combinations fall into this category:
- Incomplete/Decided (with response - hopefully enough to confirm the bug)
- Confirmed/Decided (although this state should only happen when Launchpad automatically confirms a bug)
'Decided' in this case means that the bug has an importance set.
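The split between the two phases can be expressed as a simple rule on status and importance. The sketch below is one illustrative reading of the status combinations listed in this topic, not an official rule set; `triage_phase` is a hypothetical name and the "(with response)" condition is not modelled.

```python
# Statuses that this topic lists for bugs still in one of the triage queues.
IN_TRIAGE_STATUSES = {"Incomplete", "Confirmed"}


def triage_phase(status, importance):
    """Return 1 or 2 for the triage phase, or None if the bug is in neither queue."""
    if status not in IN_TRIAGE_STATUSES:
        return None
    # 'Undecided' means no importance has been set yet, so the bug is in Phase 1.
    return 1 if importance == "Undecided" else 2


print(triage_phase("Incomplete", "Undecided"))  # 1
print(triage_phase("Confirmed", "High"))        # 2
```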
Bugs should be worked on in order of descending importance, from 'Critical' to 'Wishlist'.
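As an illustration of that ordering, a list of bugs can be sorted by the position of its importance in Launchpad's scale. The bug dictionaries and the `work_order` name are hypothetical; anything with an importance outside the list (for example 'Undecided') is placed last as a defensive choice made for this sketch.

```python
IMPORTANCE_ORDER = ["Critical", "High", "Medium", "Low", "Wishlist"]
RANK = {name: position for position, name in enumerate(IMPORTANCE_ORDER)}


def work_order(bugs):
    """Sort bugs from 'Critical' down to 'Wishlist'; unknown importances go last."""
    return sorted(bugs, key=lambda bug: RANK.get(bug["importance"], len(IMPORTANCE_ORDER)))


# Hypothetical bug ids, used only to show the resulting order.
queue = [
    {"id": 1, "importance": "Wishlist"},
    {"id": 2, "importance": "Critical"},
    {"id": 3, "importance": "Medium"},
]
print([bug["id"] for bug in work_order(queue)])  # [2, 3, 1]
```

Because `sorted` is stable, bugs with the same importance keep the order they had in the input.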
Not enough information to confirm
So you have tried to reproduce the bug by installing the package that the bug has been reported against and poking it a bit, but you don't see the same issue. Hopefully you now know enough about the package to request some further information from the bug reporter about what steps they went through to hit the bug and what additional information might be useful in diagnosing the issue - log files or configuration files for example. If you don't, ask someone else in the team for advice - someone will have useful knowledge.
Request this information from the bug reporter, mark the bug as 'Incomplete' and request that the reporter set the status back to 'New' once the requested information has been provided.
You may decide based on further information provided by the bug reporter that this bug is not actually a bug. Mark as 'Invalid' and provide some advice as to how to resolve the problem. Hopefully these will be of the category 'Local configuration change is the cause of the bug report' as in Phase 1.
Confirming the bug
Either the reporter has provided great information, someone else has already confirmed the bug, or you have been able to reproduce it - SUCCESS!
Mark the bug as 'Confirmed' with a suitable comment.
The end of triage is now in sight!
Ensure that the bug report has enough information in it for a developer to work on - you may want to supplement with information you have gathered when reproducing the bug.
Now mark the bug as 'Triaged' - and you are all done :-).
Sometimes people request new features using bugs; this is perfectly acceptable. Normally these bugs get marked as 'Triaged' with an importance of 'Wishlist'. If the feature is super important - and this is hard to guess if you don't know the package - then a different importance may be appropriate.
Support requests filed as bugs do appear from time to time. Have a look in the 'Answers' section for the package (although these are not very well populated for the server packageset) and maybe check askubuntu.com.
It's worth considering converting the bug into an answer with an appropriate response if you have time - especially if users keep hitting the same configuration issues. Incorrectly set hostname with postfix is one that comes to mind.
Apport is a great tool for collecting information to support bug triage; if you keep asking for the same information then in all likelihood it should be collected by the package's apport hook.
Consider updating or writing one to collect the required logs, configuration and so on. These can be SRU'ed as well.
See https://wiki.ubuntu.com/Apport/DeveloperHowTo for more information.
When triaging bugs it is worth looking at upstream bug trackers to see if the same issue has been reported upstream. This includes the Debian bug tracker and, optionally, other distros' bug trackers.
This can help with reproduction and confirmation of bugs.
EXPERIMENTAL: Filtering a Bug from the Triage report
There are bugs which the server team may not wish to progress further, or which are assigned to other teams but still appear in the server triage report. These bugs should have the 'server-notriage' tag added to them with the following description:
This bug has had the 'server-notriage' tag added to it. This means that the Ubuntu Server Triage team will take no further action on this bug. You should have been provided with an explanation as to why this has happened.
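How the triage report applies this tag is not described here. As a sketch of the idea only, a report could drop tagged bugs with a filter like the one below; the bug dictionaries with a 'tags' list and the `triage_candidates` name are assumptions made for illustration.

```python
NOTRIAGE_TAG = "server-notriage"  # the tag named in this topic


def triage_candidates(bugs):
    """Return only the bugs that do not carry the 'server-notriage' tag."""
    return [bug for bug in bugs if NOTRIAGE_TAG not in bug.get("tags", [])]


# Hypothetical report rows.
report = [
    {"id": 10, "tags": ["server-notriage"]},
    {"id": 11, "tags": ["apport-crash"]},
    {"id": 12},
]
print([bug["id"] for bug in triage_candidates(report)])  # [11, 12]
```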