KernelBisection

Revision 2 as of 2011-01-21 22:17:30

This page is under development

Finding a bad commit among a large number of them

The problem

You have made a release, and something broke. There are hundreds of patches committed since the previous tested release. How do you identify the bad one?

Getting set up

Have your tools working and a local repo (how to set up for the Maverick example).

Be able to build kernels, on a fast build machine if available.

Have a reproducer, or a cooperative tester in the community.

This process can take a lot longer if you have to wait for builds and testing.

What is bisection?

It's a successive splitting of a series of commits in order to locate the single one that caused a failure.

For more information, see the git help

git bisect --help

Because each test halves the remaining range, bisecting N commits takes roughly log2(N) build-and-test cycles. A few hundred commits need about eight or nine cycles.
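As a rough guide, the worst case can be worked out before you start. The sketch below is only an estimate: `bisect_steps` is a name made up for this page, and git's own count can differ by a step or so when the history contains merges.

```shell
#!/bin/sh
# bisect_steps N (hypothetical helper): worst-case number of build-and-test
# cycles needed to single out one bad commit among N candidates.
bisect_steps() {
    n=$1
    steps=0
    # Each cycle cuts the candidate range roughly in half.
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 300   # prints 9
```

Multiply the result by your build time plus the tester's turnaround to see how long the whole bisection is likely to take.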

Check out your tree and get ready

If you want to follow along with the example, use the commands exactly as shown

git fetch
git checkout -b bisecting origin/master

Take a look first to see what you can learn

git log --oneline drivers/gpu/drm/radeon

Is the problem limited to a subsystem? (radeon in this example) Examine the commits to that subsystem.

Is it obvious where the problem might be?

  • If so, it's worth just reverting that patch and testing.
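To keep the list short, the log can be limited to the commits between two revisions as well as to a path. A minimal sketch, assuming you already have a known-good and a known-bad revision; `subsystem_commits` is a hypothetical helper, not part of git.

```shell
#!/bin/sh
# subsystem_commits GOOD BAD PATH (hypothetical helper): list the commits
# reachable from BAD but not from GOOD that touch PATH, one per line.
subsystem_commits() {
    git log --oneline "$1..$2" -- "$3"
}

# Example, run from inside the kernel tree; substitute your own known-good
# and known-bad revisions for the placeholders:
#   subsystem_commits <good> <bad> drivers/gpu/drm/radeon
```

If only a handful of commits come back, reading them one by one may be quicker than a full bisection.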

Determine the known good and known bad commits

In the Maverick case, these are the release tags:

Ubuntu-2.6.35-24.42 - good
Ubuntu-2.6.35-25.43 - bad

Start the bisection

Start a bisection by using the command "git bisect start <bad> <good>":

git bisect start Ubuntu-2.6.35-25.43 Ubuntu-2.6.35-24.42

Give this test a version number

Now, you have to give this test a version number. This is done by editing the debian.master/changelog file.

The top line of that file has the version in it. Choose a version that:

  • is clearly a test
  • will be superseded by later kernels
  • has meaning to you in your bisection testing

I use my initials, plus an incrementing number, plus an indicator of the Launchpad bug associated with the problem. Thus, my first test version is:

2.6.35-25.44~spc01LP703553

The '~' is a special versioning trick: a version containing '~' sorts before the same version without it. This version is therefore considered higher than 2.6.35-25.43 but lower than 2.6.35-25.44, so it will be superseded and replaced by 2.6.35-25.44 or any later version. Using this versioning makes sure that a user who tests our kernel won't keep it around after the next update comes along.
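The ordering can be checked before you build anything. This sketch assumes dpkg is installed (it is on Debian and Ubuntu systems); `sorts_before` is a hypothetical wrapper around `dpkg --compare-versions`.

```shell
#!/bin/sh
# Requires dpkg (present on Debian and Ubuntu systems).
# sorts_before A B (hypothetical helper): succeed when version A is older than B.
sorts_before() {
    dpkg --compare-versions "$1" lt "$2"
}

test_ver='2.6.35-25.44~spc01LP703553'

# The test kernel sorts above 2.6.35-25.43 ...
sorts_before '2.6.35-25.43' "$test_ver" && echo "newer than 25.43"
# ... but below 2.6.35-25.44, so that upload will replace it.
sorts_before "$test_ver" '2.6.35-25.44' && echo "older than 25.44"
```

Run the same check on your own version string if you use a different naming scheme.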

Edit the changelog and put in the version like this:

example

Do not commit the change you just made to the changelog. There's no need and it makes it harder to build subsequent tests.
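Since the version has to change for every test build, the edit can be scripted. This is a sketch, not part of the kernel tooling: `set_changelog_version` is a hypothetical helper, and it assumes the changelog follows the usual Debian layout, with the version in parentheses on the first line.

```shell
#!/bin/sh
# set_changelog_version FILE VERSION (hypothetical helper): replace the
# version in parentheses on the first line of a Debian-style changelog.
# Assumes VERSION contains no '/', '&' or backslash characters.
set_changelog_version() {
    file=$1
    version=$2
    tmp=$(mktemp) || return 1
    # Only line 1 is rewritten; older entries are left untouched.
    sed "1s/([^)]*)/($version)/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example, run from the top of the kernel tree:
#   set_changelog_version debian.master/changelog '2.6.35-25.44~spc01LP703553'
#   git status --short debian.master/changelog   # modified, but not committed
```

The helper only touches the working copy, so the change stays uncommitted.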

Now build the kernel. (point to kernel building info) You can use a PPA, but it will probably take a lot longer to build.

Getting test results

Place the kernel package where your testers can get to it, and let them know it's there. The Launchpad bug is a good place to track all of your testing.

Using the test results

When you have the test results, run git bisect again and say whether the test was good or bad:

git bisect good (fill this in)

Git will then choose the next commit to be tested and check it out. You will also be told approximately how many tests remain before the bad commit is found.

Now edit the changelog with a new version and build the next test.

Repeat.

Notes

Shortcut: If you can determine a set of commits that are known good and known bad within the larger range, you can reduce the number of iterations required. (explain when this might make sense)

The output of the command "git bisect log" can be saved and later passed to "git bisect replay" (or run as a shell script) to return you to exactly where you were. So if you have to use your repo for something else while you are waiting for test results, you can recover your last state.
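Saving and restoring can be tried out safely in a scratch repository first. In this sketch the eight-commit history and the tag names `first` and `last` are invented for the demo; `git bisect log`, `git bisect reset` and `git bisect replay` are the real commands.

```shell
#!/bin/sh
# Save and restore a bisection in a scratch repository.
repo=$(mktemp -d)
log=$(mktemp)
cd "$repo" || exit 1
git init -q .
i=1
while [ "$i" -le 8 ]; do
    echo "$i" > counter
    git add counter
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then git tag first; fi
    i=$(( i + 1 ))
done
git tag last

# Start bisecting and record one verdict.
git bisect start last first >/dev/null
git bisect good >/dev/null
before=$(git rev-parse HEAD)

# Save the state, then abandon the bisection to use the repo for other work.
git bisect log > "$log"
git bisect reset >/dev/null 2>&1

# Later: replay the saved log to return to the same commit under test.
git bisect replay "$log" >/dev/null 2>&1
after=$(git rev-parse HEAD)

if [ "$before" = "$after" ]; then echo "bisection restored"; fi
git bisect reset >/dev/null 2>&1
```

Keep the saved log outside the repository, or at least untracked, so that it survives whatever else you do with the tree in the meantime.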