KernelBisection

This page is under development

How to bisect a sequence of commits to find the bad one

The problem

You have made a release, and something broke. There are hundreds of patches committed since the previous tested release. How do you identify the bad one?

Required knowledge and tools

The rest of this page assumes that you know how to fetch a kernel from the Ubuntu git repository and build it, and that you have basic git skills. If you can't do that yet, start with the BuildYourOwnKernel wiki page: https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel

This example

The commands on this page follow a real-life example. In January of 2011, a kernel published to the -proposed pocket broke Radeon graphics for a number of users. Typing the commands as shown on this page will recreate the steps taken to find the bad commit in that release. The entire history of testing the bisected kernels for that regression appears in the bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/703553

What is bisection?

Bisection repeatedly splits a series of commits in half, testing the midpoint each time, in order to locate the single commit that caused a failure.

For more information, see the git help:

git bisect --help
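Because each test halves the remaining range, the number of build-and-test cycles grows with the logarithm of the number of commits rather than with the number itself. The sketch below shows that arithmetic. The helper name bisect_steps is made up for illustration, and git's own estimate can differ by a step depending on how the history is shaped.

```shell
# Hypothetical helper: estimate how many halvings it takes to narrow
# a range of N commits down to a single one.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$((n / 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 323   # prints 8
```

For the 323 commits in the example on this page, that is eight tests, which agrees with git reporting "roughly 7 steps" remaining after the first one.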

Getting set up

You need to have a bug reproducer, or have a cooperative tester in the community. If you can't reliably determine whether the bug exists in a given kernel, bisection will not give meaningful results.

This process goes a lot faster if you can quickly build kernels and quickly have them tested. Using a fast build machine and having good communications with the testers will speed things up.

Check out your tree and get ready

If you want to follow along with the example, use the commands exactly as shown:

git clone git://kernel.ubuntu.com/ubuntu/ubuntu-maverick.git
cd ubuntu-maverick
git checkout -b mybisect origin/master

This creates a local copy of the maverick repository, and then creates a local branch named mybisect for your tests.

Take a look first to see what you can learn

The last version without the problem is tagged Ubuntu-2.6.35-24.42. The version which has the problem is tagged Ubuntu-2.6.35-25.43.

First, let's take a quick look at the changes between the two:

git log --oneline Ubuntu-2.6.35-24.42..Ubuntu-2.6.35-25.43

Now, how many commits are in there?

git log --oneline Ubuntu-2.6.35-24.42..Ubuntu-2.6.35-25.43 | wc -l

It says 325, but two of those are the startnewrelease and final changelog changes, so there are 323 commits, and the bad one is among them.
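As a cross-check on the count, git can report the number of commits in a range directly with "git rev-list --count". The sketch below runs on a throwaway repository so it can be tried anywhere. The tag names demo-good and demo-bad and the five-commit history are invented for illustration; in the real tree the range would be Ubuntu-2.6.35-24.42..Ubuntu-2.6.35-25.43.

```shell
# Build a scratch repository with five commits (illustrative only).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Bisect Demo"
git config user.email "demo@example.invalid"
for i in 1 2 3 4 5; do
    echo "change $i" >> history.txt
    git add history.txt
    git commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then git tag demo-good; fi
done
git tag demo-bad

# Two ways to count the commits after demo-good, up to and including demo-bad.
by_log=$(git log --oneline demo-good..demo-bad | wc -l)
by_revlist=$(git rev-list --count demo-good..demo-bad)
echo "log: $((by_log))  rev-list: $by_revlist"
```

Both counts exclude the good commit itself, which is why a five-commit history reports four.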

Sometimes you can easily find the problem if it's in a subsystem that only has changes from a few patches. In this example, it's Radeon hardware that is affected, so try looking at the commits to the radeon driver:

git log --oneline Ubuntu-2.6.35-24.42..Ubuntu-2.6.35-25.43 drivers/gpu/drm/radeon/

That still shows eleven commits. Reverting each of those and testing will take longer than bisecting the entire set of changes, so we'll go ahead and do the bisection.

Determine the known good and known bad commits

In the Maverick case, these are the release tags:

  • Ubuntu-2.6.35-24.42 - good
  • Ubuntu-2.6.35-25.43 - bad

Start the bisection

Start a bisection by using the command "git bisect start <bad> <good>":

git bisect start Ubuntu-2.6.35-25.43 Ubuntu-2.6.35-24.42

which results in this:

Bisecting: 162 revisions left to test after this (roughly 7 steps)
[dae1e6305dba4ff1e8574b3b6eb42613d409b460] olpc_battery: Fix endian neutral breakage for s16 values

This tells you that git has chosen the commit "olpc_battery: . . ." as the midpoint for the first bisection and checked it out, so that it is now the top commit in your tree. Git is also telling you that roughly seven more bisection steps will remain after this one.

Give this test a version number

Before you build this kernel for testing, you have to give it a version number. This is done by editing the debian.master/changelog file.

The top of that file now appears like this:

linux (2.6.35-25.43) UNRELEASED; urgency=low

  CHANGELOG: Do not edit directly. Autogenerated at release.
  CHANGELOG: Use the printchanges target to see the curent changes.
  CHANGELOG: Use the insertchanges target to create the final log.

 --  Tim Gardner <tim.gardner@canonical.com>  Mon, 06 Dec 2010 10:45:38 -0700

The top line of that file has the version in it. Choose a version that:

  • is clearly a test
  • will be superseded by later kernels
  • has meaning to you in your bisection testing

I use my initials, plus an incrementing number, plus an indicator of the Launchpad bug associated with the problem. Thus, my first test version is:

2.6.35-25.44~spc01LP703553
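Since a new version is needed for every test build, it can help to generate the string rather than type it each time. This is only a sketch of the naming scheme described above; the helper name bisect_version and its argument order are made up for illustration.

```shell
# Hypothetical helper: assemble a test version string.
# Arguments: next release version, your initials, test number, Launchpad bug number.
# Pass the test number without a leading zero (1, 2, ...); printf pads it.
bisect_version() {
    printf '%s~%s%02dLP%s\n' "$1" "$2" "$3" "$4"
}

bisect_version 2.6.35-25.44 spc 1 703553   # prints 2.6.35-25.44~spc01LP703553
bisect_version 2.6.35-25.44 spc 2 703553   # prints 2.6.35-25.44~spc02LP703553
```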

The '~' is a special versioning trick: a version with a '~' suffix sorts just below the same version without it. This kernel is therefore considered higher than the released .43, yet it will be superseded and replaced by 2.6.35-25.44 or any higher version. Using this versioning makes sure that if a user tests our kernel, they won't keep it around after the next update comes along.
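The ordering can be checked on the command line. In Debian version comparison a '~' sorts before anything else, including the end of the string, so the test version lands between the last release and the next one. This sketch assumes a Debian or Ubuntu system where dpkg is installed.

```shell
# Assumes dpkg is available (any Debian or Ubuntu system).
test_version='2.6.35-25.44~spc01LP703553'

# --compare-versions exits 0 when the stated relation holds.
if dpkg --compare-versions "$test_version" gt '2.6.35-25.43' &&
   dpkg --compare-versions "$test_version" lt '2.6.35-25.44'; then
    echo "test build sorts above 25.43 and below 25.44"
fi
```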

You also need to change UNRELEASED to the maverick series name, or the package will not be accepted for your PPA build.

Edit the changelog and replace the entire text in the earlier box with this:

linux (2.6.35-25.44~spc01LP703553) maverick; urgency=low

  Test build for bisection of a Radeon regression

 --  Steve Conklin <sconklin@canonical.com>  Mon, 24 Jan 2011 22:45:38 -0600

Do not commit the change you just made to the changelog into your local git repo. There's no need and it makes it harder to build subsequent tests.

Now build the kernel. You can use a PPA, but it will probably take a lot longer to build.

Getting test results

Place the kernel package where your testers can get to it, and let them know it's there. The Launchpad bug is a good place to track all of your testing; the bug used for this example shows how that looks in practice.

Using the test results

When you have the test results, run git bisect again and say whether the test was good or bad. In this example, the first test was bad, so we do the following:

git bisect bad

And git responds with:

Bisecting: 80 revisions left to test after this (roughly 6 steps)
[1829af44f4fe8600d6c9cde5fcb7a1345b201eaf] r6040: Fix multicast filter some more

Now edit the changelog with a new version and build the next test.

Repeat until the bad commit is eventually identified.
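To see the whole loop run to completion without building any kernels, here is a sketch on a throwaway repository. Everything in it is invented for illustration: the eight-commit history, the tags demo-good and demo-bad, and the marker file regression.txt that stands in for "the tester reports the bug". In real use each verdict comes from a build-and-test cycle, not a file check.

```shell
export LC_ALL=C   # keep git's messages in English so the match below works
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Bisect Demo"
git config user.email "demo@example.invalid"

# Eight commits; the sixth introduces the pretend regression.
for i in 1 2 3 4 5 6 7 8; do
    echo "change $i" >> history.txt
    if [ "$i" -eq 6 ]; then echo "broken" > regression.txt; fi
    git add -A
    git commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then git tag demo-good; fi
done
git tag demo-bad

git bisect start demo-bad demo-good >/dev/null

# Each pass stands in for one build, test, and report cycle.
first_bad=""
for step in 1 2 3 4 5 6 7 8; do
    if [ -f regression.txt ]; then verdict=bad; else verdict=good; fi
    out=$(git bisect "$verdict")
    case $out in
        *"is the first bad commit"*)
            first_bad=$(git log -1 --format=%s refs/bisect/bad)
            break ;;
    esac
done
echo "first bad: $first_bad"   # prints "first bad: commit 6"

git bisect reset >/dev/null 2>&1   # return to the original branch
```

When git announces the first bad commit, the bisection is finished; "git bisect reset" then puts the tree back where it started.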

At any time, you can use the command

git bisect log

to review all the work that's taken place.

Notes

Shortcut: If you can determine a set of commits that are known good and known bad within the larger range, you can reduce the number of iterations required. (explain when this might make sense)

The output of the command "git bisect log" can be saved and later run as a shell script to return you to exactly where you were. So if you have to use your repo for something else while you are waiting for test results, you can recover your last state.
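A minimal sketch of saving and restoring a session follows, again on a throwaway repository with invented tag names (demo-good, demo-bad). It uses "git bisect replay", which reads a saved log file and re-applies it; the temporary log file location is arbitrary.

```shell
export LC_ALL=C
repo=$(mktemp -d) && cd "$repo" || exit 1
logfile=$(mktemp)
git init -q
git config user.name "Bisect Demo"
git config user.email "demo@example.invalid"
for i in 1 2 3 4 5 6 7 8; do
    echo "change $i" >> history.txt
    git add history.txt
    git commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then git tag demo-good; fi
done
git tag demo-bad

git bisect start demo-bad demo-good >/dev/null
git bisect bad >/dev/null              # pretend the first test kernel failed
before=$(git rev-parse HEAD)           # the commit waiting to be tested

git bisect log > "$logfile"            # save the session
git bisect reset >/dev/null 2>&1       # the repo is now free for other work

git bisect replay "$logfile" >/dev/null 2>&1   # later: pick up where you left off
after=$(git rev-parse HEAD)
if [ "$before" = "$after" ]; then echo "session restored"; fi
```

Keep the saved log outside the working tree, or at least uncommitted, so it does not interfere with the checkouts that bisection performs.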

Kernel/KernelBisection (last edited 2023-08-14 22:19:48 by penalvch)