Stable_kABI

Discussion on Stable kABI

NOTE: This will be fleshed out with the kernel team at the Jul08 Sprint.

The problem

In order for Linux to be successful in the marketplace, Independent Hardware Vendors (IHVs) and Independent Software Vendors (ISVs) require stability and predictability. Canonical recognizes this and works with its partners to enable their hardware and software in a stable and predictable manner. However, kernel changes that improve functionality, hardware enablement, and security can often cause incompatibilities with third-party modules, both hardware and software.

The Kernel Application Binary Interface (kABI) is where these incompatibilities show up most often. If the kABI is not kept stable, a third-party module, whether hardware or software, may fail to load after a routine kernel update. At best this is disruptive. At worst the machine will not boot, forcing a resort to rescue methods and known-good kernels, with a lot of downtime. Many enterprise customers demand a stable kABI and have become accustomed to this practice because Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES) guarantee a stable kABI within a major version release.

kABI Issues

By default the Linux kernel builds with numerous kernel modules. Many modules are hardware device drivers; others provide traditional kernel functions, such as networking. Technically speaking, Linux is a monolithic kernel: all modules live in the same address space as the static part of the kernel and, as such, have access to all kernel structures.

Modules interact with the kernel by accessing symbols that the kernel or other modules export. Currently the Ubuntu Hardy generic i386 kernel exports about 6900 symbols. Exported symbols, along with the data structure layouts and version information, make up the Kernel Module Binary Interface, most commonly referred to as kABI. When compiling the kernel, it is assumed that all modules are compiled along with the static part of the kernel. This allows all data structures, kernel configuration, compiler version, and compiler flags to be in agreement, so all modules and the static kernel can work together. Modules compiled with a different configuration or compiler, or even for a different kernel version, cause problems. As a guard against these problems, the kernel build system allows the versioning of kernel symbols. This mechanism is enabled in Ubuntu kernels. The version numbers are generated automatically by calculating checksums over the function and data structure declarations. The system is designed not to miss a change, so the symbol version changes any time the system suspects that the symbol has changed. This protects against silent breakage, which would typically result in kernel memory corruption.
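The checksum idea can be illustrated with a small sketch. This is not the kernel's actual tool (the kernel build uses genksyms, which works on the preprocessed, fully expanded declarations). The symbol_version helper, the whitespace normalization, and the sample declarations are assumptions made purely for illustration.

```python
import zlib


def symbol_version(declaration: str) -> int:
    """Return a toy 32-bit version for an exported symbol.

    Hypothetical helper: collapses whitespace and takes a CRC32 of the
    declaration text, so a change to the prototype or to a structure it
    depends on yields a different value.
    """
    normalized = " ".join(declaration.split())
    return zlib.crc32(normalized.encode("utf-8")) & 0xFFFFFFFF


# Made-up declarations: the second adds one field to the structure.
before = "struct widget { int id; }; int widget_register(struct widget *w);"
after = "struct widget { int id; long flags; }; int widget_register(struct widget *w);"

v_before = symbol_version(before)
v_after = symbol_version(after)

print(f"before: 0x{v_before:08x}")
print(f"after:  0x{v_after:08x}")
print("version changed:", v_before != v_after)
```

Note that the function prototype itself is unchanged in this example; only the structure it takes a pointer to has grown. That is the kind of change a module built against the old layout would otherwise not notice.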

When a module requires a symbol whose version number does not match the one the kernel exports, the module is not loaded into the kernel. While this results in loss of functionality, it is less harmful than memory corruption.
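A minimal sketch of that load-time decision follows. The check_module function, the symbol names, and the version values are all hypothetical; the real check is done inside the kernel's module loader and is only approximated here.

```python
from typing import Dict, List


def check_module(required: Dict[str, int], exported: Dict[str, int]) -> List[str]:
    """Return a list of problems; an empty list means the module may load."""
    problems = []
    for name, wanted in sorted(required.items()):
        have = exported.get(name)
        if have is None:
            problems.append(f"{name}: symbol not exported")
        elif have != wanted:
            problems.append(
                f"{name}: version mismatch "
                f"(module 0x{wanted:08x}, kernel 0x{have:08x})"
            )
    return problems


# Made-up symbol names and version values, for illustration only.
kernel_exports = {"widget_register": 0x1A2B3C4D, "widget_unregister": 0x0BADF00D}
module_needs = {"widget_register": 0x1A2B3C4D, "widget_unregister": 0xDEADBEEF}

problems = check_module(module_needs, kernel_exports)
if problems:
    print("refusing to load module:")
    for problem in problems:
        print("  " + problem)
else:
    print("module may load")
```

A single mismatching symbol is enough to reject the whole module, which is why one changed structure can disable a third-party driver after an update.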

Generally, kernel symbol versioning is not a problem for the Linux community or for Ubuntu. Ubuntu recompiles the complete kernel, including all shipped modules, for every kernel it releases, so this volatility is not a problem for Canonical. Recompiling a module will usually adapt it to changed options or differently laid-out data structures. The problem does affect parties who provide kernel modules that are not part of the Ubuntu distribution, normally partners, both IHVs and ISVs, because Ubuntu cannot recompile these modules. In addition, changes in the kernel API (kAPI) could occur and require changes to the module source code.

Both of these cases are highly undesirable for anyone who provides kernel modules that are not part of the product distribution. Customers expect to receive timely updates that fix critical bugs and security issues and that provide hardware enablement.

Historical kABI Data

Repository data for the last two LTS releases, Dapper and Hardy, shows numerous kABI bumps to shipped kernels. The tables below show the number of changes that went into each kABI-bumped kernel (those with "Yes" in the "ABI changed" column) versus each kernel without a bump.

Dapper Versions Showing kABI Bumps

Kernel Version | Date     | ABI changed | Security Changes | Updates Changes | Reason ABI changed | Subsys for ABI
---------------+----------+-------------+------------------+-----------------+--------------------+--------------------------------
2.6.15-24.40   | 06/09/06 | Yes         | 21               | 40              | Both               | x86-ACPI-ec SPARC-ALL (NR_CPUS)
2.6.15-24.41   | 06/12/06 | No          | 0                | 8               | NA                 |
2.6.15-24.42   | 06/12/06 | No          | 0                | 1               | NA                 |
2.6.15-25.43   | 06/14/06 | Yes         | 0                | 5               | Updates            | x86-IDE (NUM_HWIF's increased)
2.6.15-26.44   | 07/07/06 | Yes         | 4                | 43              | Both               | All over (enabling of CPUSETS)
2.6.15-26.45   | 07/11/06 | No          | 2                | 5               | NA                 |
2.6.15-26.46   | 08/02/06 | No          | 3                | 14              | NA                 |
2.6.15-26.47   | 08/07/06 | No          | 4                | 28              | NA                 |
2.6.15-27.48   | 09/11/06 | Yes         | 2                | 3               | Security           | Squashfs, VFS
2.6.15-27.50   | 11/29/06 | No          | 6                | 1               | NA                 |
2.6.15-28.51   | 01/22/07 | Yes         | 12               | 3               | Security           | VFS, ipv6-netfilter
2.6.15-28.52   | 02/28/07 | No          | 3                | 0               | NA                 |
2.6.15-28.53   | 03/13/07 | No          | 0                | 2               | NA                 |
2.6.15-28.54   | 04/26/07 | No          | 7                | 0               | NA                 |
2.6.15-28.55   | 05/10/07 | No          | 1                | 0               | NA                 |
2.6.15-28.56   | 07/17/07 | No          | 12               | 0               | NA                 |
2.6.15-28.57   | 07/18/07 | No          | 2                | 0               | NA                 |
2.6.15-29.58   | 08/29/07 | Yes         | 6                | 0               | Security           | ipv6
2.6.15-29.59   | 09/21/07 | No          | 5                | 0               | NA                 |
2.6.15-51.63   | 10/23/07 | Yes         | 0                | 7               | Updates            | md, usb-serial
2.6.15-51.64   | 12/05/07 | No          | 0                | 3               | NA                 |
2.6.15-51.65   | 01/17/08 | No          | 12               | 1               | NA                 |
2.6.15-51.66   | 02/11/08 | No          | 0                | 1               | NA                 |
2.6.15-52.67   | 05/20/08 | Yes         | 12               | 0               | Security           | snd-core, VFS
2.6.15-52.69   | 07/15/08 | No          | 1                | 1               | NA                 |
2.6.15-52.71   | 08/25/08 | No          | 4                | 3               | NA                 |

Hardy Versions Showing kABI Bumps

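Kernel Version | Date     | ABI changed | Security Changes | Updates Changes | Reason ABI changed | Subsys for ABI
---------------+----------+-------------+------------------+-----------------+--------------------+-------------------------------------
2.6.24-18.32   | 05/19/08 | Yes         | 5                | 0               | Security           | VFS
2.6.24-19.33   | 05/21/08 | Yes         | 0                | 31              | Updates            | SSB, HID
2.6.24-19.34   | 06/05/08 | No          | 0                | 0               | NA                 |
2.6.24-19.36   | 07/15/08 | No          | 9                | 1               | NA                 |
2.6.24-20.37   | 07/17/08 | Yes         | 0                | 34              | Updates            | ACPI, NET
2.6.24-20.38   | 07/28/08 | No          | 0                | 13              | NA                 |
2.6.24-20.39   | 08/11/08 | No          | 0                | 13              | NA                 |
2.6.24-21.40   | 08/12/08 | Yes         | 0                | 1               | Updates            | RFKILL
2.6.24-19.41   | 08/25/08 | No          | 2                | 0               | NA                 |
2.6.24-21.42   | 08/25/08 | No          | 0                | 14              | NA                 |
2.6.24-22.45   | 08/27/11 | Yes         | 10               | 0               | Security           | various (added field in task_struct)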

While this data gives an impression of the amount of churn in the kernel, it does not tell us exactly how many of the security or update changes were actually kABI breakers. This is due to the existing policy of bumping kABI any time it “was needed”, coupled with loose kABI tracking at changelog commit time.
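Tighter tracking could, for instance, compare the exported symbol versions of two builds and record exactly which symbols changed. The sketch below assumes each build provides a listing with one symbol per line, checksum first and symbol name second, loosely modelled on the Module.symvers file a kernel build produces. The exact input format, the function names, and the sample data are assumptions.

```python
from typing import Dict, List, Tuple


def parse_symvers(text: str) -> Dict[str, int]:
    """Parse lines of the assumed form '<hex checksum> <symbol> [owner ...]'."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        table[fields[1]] = int(fields[0], 16)
    return table


def abi_diff(old: Dict[str, int], new: Dict[str, int]) -> Tuple[List[str], List[str], List[str]]:
    """Return (removed, changed, added) symbol names between two builds."""
    removed = sorted(set(old) - set(new))
    changed = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    added = sorted(set(new) - set(old))
    return removed, changed, added


# Made-up listings for two consecutive builds.
old_text = (
    "0x11111111\twidget_register\tvmlinux\n"
    "0x22222222\twidget_unregister\tvmlinux\n"
    "0x33333333\tgadget_lookup\tdrivers/gadget/gadget\n"
)
new_text = (
    "0x11111111\twidget_register\tvmlinux\n"
    "0x2222beef\twidget_unregister\tvmlinux\n"
    "0x44444444\tfrob_init\tvmlinux\n"
)

removed, changed, added = abi_diff(parse_symvers(old_text), parse_symvers(new_text))

# Removed or changed symbols can break existing modules; symbols that are
# only added should not affect modules built against the old kernel.
bump_needed = bool(removed or changed)

print("removed:", removed)
print("changed:", changed)
print("added:  ", added)
print("ABI bump needed:", bump_needed)
```

Recording this output alongside each changelog entry would show which individual changes were the kABI breakers, rather than only which uploads carried a bump.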

kABI Stability Options

Why stabilize kABI? In short, to keep third-party kernel modules from breaking every time Ubuntu releases a kernel that bumps kABI. The classic argument from the upstream Linux community is: clean up your code and get it accepted into the upstream kernel. While that would be the optimal choice, it is not always practical, for reasons ranging from IP issues to upstream refusal. Additionally, numerous high-end enterprise customers write their own hooks into the kernel for various operations and wish to have a stable kABI.

Option 1

Need kernel team discussion here. Some suggestions:

  • Limit to LTS releases
  • Stable kABI for the 1st two years, then the expectation is to move to the next LTS release.
  • Guarantee a subset of symbols, possibly by polling the lead vendors and analyzing what symbols they actively use.
  • Tightening up our kABI bump policy and bumping only after assessing the risk of not doing it.
  • We could (and probably should) apply a different policy during development than after release, and this should be reflected in the documentation.
  • There is significant research to be done to establish whether this subset would be small enough to make this workable. Some places to start:
    • Get with Pat McGowan on ODM modules.
    • Dissect our own restricted modules.
    • Examine modules currently distributed out of tree in source form, including those in linux-ubuntu-modules, packaged in Debian (using module-assistant or similar) or using DKMS (is Dell using this on Ubuntu currently?)
    • Survey Ubuntu users via the forums, surveymonkey, etc. and ask them to post their third-party modules for examination
    • Examine in-tree modules in the driver subsystems most likely to be added later (e.g. storage, network, graphics) to see which symbols they use
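The research suggested above amounts to collecting the symbols each surveyed module imports and intersecting them with what the kernel exports. A minimal sketch follows, assuming the per-module symbol lists have already been extracted (for example with a tool such as nm). The module names, symbol names, and helper functions are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def symbol_usage(modules: Dict[str, Set[str]]) -> Counter:
    """Count how many surveyed modules import each symbol."""
    usage = Counter()
    for symbols in modules.values():
        usage.update(symbols)
    return usage


def candidate_subset(modules: Dict[str, Set[str]], exported: Set[str]) -> List[str]:
    """Exported symbols that at least one surveyed module depends on."""
    return sorted(set(symbol_usage(modules)) & exported)


# Made-up data: the kernel's exported symbols and two surveyed modules.
exported = {"widget_register", "widget_unregister", "gadget_alloc",
            "gadget_free", "frob_init"}
surveyed = {
    "vendor_net": {"widget_register", "widget_unregister", "gadget_alloc"},
    "vendor_raid": {"gadget_alloc", "gadget_free", "private_hook"},
}

usage = symbol_usage(surveyed)
subset = candidate_subset(surveyed, exported)
not_exported = sorted(set(usage) - exported)

print(f"{len(subset)} of {len(exported)} exported symbols are used by surveyed modules")
print("candidate guaranteed subset:", subset)
print("most widely used:", usage.most_common(1))
print("used but not exported:", not_exported)
```

The ratio in the first line of output is the number that would show whether a guaranteed subset is small enough to be workable.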

Option 2

Need kernel team discussion

Draft kABI Policy Statement

(rtg) - My opinion is that the kernel ABI is a good thing.

  1. kABI is an important mechanism for detecting when changes to public structures or function prototypes affect external module compilation. It is functioning as designed.
  2. For LTS releases (and possibly as a standard policy), I think that we should make our ABI policy more rigorous, e.g., specifically differentiate each release with an ABI bump, regardless of ABI changes. Having a completely separate kernel for each ABI provides a mechanism for recovery in the face of regressions.
    • BenC: That's only characteristic of pre-Intrepid. With current Intrepid, there is always a known-good kernel backed up for just such problems. It also bothers me to have ABI as a built-in failsafe, since there is no guarantee that the last ABI was even working (hence the reason that we never relied on this in the first place).
  3. The perception among third-party vendors of kABI-dependent modules that ABI changes cause instabilities is incorrect. In my opinion, attempting to refactor patches to avoid ABI changes is more likely to create regressions or add new bugs. It's likely that vendors view kABI changes as an indication of instability because of the churn in their process. If they had better packaging options, then perhaps they would not get so excited. Furthermore, with respect to stability, there is little difference between a patch that requires a kABI bump and one that does not. Why does a vendor feel they must retest against new ABI releases, but not against non-ABI-bumped releases?
    • BenC: I somewhat agree with this statement. In fact, there's more chance that a non-ABI-breaking patch will cause a regression than the other way around, mainly because those patches usually touch the meat of the code rather than the interfaces. However, we can't always expect that third-party vendors will be able to deliver source or even source+blob type drivers for DKMS or similar distribution methods. MID is a perfect example of this sort of thing. When we change ABI, it's not so much that the vendor thinks they need to re-certify us against their driver, as it is that they have a policy of fully testing anything that goes out the door. If they have to rebuild a module for the new ABI, they can't just assume it is OK.



KernelTeam/Stable_kABI (last edited 2008-12-19 15:48:19 by p5B2E67EB)