Summary

Almost all users of apt have seen a 'Hash Sum Mismatch' error during an 'apt-get update'. The error is the result of a race condition in apt, and it is most likely to be exposed while a mirror is in the process of updating. The most common situation is:

Release Note

Ubuntu's public mirrors now contain an improved archive format. Coupled with updates to apt, this makes transient "apt-get update" errors a thing of the past.

Rationale

Prior to this change, the apt repository format suffered from a number of race conditions that caused apt to occasionally fail during an "apt-get update". A repository has multiple index files, which are fetched separately. "apt-get update" fails verification if an index file does not match the exact hash recorded for it in another index file, even if the file would work for accessing the repository. These race conditions cause significant problems in the development release, which is updated frequently:

After release, the -updates and -security pockets still suffer from this race condition, so "apt-get update" can still fail in production. This means that:

User Stories

Design

Repository Specification

The by-hash specification is currently an optional addition to any apt repository. But if the by-hash scheme is supported by a mirror for a given release, it must comply with the following:

Unresolved Issues

Fallback mechanism

Users typically use several repositories at once, and not all of them are expected to implement this scheme at the same time. So we need a way for apt to fall back automatically when by-hash isn't available, on a per-mirror or per-(mirror, release) basis. Possible solutions:

Client Code Changes

Assumptions

Deployment Plan

Code Changes

  1. apt and debootstrap need to be updated to support the new changes and to fall back to the old behaviour if the new scheme is not available.
  2. A by-hash generator needs to be written, which can parse InRelease, create the by-hash entries and delete expired ones (trivial).
  3. The publisher needs to be updated to use the by-hash generator (presumed trivial).
  4. Mirror script packages need to be updated to sync the by-hash files and InRelease files in the correct race-free order. Candidates:
    • debmirror
    • ubumirror
  5. apt-ftparchive should be updated to generate by-hash files.
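
The by-hash generator in step 2 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the publisher's actual code: it assumes index files are published as `by-hash/SHA256/<digest>` next to the original index, it reads only the `SHA256:` block of a Release or InRelease file without checking the signature, and the function names and the `max_age_seconds` retention parameter are hypothetical.

```python
import hashlib
import os
import shutil
import time


def parse_sha256_section(release_text):
    """Return {relative_path: hex_digest} from the SHA256 block of a Release file."""
    entries = {}
    in_section = False
    for line in release_text.splitlines():
        if line.startswith("SHA256:"):
            in_section = True
            continue
        if in_section:
            if not line.startswith(" "):
                break  # the next field (or the signature) ends the block
            digest, _size, path = line.split()
            entries[path] = digest
    return entries


def publish_by_hash(dist_dir, entries):
    """Copy each index whose content matches its digest to by-hash/SHA256/<digest>."""
    created = []
    for rel_path, digest in entries.items():
        src = os.path.join(dist_dir, rel_path)
        if not os.path.isfile(src):
            continue  # listed in Release but not present on disk
        with open(src, "rb") as fh:
            if hashlib.sha256(fh.read()).hexdigest() != digest:
                continue  # never publish content under the wrong name
        # Assumed layout: by-hash directory alongside the index file.
        target_dir = os.path.join(os.path.dirname(src), "by-hash", "SHA256")
        os.makedirs(target_dir, exist_ok=True)
        target = os.path.join(target_dir, digest)
        if not os.path.exists(target):
            shutil.copyfile(src, target)
            created.append(target)
    return created


def expire_by_hash(dist_dir, entries, max_age_seconds, now=None):
    """Delete by-hash files that are unreferenced and older than max_age_seconds."""
    now = time.time() if now is None else now
    live = set(entries.values())
    removed = []
    for root, _dirs, files in os.walk(dist_dir):
        parent = os.path.basename(os.path.dirname(root))
        if os.path.basename(root) != "SHA256" or parent != "by-hash":
            continue
        for name in files:
            path = os.path.join(root, name)
            if name not in live and now - os.path.getmtime(path) > max_age_seconds:
                os.remove(path)
                removed.append(path)
    return removed
```

Keeping unreferenced entries for a grace period, rather than deleting them as soon as a new InRelease is published, is what lets a client holding a slightly older InRelease still fetch the indexes it names.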

Migration

The order of these steps does not matter:

Rollback

If rollback is required, then it will be easy. apt and debootstrap changes can be reverted at any time. The publisher can pull out of generating by-hash at any time before release by reverting apt and debootstrap changes first. If Debian go a different route, then we can take their changes and reverse ours in debootstrap and apt. The publisher can stop publishing by-hash at any time, since clients will fall back to old behaviour automatically. If client by-hash support is released, then it may be advisable to keep publishing by-hash until EOL, in order to keep released clients race-free. This should not have much of a maintenance burden, since by-hash generation is trivial and independent.
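
The automatic client fallback that this rollback plan relies on might look roughly like the sketch below. It is an illustration only, not apt's implementation: the `by-hash/SHA256/<digest>` URL layout is an assumption, and `fetch` is a hypothetical callable that returns the body of a URL as bytes, or `None` when the mirror does not have it.

```python
import hashlib
import posixpath


def by_hash_url(base_url, rel_path, digest):
    """Build the assumed by-hash URL for an index listed in InRelease."""
    directory = posixpath.dirname(rel_path)
    return posixpath.join(base_url, directory, "by-hash", "SHA256", digest)


def fetch_index(base_url, rel_path, digest, fetch):
    """Try the by-hash location first, then fall back to the traditional path.

    Either way the content must match the digest taken from InRelease.
    """
    candidates = [
        by_hash_url(base_url, rel_path, digest),
        posixpath.join(base_url, rel_path),  # old behaviour
    ]
    for url in candidates:
        data = fetch(url)
        if data is None:
            continue  # mirror does not publish this location
        if hashlib.sha256(data).hexdigest() == digest:
            return url, data
    raise ValueError("Hash Sum mismatch for %s" % rel_path)
```

With a mirror that has stopped publishing by-hash, the first candidate is simply missing and the client ends up on the traditional path, which is why the publisher can withdraw by-hash without breaking clients.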

Test/Demo Plan

We can test and demonstrate this facility without putting anything into production.

  1. Add patched apt and debootstrap to a PPA.
  2. Publish a Quantal mirror with by-hash information that is updated in a race-free manner. This will have to include InRelease re-signing by a testing (non-official) key to eliminate all races.
  3. Publish debian-installer netboot images based on the patched apt and debootstrap.
  4. Base automated Quantal-based installer tests on the test by-hash mirror to verify that the races have gone away.

Further Discussion

A number of alternative schemes to fix this problem have been discussed (TBC: summarise them here). This particular scheme:

Caveats:

AptByHash (last edited 2012-07-11 10:24:24 by racb)