PhasedUpdates

Revision 1 as of 2012-05-14 23:18:46


Push out stable release updates to expanding subsets of the userbase so that serious regressions can be detected, and the rollout stopped, before the updates reach everyone. The aim is for regressions to affect a smaller proportion of our userbase.

Release note

StableReleaseUpdates will no longer appear in update-manager at the same time for all machines. Instead a subset of machines will be selected at random to receive the update first. The update will only be made available to everyone if there are no serious regressions encountered by the first set of users. There is still a testing process completed by Ubuntu developers before any users receive the update.

Rationale

Giving an entirely new version of any widely used software program to our entire user base all at once is unnecessarily fraught with peril.

Instead, let's employ a phased update strategy wherein we provide the updated software to an ever-expanding set of random users. This pool of users will be grown as our confidence in that software update grows, fed by realtime information from the crash database and other potential sources.

This process will be developed in tandem with the efforts to increase testing of packages in -proposed so that we have more confidence in the updates we are pushing out. Phasing adds to that testing the ability to pull an update before a large number of users encounter it.

User stories

As an Ubuntu User
I want to encounter regressions in stable release updates less frequently
so that I can get my work done

As an Ubuntu developer
I want to get feedback about regressions before everyone installs the update
so that I can reduce the number of affected users

update-manager

update-manager will stop displaying and installing an update when the machine isn't in the current testing set for that update.

Therefore update-manager needs to know whether the current machine is in the testing set for each update. The algorithm for calculating this is:

  • Check the package record for an X-Update-Percentage field. If it is not present then the current machine is in the set, so show the update.

  • If the field is present then its value x will be an integer from 0 to 100 giving the percentage of machines that should be in the current testing set. update-manager should calculate the threshold t = x/100 * (2^128 - 1), where 2^128 - 1 is the largest value a 128-bit md5 digest can take.

  • update-manager should calculate md5(machine-id + package name + package version) and treat the 128-bit digest as an unsigned integer. If this value is less than the threshold calculated in the previous step then the machine is in the current testing set for this package.

This algorithm requires only the package record and the machine id to execute, and is fairly fast, so it shouldn't significantly slow down calculating the list of available updates.

It is deterministic, so it will always give the same answer for a particular machine, package and version at a particular threshold. The answer does vary with package and version though, so that the same set of users isn't always the one that finds the regressions.
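
The steps above can be sketched in Python as follows. This is only an illustration under assumptions the spec does not settle: the three strings are concatenated with no separator and encoded as UTF-8, the digest is read as a big-endian unsigned integer, and the function names (phase_threshold, machine_hash, in_testing_set) are hypothetical. How the machine id is obtained is left to the caller.

```python
import hashlib

MAX_MD5 = 2**128 - 1  # largest value a 128-bit md5 digest can take


def phase_threshold(percentage):
    """Return t = x/100 * (2^128 - 1), using integer arithmetic."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return percentage * MAX_MD5 // 100


def machine_hash(machine_id, package, version):
    """md5(machine-id + package name + package version) as an integer.

    Assumption: plain concatenation, UTF-8 encoded.
    """
    data = (machine_id + package + version).encode("utf-8")
    return int(hashlib.md5(data).hexdigest(), 16)


def in_testing_set(machine_id, package, version, percentage=None):
    """Decide whether this machine should be shown the update.

    percentage is the X-Update-Percentage value, or None when the
    field is absent from the package record.
    """
    if percentage is None:
        return True  # no field: everyone gets the update
    value = machine_hash(machine_id, package, version)
    return value < phase_threshold(percentage)
```

Because the comparison is "less than" a threshold that only grows with the percentage, a machine that is in the testing set at one percentage stays in it at every higher percentage. A percentage of 0 excludes every machine.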

Unresolved questions

  • Is the algorithm correct?
  • What do we want to call the new field?
  • What should update-manager do if there is another version of the package available and the algorithm answers no for the latest? For example, a security update and a newer package in -updates. If it won't install the -updates version, should it install the -security version? Probably.
  • If update-manager only pops up once a week, does that make the phased updates rather useless?
  • Given that it is machine-based, someone with multiple machines will see the updates at different times. Is that too confusing? Should there be a way to turn it off (opt in to testing)?
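
One possible reading of the "Probably" above, sketched in Python: walk the available versions from newest to oldest and take the first one the machine is allowed to install. The function name pick_version, the dictionary keys and the in_testing_set callable are all hypothetical, and the question itself remains unresolved.

```python
def pick_version(candidates, in_testing_set):
    """Return the newest candidate this machine may install, or None.

    candidates: list of dicts ordered newest first. Each dict uses the
    hypothetical keys "version", "pocket" and "percentage", where
    "percentage" is None when the package record has no phasing field.
    in_testing_set: callable taking a candidate and returning a bool.
    """
    for candidate in candidates:
        # A candidate without the field is available to every machine.
        if candidate["percentage"] is None or in_testing_set(candidate):
            return candidate
    return None


# Example: a phased -updates version and an older -security version.
available = [
    {"version": "1.2-0ubuntu2", "pocket": "updates", "percentage": 10},
    {"version": "1.2-0ubuntu1.1", "pocket": "security", "percentage": None},
]
# A machine outside the testing set falls back to the -security version.
chosen = pick_version(available, lambda candidate: False)
```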

archive

On the server side the new X-Update-Percentage field needs to be populated when we want to phase an update.

Launchpad will insert this into the package record. It therefore needs to know what value to insert. Where should it be stored in Launchpad?

An API will be added to Launchpad to set the value, and it will be controlled by ubuntu-sru (ubuntu-archive?).

A script will then be run to set these values. When a package is pushed into -updates the script will start to increase the percentage over time, using some to-be-defined function of the age of the package, and possibly the urgency.

There will be an override to that aging that will allow ubuntu-sru to pause the script for a particular package when there are suspected regressions. Once it has been decided what to do, either the package is superseded in -updates, in which case the process will start again for the new version, or the rollout is continued.
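
As a placeholder for the to-be-defined aging function, here is a minimal Python sketch of one step of such a script. The linear ramp, the seven-day default and the name next_percentage are assumptions made purely for illustration; the real rollout curve is an open question, and urgency is not modelled.

```python
def next_percentage(current, age_days, paused=False, ramp_days=7.0):
    """Return the percentage the script would publish next.

    current:   percentage currently in the package record (0 to 100)
    age_days:  time since the package was pushed into -updates
    paused:    True when ubuntu-sru has paused this package
    ramp_days: hypothetical time taken to reach 100 percent
    """
    if paused:
        # The override freezes the rollout at its current value.
        return current
    target = int(100 * age_days / ramp_days)
    # Never lower a percentage that has already been published.
    return max(current, min(100, target))
```

Holding the value while paused, rather than dropping it to 0, means machines that have already been offered the update keep seeing it. Whether that is the desired behaviour is part of the unresolved question about how pausing should work.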

Unresolved questions

  • Where should the information be stored in Launchpad?
  • Who should be allowed to change the value?
  • How should the process be paused for a particular package?
  • What should the rollout curve look like?

Further development

  • An automatic link could be added from the error tracker to the rollout script so that it pauses propagation if there is a spike in crashes with the new version.


Checklist:

  1. Have you reviewed the bug reports for the relevant package? (Yes, this may take an hour or two. But you might be able to fix multiple bugs with a well-designed change.)
  2. If any user interface is involved, is it fully described? Include any wireframes or mockups.
  3. Have you had any new user interface, or new visible text, reviewed by a designer? (Or if you are a designer, have you had it peer-reviewed?)
  4. Is the change accessible? (For example, have you specified accessible labels for any graphic-only elements?)
  5. How will users learn the new way of doing things? Describe any help pages required, and any changes to the Ubuntu Web site or installer slideshow.
  6. Is any migration of data or settings required?
  7. How will the feature be tested? Please add an entry to http://testcases.qa.ubuntu.com/Coverage/NewFeatures for tracking test coverage.

Unresolved issues


CategorySpec