SmallerUpdates

Summary

Smaller Updates: deliver updates as patches rather than as full packages. SUSE is thought to do something similar. This would be valuable even if only as a last resort for dial-up users. The new Mandrake 2006 supports delta RPMs, which contain the binary difference between two versions of a package. Apparently this is how Mandrake now delivers its online updates.

Rationale

Users on slow or expensive network connections (e.g. dial-up or GPRS/3G mobile data) may not have time or money to download updates. Since a typical update only changes a subset of the files in the package, there is good potential for offering updates in a more bandwidth-efficient manner.

Use cases

  1. Gloria has just installed a Breezy system from CD. She has a 56k dial-up connection, but most security updates are larger than 1 MiB, so downloading them all would take about 3 hours at full bandwidth. She has no incentive to download them, although she understands that this exposes her to security risks.

  2. Hans wants to help test the next development release of Ubuntu but cannot afford to, because his internet connection has a monthly traffic limit of 5 GB. Exceeding that limit costs 0,10 € per MB. The last time he ignored the limit, he was presented with an internet bill of 150,00 €.
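To put rough numbers on the two use cases, here is a back-of-the-envelope sketch. It assumes the modem actually reaches its nominal 56,000 bit/s (real dial-up throughput is usually lower) and that Hans's whole bill consisted of overage charges. Neither assumption comes from the use cases themselves.

```python
# Rough figures for the two use cases above.
# Assumption: the modem sustains its nominal 56,000 bit/s.
MODEM_BITS_PER_SECOND = 56_000
MIB = 1024 * 1024


def download_seconds(size_bytes, bits_per_second=MODEM_BITS_PER_SECOND):
    """Time needed to transfer size_bytes at the given line rate."""
    return size_bytes * 8 / bits_per_second


def bytes_transferable(seconds, bits_per_second=MODEM_BITS_PER_SECOND):
    """Number of bytes that fit through the line in the given time."""
    return seconds * bits_per_second / 8


# Gloria: time for a single 1 MiB update, and how much data 3 hours represents.
one_mib_seconds = download_seconds(MIB)
mib_in_three_hours = bytes_transferable(3 * 3600) / MIB

# Hans: traffic beyond the limit implied by the bill.
# Assumption: the entire 150,00 EUR was charged at 0,10 EUR per MB.
# Integer cents avoid floating-point rounding.
bill_cents = 150_00
price_cents_per_mb = 10
overage_mb = bill_cents // price_cents_per_mb

print(f"1 MiB takes about {one_mib_seconds / 60:.1f} minutes")
print(f"3 hours moves about {mib_in_three_hours:.0f} MiB")
print(f"Hans exceeded his limit by about {overage_mb} MB")
```

Under these assumptions, Gloria's 3 hours corresponds to roughly 70 MiB of pending updates, and Hans went about 1.5 GB over his limit.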

Scope

Design

This topic has often been discussed in the Debian community. See, for example, this debian-dpkg thread from 2000 and the associated .debiff proposal, Debian bug #128818, https://wiki.ubuntu.com/APTPackageDeltas and zsync.

(michael.vogt): Debian bug #128818 ("WISH: apt-get update using rsync protocol") is fixed in debian/experimental. The apt there uses an ed-patch mechanism that gives a very good compression rate. Doing diffs for packages (as opposed to the Packages/Sources index files) is a lot harder, and a different mechanism must be applied for that.

(michael.vogt2): Some comments on zsync: http://lists.debian.org/debian-devel/2005/11/msg00023.html. It looks like a big problem is that any system like this must preserve the md5sum of the .deb so as not to break our authentication scheme.
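The md5sum constraint can be illustrated with a small sketch: whatever delta format is used, the client must end up with a byte-identical .deb and should refuse to install anything else. The copy/data delta format below is a toy invented for this example. It is not the ed-patch mechanism in apt, nor the format used by zsync or any other existing tool, and the function names are hypothetical.

```python
import hashlib


def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the new file from the old one plus a list of delta operations.

    Toy format (an assumption for this sketch):
      ("copy", offset, length)  reuse bytes from the old file
      ("data", payload)         insert literal new bytes
    """
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        elif op[0] == "data":
            out += op[1]
        else:
            raise ValueError(f"unknown delta operation: {op[0]!r}")
    return bytes(out)


def rebuild_and_verify(old: bytes, ops, expected_md5: str) -> bytes:
    """Apply the delta and accept the result only if its md5sum matches."""
    new = apply_delta(old, ops)
    actual_md5 = hashlib.md5(new).hexdigest()
    if actual_md5 != expected_md5:
        raise ValueError("rebuilt package does not match the expected md5sum")
    return new


# Stand-ins for two versions of a package file.
old_deb = b"HEADER-v1|unchanged payload|tail"
new_deb = b"HEADER-v2|unchanged payload|tail"

# The delta carries only the changed header and reuses the rest.
delta = [("data", b"HEADER-v2"), ("copy", 9, len(old_deb) - 9)]

# In a real archive the expected checksum would come from the signed index.
expected = hashlib.md5(new_deb).hexdigest()
rebuilt = rebuild_and_verify(old_deb, delta, expected)
```

The point of the design is that the checksum published for the full .deb stays the single source of truth, so the existing authentication path needs no changes as long as reconstruction is exact.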

(SebWills) I would like to add the following idea: given that the user who is downloading an update to package X already has the previous version installed, but has no particular reason to still have the .deb for the installed version, it would make sense to use a system which could use the installed files as the basis for the diff. Put simply, the "update deb" file would contain checksums only (not contents) for files (except config files) which are the same as in the previous version and are therefore expected to be intact on the user's system. The archive would then need to contain a chain of these "update debs" as well as the full .debs. When updating, a client would determine whether there exists in the archive a chain of "update debs" starting from their installed version reaching to the new version, and, if not, fall back to downloading the full .deb.
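The chain lookup described above could be sketched roughly as follows. It assumes the archive can tell the client which "update debs" exist, modelled here as a dictionary mapping each source version to the versions it can be updated to. That index, the version strings and the function names are all hypothetical, and nothing here reflects an existing apt interface.

```python
from collections import deque


def find_update_chain(available, installed, target):
    """Return the shortest list of (from, to) update steps, or None.

    available: dict mapping a version to the versions reachable from it
               through a single "update deb" (hypothetical archive index).
    """
    if installed == target:
        return []
    previous = {installed: None}  # version -> version it was reached from
    queue = deque([installed])
    while queue:
        version = queue.popleft()
        for nxt in available.get(version, ()):
            if nxt in previous:
                continue
            previous[nxt] = version
            if nxt == target:
                # Walk back to the installed version to recover the steps.
                chain = []
                while previous[nxt] is not None:
                    chain.append((previous[nxt], nxt))
                    nxt = previous[nxt]
                chain.reverse()
                return chain
            queue.append(nxt)
    return None


def plan_download(available, installed, target):
    """Prefer a chain of update debs; otherwise fall back to the full .deb."""
    chain = find_update_chain(available, installed, target)
    if chain is None:
        return ("full", target)
    return ("updates", chain)


# Hypothetical archive contents for one package.
update_debs = {"1.0-1": ["1.0-2"], "1.0-2": ["1.0-3"]}

plan_for_recent = plan_download(update_debs, "1.0-1", "1.0-3")
plan_for_old = plan_download(update_debs, "0.9-1", "1.0-3")
```

Here `plan_for_recent` lists two update steps, while `plan_for_old` falls back to the full .deb because no chain starts at the installed version. A real client would also have to verify the checksums of the installed files before relying on them, and might prefer the full .deb when the chain is long enough to cost more than a single download.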

See https://wiki.ubuntu.com/PaulSladen/Succinct for various expansions on ideas and https://wiki.ubuntu.com/apt-sync.

(Alezaro) Another tool is "debdelta", which has been released in Debian unstable: http://packages.debian.org/unstable/devel/debdelta

Implementation

Code

Data preservation and migration

Outstanding issues

BoF agenda and discussion


CategorySpec

SmallerUpdates (last edited 2008-08-06 16:18:35 by localhost)