KarmicAufsUpdateManager

Revision 3 as of 2009-06-04 09:34:03


Summary

In jaunty we started exploring how aufs can be used to test upgrades. For karmic this should go further to see if we can perform actual upgrades based on aufs that then sync back the writeable snapshot to the real filesystem (after upgrade or after the first boot).

During UDS it turned out that aufs is being dropped in karmic, so an alternative approach needs to be researched.

Release Note

This section should include a paragraph describing the end-user impact of this change. It is meant to be included in the release notes of the first release in which it is implemented. (Not all of these will actually be included in the release notes, at the release manager's discretion; but writing them is a useful exercise.)

It is mandatory.

Rationale

Our packaging tools are not transactional by design. Because a maintainer script can run arbitrary shell commands and alter the system in any way it wants, it is impossible to provide a rollback feature at the package level. Filesystem snapshots help to fix this issue.

User stories

Joe is upgrading his Ubuntu install and the universe package foo causes a failure. With the new snapshot-based system the upgrade will roll back to the last good state, and Joe can wait for the fix for foo.

Design

Given how much flux there still is in the space of kernel-based union mounting, we need to write an abstract interface that can then be used to implement whichever solution the kernel decides on in the end. For now we should explore the options (vfs-based unionfs, fuse) and see what works best.

Performance of aufs is an issue (we need real benchmark numbers here), and we need to talk about what a UI could look like if we want to give the user the option to test-boot into an aufs-upgraded system that has not been synced to the real system yet.

We need to fix https://bugs.launchpad.net/bugs/342451 in the process and work out how the free space calculation can be done. We still need to provide regular upgrades without aufs, e.g. for the linux-image-server kernels (in hardy at least they do not have an aufs module).
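The free space question could be approached roughly as follows. This is only a sketch under assumptions: the function names and the `safety_margin` parameter are hypothetical, and it assumes the writable snapshot branch is an ordinary directory whose contents must fit on the real filesystem when synced back.

```python
import os
import shutil
import stat


def tree_size(root):
    """Sum the sizes of regular files below root (symlinks are not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished while walking
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def has_enough_space(path, required_bytes, safety_margin=0.1):
    """Return True if the filesystem holding `path` has room for required_bytes.

    safety_margin is a hypothetical fraction of extra headroom.
    """
    free = shutil.disk_usage(path).free
    return free >= int(required_bytes * (1 + safety_margin))
```

A sync-back step might call `has_enough_space("/", tree_size(writable_branch))` before copying anything. File sizes only approximate real disk usage (block rounding, sparse files), so the margin is there to absorb that error.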

Implementation

Writing an abstract interface that can then be used to implement the various backends (aufs, fuse-union) should be done this cycle. Then we need to explore the options we have for karmic and see what the kernel will use.
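A minimal sketch of what such an abstract interface might look like. All class and method names here are hypothetical and not taken from the upgrader code; the aufs backend is left as a stub, and the fallback backend stands for the regular upgrade without a snapshot.

```python
import abc


def kernel_supports(fstype, proc_filesystems="/proc/filesystems"):
    """Check whether the running kernel currently lists fstype.

    Note: a filesystem built as a module that is not loaded yet will not
    show up here, so a real check would need to try loading it first.
    """
    try:
        with open(proc_filesystems) as f:
            return any(line.split()[-1] == fstype for line in f if line.strip())
    except OSError:
        return False


class SnapshotBackend(abc.ABC):
    """Hypothetical interface every snapshot backend would implement."""

    name = "abstract"

    @classmethod
    @abc.abstractmethod
    def is_available(cls):
        """Return True if this backend can be used on the running system."""

    @abc.abstractmethod
    def create(self, target):
        """Set up a writable snapshot on top of target."""

    @abc.abstractmethod
    def commit(self):
        """Sync the writable snapshot back to the real filesystem."""

    @abc.abstractmethod
    def rollback(self):
        """Discard the snapshot and return to the last good state."""


class AufsBackend(SnapshotBackend):
    name = "aufs"

    @classmethod
    def is_available(cls):
        return kernel_supports("aufs")

    def create(self, target):
        raise NotImplementedError("stub: mount an aufs overlay on target")

    def commit(self):
        raise NotImplementedError("stub: copy the writable branch back")

    def rollback(self):
        raise NotImplementedError("stub: drop the writable branch")


class NullBackend(SnapshotBackend):
    """Regular upgrade without any snapshot; rollback is not possible."""

    name = "none"

    @classmethod
    def is_available(cls):
        return True

    def create(self, target):
        pass

    def commit(self):
        pass

    def rollback(self):
        raise RuntimeError("no snapshot was taken, cannot roll back")


def pick_backend(candidates):
    """Return an instance of the first available backend, else NullBackend."""
    for cls in candidates:
        if cls.is_available():
            return cls()
    return NullBackend()
```

With this shape, the upgrader would only ever talk to `pick_backend([AufsBackend, ...])`, so a fuse-based backend could be added later without touching the upgrade logic.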

Test/Demo Plan

This change should be part of the regular automatic upgrade testing. A new profile is created that enables the filesystem snapshot feature as part of the test.

BoF agenda and discussion

Better error handling for failing upgrades

In jaunty we started exploring how aufs can be used to test upgrades. For karmic this should go further to see if we can perform actual upgrades based on aufs that then sync back the writeable snapshot to the real filesystem (after upgrade or after the first boot). Performance of aufs is an issue (we need real numbers here too), and we need to talk about what a UI could look like if we want to give the user the option to test-boot into an aufs-upgraded system that has not been synced to the real system yet.

We need to fix https://bugs.launchpad.net/bugs/342451 in the process and work out how the free space calculation can be done. We still need to provide regular upgrades without aufs, e.g. for the linux-image-server kernels (in hardy at least they do not have an aufs module).

aufs in karmic?

Very likely

Options

- unionfs (got some problems in the past)
- aufs (patch it in again)
- dm-snapshot
- fs with snapshot support (not going to happen)
- unionfs-fuse (slow, locking issues?)
- custom engineering (write a special upgrade-fuse-fs)
- deltafs
- clicfs
- fuse results in significant performance degradation (10x slower?) and also uses more memory
- fuse needs more memory than in-kernel solutions
- provide various ways to create the snapshot depending on what is available on the system (aufs, unionfs-fuse)
- after the upgrade, compute the delta between final and prev and save that somewhere for later rollback and tar it up
- disk space overhead for maintaining snapshots? Think of SSDs or other drives with a limited capacity.
- Compressing snapshots is a possibility; that may result in longer upgrade times however
- integrate with friendly-recovery
- integrate with apport on recovery (save a bug with the failed upgrade logs for later submission)
- integrate with apport, kernel can not boot, etc and give the option to recover then

* we need a solution for 8.04 -> 10.04
* look at the problem again at 10.04
* write abstraction layer
* do benchmarks
* integrate rollback in the admin menu

- good candidate for announcement notes - to get visibility that upgrades are safe
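The "compute the delta and tar it up" idea from the list above could be sketched like this. The function names are hypothetical, and the sketch only handles regular files: symlinks, directories, permissions and ownership are ignored. It assumes both the previous and the final state are visible as directory trees.

```python
import filecmp
import os
import tarfile


def compute_delta(prev_root, final_root):
    """Return (changed, added) lists of paths relative to the tree roots.

    changed: files from prev_root that were modified or removed by the upgrade.
    added:   files that exist only in final_root.
    """
    changed, added = [], []
    for dirpath, _dirs, files in os.walk(prev_root):
        rel_dir = os.path.relpath(dirpath, prev_root)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            old = os.path.join(prev_root, rel)
            new = os.path.join(final_root, rel)
            if os.path.islink(old):
                continue  # symlinks are out of scope for this sketch
            if not os.path.isfile(new) or not filecmp.cmp(old, new, shallow=False):
                changed.append(rel)
    for dirpath, _dirs, files in os.walk(final_root):
        rel_dir = os.path.relpath(dirpath, final_root)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            if not os.path.lexists(os.path.join(prev_root, rel)):
                added.append(rel)
    return sorted(changed), sorted(added)


def save_rollback_archive(prev_root, changed, archive_path):
    """Store the pre-upgrade version of every changed file in a gzipped tar."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for rel in changed:
            tar.add(os.path.join(prev_root, rel), arcname=rel)
```

A rollback would then unpack the archive over the system and delete the files listed in `added`. Using `"w:gz"` trades upgrade time for disk space, which is the compression trade-off noted in the list; `"w"` would skip compression.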

open issues

- encrypted dirs
- where to show the "rollback" option
- aufs needs backport of dpkg for bug #342451
- python-fuse not in main
- fuse would need arch: all component as part of the upgrader


CategorySpec