ReliableRaid


Summary

RAIDs (redundant arrays of independent disks) allow systems to keep functioning even if some parts fail: you simply run more than one disk side by side. If a disk fails, the mdadm monitor will trigger a buzzer, a notify-send message, or an email to signal that a new (spare) disk has to be added to restore redundancy. All the while the system keeps working unaffected.
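
The notification path mentioned above is configured through mdadm's monitor mode in /etc/mdadm/mdadm.conf; a minimal sketch (the mail address and script path are illustrative placeholders, not part of this spec):

```
# /etc/mdadm/mdadm.conf -- monitoring-related lines (illustrative values)
MAILADDR root@example.com          # where "mdadm --monitor" mails failure events
PROGRAM  /usr/local/bin/raid-alert # hypothetical hook, e.g. wrapping notify-send
```

The monitor daemon (started by the mdadm init script) reads these keywords and invokes the PROGRAM for events such as Fail or DegradedArray.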

Release Note

Event-driven raid/crypt setup. (General hotplugging ability, with support for booting degraded from more than just a simple root-on-mdX setup.)

Rationale

Unfortunately, Ubuntu's md (software) raid configuration suffers from some incompleteness.

The assembling of arrays with mdadm has been moved from the Debian startup scripts to the hotplug system (udev rules). However, some bugs defy the hotplug mechanism, and two things that are generally expected (as in "just works" in other distros) are missing in Ubuntu:

1. No handling of raid degradation during boot for non-root filesystems at all. (Boot simply stops at a recovery console.)

  • There is no init script at all to start/run necessary regular (non-rootfs) arrays degraded. (Bug 259145: non-root raids fail to run degraded on boot.)

2. Only limited and buggy handling of raid degradation for the rootfs. (Working only for plain md devices without lvm/crypt, and only after applying a fix from the 9.10 release notes.)

  • The initramfs boot process is not a state machine: it cannot assemble the base system from devices appearing in any order, nor start necessary raids degraded if they are still incomplete after some time.
    • 491463 upstart init within initramfs (Could handle most of the following nicely by now.)

    • 251164 boot impossible due to missing initramfs failure hook integration

    • 136252 mdadm, initramfs missing ARRAY lines

    • 247153 encrypted root initialisation races/fails on hotplug devices (does not wait)

    • 488317 installed system fails to boot with degraded raid holding cryptdisk

    • The proper mdadm --incremental option does not work in the initramfs (it does not create device nodes). 251663
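
For reference, the hotplug path that the bugs above exercise is a udev rule along these lines (the file name and exact match keys vary between releases; this is a sketch, not the shipped rule verbatim):

```
# /lib/udev/rules.d/85-mdadm.rules (sketch)
# Feed every block device carrying a raid superblock to mdadm, which adds it
# to the matching array and starts the array once it is complete.
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", \
    RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
```

Note that --incremental deliberately refuses to start an incomplete array, which is why a separate timeout-driven fallback is needed for degraded boots.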

Use Cases

  • Angie installs ubuntu on a raid stripe for the rootfs (/) and a raid mirror with lvm for /home and swap. If one of the raid mirror members fails or is detached while the system is powered down, the system waits 20 seconds (default) for the missing member, then resumes booting with a degraded raid. When the mirror member is reattached later on (hotpluggable interface), it gets automatically synced in the background.
  • Bono does the same but uses lvm on top of cryptsetup on the raids.

Design

Event-driven degradation handling for mdadm should be possible with a simple configuration change to the mdadm package, hooking it into upstart so that a raid is started degraded if it hasn't fully come up after a timeout. (This would appropriately replace the second mdadm init.d script present in the Debian package, instead of just dropping it.)

  • cryptsetup is already set up event-driven (with upstart, not yet with udev)
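
A minimal sketch of what such an upstart hook could look like; the job name, trigger event, and 20-second grace period are assumptions for illustration, not decisions made by this spec:

```
# /etc/init/mdadm-degraded.conf (hypothetical upstart job, sketch only)
# After a grace period, force any array that is still inactive to run degraded.
start on stopped udevtrigger

script
    sleep 20                       # grace period for late-appearing members
    for md in /dev/md*; do
        mdadm --run "$md" || true  # no-op if the array is already active
    done
end script
```

mdadm --run on a partially assembled array starts it degraded, so arrays that came up complete earlier via udev are left untouched.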

For event-driven behaviour in the initramfs: the initramfs scripts and their failure hooks look like way too much work and overcomplicate things. An event-based boot would have to be reimplemented within the initramfs scripts instead of using upstart to set up (crypt, raid, lvm, ..., and) the rootfs from the initramfs. It would be good to adapt the upstart approach taken for 1) to set up the rootfs within the initramfs.

 • cryptsetup will need to be converted to the event-driven setup in the initramfs

Implementation

  • The proper command (i.e. for boot scripts) to start *only specific* hotpluggable raids degraded (i.e. the rootfs, after a timeout, from the initramfs) may not be available. 251646 (Possible workaround: removing a member and re-adding it with --incremental --run.)

  • Using the legacy method to start an array degraded will break later --incremental (re)additions from udev/hotplugging.

  • The command "mdadm --incremental --scan --run" to start *all remaining* hotpluggable raids degraded (something to execute only manually, if at all!) does not start anything. 244808

  • mdadm still reads/depends on a static /etc/mdadm/mdadm.conf file containing UUIDs (even in the initramfs!). It refuses to assemble any hotplugged array that is not mentioned there and tagged with the host's own hostname. (It does not default to simply assembling matching superblocks and running arrays (only) if they are complete.) This behaviour breaks the autodetection of every array newly created on a system, as well as plugging in a (complete) md array from another system. 252345 For updating that initramfs refer to: http://ubuntuforums.org/showthread.php?p=8407182

  • Ubuntu should make use of partitionable (/dev/md_dX type) arrays.
  • The Ubuntu Server manual claims that "If the array has become degraded, due to the chance of data corruption, by default Ubuntu Server Edition will boot to initramfs after thirty seconds. Once the initramfs has booted there is a fifteen second prompt giving you the option to go ahead and boot the system, or attempt manual recover." However:
    • The kernel will never autostart a raid that does not reproduce a correct checksum, degraded or not. There is nothing to manually recover about a degraded raid before it can be started; a recovery console is appropriate *after* starting a raid degraded has failed.
    • A replacement disk can be added at any time if a spare disk isn't already installed from the beginning. If the disks are not connected over a hotpluggable interface, the system must be powered down for this. (A recovery console is also pointless in this case.)
    • If a drive fails while the system is powered up, by default nobody is notified, and the system will simply reboot degraded afterwards anyway. (Reason: 244810, inconsistency with the --no-degraded option.)

    • The boot process will usually not be stopped (and should not be) for something that is designed to be done on a live system, like adding and syncing a new drive into the raid (quite a good default).
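
The workaround hinted at above (removing a member and re-adding it with --incremental --run) would look roughly like the following transcript. Device names are examples and this is an untested illustration against a real degraded array, not a supported procedure:

```
# Illustrative only: start /dev/md0 degraded without falling back to the
# legacy assembly method (which would break later --incremental re-additions).
mdadm --stop /dev/md0                 # tear down the partially assembled array
mdadm --incremental --run /dev/sda1   # re-feed the present member; --run lets
                                      # the array start even though incomplete
```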

UI Changes

None?

Code Changes

Code changes should include an overview of what needs to change, and in some cases even the specific details.

Migration

Include:

  • data migration, if any
  • redirects from old URLs to new ones, if any
  • how users will be pointed to the new way of doing things, if necessary.

Test/Demo Plan

It's important that we are able to test new features, and demonstrate them to users. Use this section to describe a short plan that anybody can follow that demonstrates the feature is working. This can then be used during testing, and to show off after release. Please add an entry to http://testcases.qa.ubuntu.com/Coverage/NewFeatures for tracking test coverage.

This need not be added or completed until the specification is nearing beta.

Unresolved issues

This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.


CategorySpec

ReliableRaid (last edited 2015-01-28 01:12:46 by penalvch)