Summary

RAIDs (redundant arrays of independent disks) allow a system to keep functioning even if some of its disks fail. You simply use more than one disk side by side. If a disk fails, the mdadm monitor triggers a buzzer, a notify-send message or an email, notifying the admin that a (new spare) disk has to be added to restore redundancy. All the while the system keeps working unaffected.

Release Note

Event-driven, purely UUID-based and secure raid/crypt assembly. (Hotplugging that supports booting more than just the simple "root filesystem directly on md device" case when arrays are degraded.)

Rationale

Unfortunately, Ubuntu's md (software) raid configuration suffers from a number of gaps and incomplete transitions.

The assembly of arrays with "mdadm" has been transitioned from the Debian startup scripts to the hotplug system (udev rules); however, some bugs defeat the hotplug mechanism, and functionality that is generally expected (i.e. that just works in other distros) is missing in Ubuntu:

  1. No handling of raid degradation during boot for non-root filesystems (e.g. /home) at all. (Boot simply stops at a recovery console.)
    • The Debian init script has been removed, but no upstart job has been created to start/run necessary regular (non-rootfs) arrays degraded. 259145 non-root raids fail to run degraded on boot

  2. Only limited and buggy handling of raid degradation for the rootfs. (Works only for plain md's without lvm/crypt, and with 9.10 only after applying a fix from the release notes.)
    • 539597 bogus debconf question "mdadm/boot_degraded"

    • The initramfs boot process is not a state machine capable of assembling the base system from devices appearing in any order, nor of starting necessary raids degraded if they are not complete after some time.
      • 491463 upstart init within initramfs (Could handle most of the following nicely by now.)

      • 251164 boot impossible due to missing initramfs failure hook integration

      • 247153 encrypted root initialisation races/fails on hotplug devices (does not wait)

      • 488317 installed system fails to boot with degraded raid holding cryptdisk

  3. No notification of users/admins about raid events, e.g. disk failures. (The email question is suppressed during install without any buzzer/notify-send replacement.)
    • 535417 mdadm monitor feature broken, not depending on local MTA/MDA or using wall/notify-send

  4. Blocked and flawed hotplugging mechanisms.
    • Mdadm sets up arrays according to unreliable superblock information. (Device "minor" numbers, labels and hostnames in superblocks are not guaranteed to be unique and can be outdated.) This is combined with the idea of fixing the unreliability by limiting array assembly with information from mdadm.conf (defining PARTITIONS, ARRAY, HOMEHOST lines). Consequently this forces setup tools, admins and installers to create mdadm.conf files and subjects them to the exact same reliability problems. The only thing mdadm can (and should) rely on when assembling is the high probability of uniqueness of UUIDs. (I.e. don't rely on admins, tools or install scripts to set up an mdadm.conf; use only UUIDs as references for device nodes/userspace. See the configuration sketch after this list.)

      Mdadm reads/depends on an /etc/mdadm/mdadm.conf file (also in the initramfs!). It refuses to assemble any array that is neither mentioned there nor tagged with the local hostname in its superblocks. This behaviour actually breaks autodetection even of arrays newly created on the system, as well as connecting a (complete) md array from another system. (mdadm does not default to simply assembling matching superblocks and running arrays (only) if they are complete.) For instructions on updating the initramfs manually refer to: http://ubuntuforums.org/showthread.php?p=8407182

    • This causes a large number of filed bug reports (tagged [->UUIDudev]), plus:

    • 252345 raid setups fail due to mdadm.conf with explicit ARRAY statements and HOMEHOST !=any

    • 136252 mdadm.conf w/o ARRAY lines but udev/mdadm not assembling arrays

    • 550131 initramfs missing /var/run/mdadm dir (losing state)

    • 576147 if array is given a name, a strange inactive md device appears instead of the one created upon reboot

  5. mdadm not included on live CD
    • 44609 RAID not implemented in ubiquity (use alternate CD instead)
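
As an illustration of the UUID-only referencing argued for in item 4 above, a minimal mdadm.conf could look roughly like this sketch; the UUID value is a placeholder, and HOMEHOST <ignore> is just one way to disable hostname-based filtering:

  # /etc/mdadm/mdadm.conf -- minimal sketch, UUID-only references
  DEVICE partitions             # consider all partitions listed in /proc/partitions
  HOMEHOST <ignore>             # do not restrict assembly to arrays tagged with this hostname
  # Identify the array solely by its UUID; names, minor numbers and hostnames
  # in the superblocks are not used for identification.
  ARRAY /dev/md0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx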

Regarding booting with degraded raids: Note that no problem really arises in a hotpluggable system if an array required to boot is run degraded after a reasonable timeout and a missing drive comes up later. The drive can simply be (re-)added to the array (and will be synced in the background if any writes have occurred in the meantime). The admin, however, should get a notification not only if a drive did not come up in time, but in any case of drive failure.
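
For illustration, (re-)integrating a drive that came up late, or adding a replacement disk, is a routine operation on the running system (the device names /dev/md0, /dev/sdb1 and /dev/sdc1 are just examples):

  # Re-add a component that appeared after the array was started degraded;
  # it is resynced in the background if writes have happened in the meantime.
  mdadm /dev/md0 --re-add /dev/sdb1
  # Add a new (spare/replacement) disk to restore redundancy:
  mdadm /dev/md0 --add /dev/sdc1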

There really isn't any problem that would require BOOT_DEGRADED=NO or a rescue console/repair prior to boot just because a disk happens to fail *while the system is powered down*. There is, however, a problem of not notifying anybody in all other cases of disk failure. (The system stays running without any notification about lost redundancy and will happily reboot straight up in those cases, regardless of BOOT_DEGRADED.)
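
mdadm's monitor mode already provides hooks for such notifications; what is missing is a sane default that does not silently depend on a local MTA. A sketch of the relevant pieces (the notification script path is a hypothetical example):

  # /etc/mdadm/mdadm.conf (excerpt)
  MAILADDR root                       # mail notification (requires a working local MTA/MDA)
  PROGRAM /usr/local/sbin/md-notify   # hypothetical script, called by the monitor with the
                                      # event name, the md device and possibly a component device

  # Monitor daemon watching all arrays for failure/degradation events:
  mdadm --monitor --scan --daemonise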

There are tasks that do require admin action *after* the raid has done what it is designed to do (keep the system working unaffected, preventing data loss in case of failure).

Those tasks are generally best done with the system either fully up and running, or powered down completely. The boot process should never need to be stopped (and should not be) for something (like adding and syncing a spare drive to the raid) that is specifically designed to be done on a live system. Besides, stopping the boot breaks the automatic activation of spare disks supported by mdadm.

Use Cases

Design

Event-driven degradation handling for non-root filesystems should be possible with a configuration change to the mdadm package that hooks it into upstart, so a raid is started degraded if it hasn't fully come up after a timeout. (This would appropriately replace the second mdadm init.d script present in the Debian package, instead of just dropping it.)
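
A minimal sketch of such an upstart hook, assuming a hypothetical job name, trigger event and a 30 second grace period; "mdadm --incremental --run --scan" starts arrays that are still only partially assembled:

  # /etc/init/mdadm-degraded.conf -- hypothetical upstart job
  description "start still-incomplete md arrays degraded after a timeout"
  start on stopped udevtrigger        # assumed event: coldplug events have been replayed
  task
  script
      sleep 30                        # assumed grace period for slow devices
      mdadm --incremental --run --scan
  end script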

Cryptsetup is already set up event-driven with upstart during boot. (It is also triggered by udev events at the desktop level, but not yet at the system level.)

Initramfs:

We need an event-driven boot in the initramfs as well. The current initramfs scripts and their failure hooks are very limited and too complicated to handle the general case. Reimplementing an event-based boot within the initramfs scripts can be avoided by using upstart in the initramfs too (to set up crypt and auth devices, raid, lvm, ... and mount the rootfs).
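
The core of such an event-driven assembly (in the initramfs as well as in the running system) is a udev rule that feeds every raid member that appears to mdadm's incremental mode, so arrays are started as soon as they are complete. A rough sketch (rule file path and match keys are assumptions):

  # /lib/udev/rules.d/85-mdadm.rules -- sketch
  # Hand every block device identified as a raid member to incremental assembly.
  SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", \
      RUN+="/sbin/mdadm --incremental $env{DEVNAME}"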

Implementation

"mountall" functionality related:

UI Changes

None necessary.

Code Changes

Code changes should include an overview of what needs to change, and in some cases even the specific details.

Migration

Include:

Test/Demo Plan

It's important that we are able to test new features, and demonstrate them to users. Use this section to describe a short plan that anybody can follow that demonstrates the feature is working. This can then be used during testing, and to show off after release. Please add an entry to http://testcases.qa.ubuntu.com/Coverage/NewFeatures for tracking test coverage.

This need not be added or completed until the specification is nearing beta.

Unresolved issues

This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.


CategorySpec
