HardyUbiquityReliability


Summary

We will add code to ensure that Ubiquity is more resistant to complete failure and gives the user more control in the event of failure. We will also check the error handling of complex components, like migration-assistant.

Release Note

Major work has gone into making Ubiquity more reliable.

Rationale

There are a number of cases where Ubiquity could allow the user to continue but currently does not, such as when migration-assistant or grub-installer fails. There have also been a number of bugs that more thorough defensive checks could have prevented.

Use Cases

  • Mark tries to install the bootloader to the Linux device name /dev/sda, which does not exist. Rather than crashing at the grub-install step, the installer prevents him from making this incorrect choice.
  • Evan tries to install Ubuntu, and the migration-assistant step fails due to an unforeseen bug. Rather than needing to start the installer again, he presses "Skip" and is able to carry on.

Design

  • Allow the user to repeat steps on failure.
  • Audit migration-assistant for error handling.
  • Check file copy integrity.
  • Allow the user to retry the bootloader installation, and possibly use EDD.
  • Insufficient file space check.
  • Possibly port the d-i lowmem check to Ubiquity.
  • Do a pass over the bugs with the most duplicates.

Implementation

  • Allow the user to repeat steps on failure.

We will add a "report, retry, skip" dialog to the step handling code. When a step fails, the dialog will let the user report a bug, run the step again, or skip the step entirely.
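The control flow behind such a dialog might look like the following minimal sketch. It is not Ubiquity's actual step handling code: `run_step`, `Choice`, `StepFailed`, and the `ask_user` callback are hypothetical names introduced here for illustration, and the callback stands in for the dialog itself.

```python
from enum import Enum


class Choice(Enum):
    """The three options offered when a step fails (hypothetical names)."""
    REPORT = "report"
    RETRY = "retry"
    SKIP = "skip"


class StepFailed(Exception):
    """Raised when the user chooses to report the failure instead of continuing."""


def run_step(step, ask_user):
    """Run step(); if it raises, ask the user how to proceed.

    ask_user(exc, attempts) is assumed to show the dialog and return a Choice.
    Returns True if the step eventually succeeded, False if it was skipped.
    """
    attempts = 0
    while True:
        try:
            step()
            return True
        except Exception as exc:
            attempts += 1
            choice = ask_user(exc, attempts)
            if choice is Choice.RETRY:
                continue  # run the same step again
            if choice is Choice.SKIP:
                return False  # carry on with the next step
            # Choice.REPORT: hand the original error to the bug reporting path.
            raise StepFailed(str(exc)) from exc
```

Passing the attempt count to the callback would let the dialog stop offering "retry" after repeated failures, although the sketch itself does not enforce a limit.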

  • Audit migration-assistant for error handling.

We will modify the code in ma-ask and ma-apply to ensure that they save and restore IFS correctly and that they handle errors at every stage.

  • Check file copy integrity.

The install code will be modified to check the md5sums of copied files. Since it already reads each file into memory in order to copy it, it should be possible to do this relatively efficiently. If there is an md5 mismatch, the user will be presented with the option of retrying the copy or aborting the install. Because the check will slow down the installer to some extent, a preseed option to skip it will be added for OEMs.
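One way to compute the checksum without reading each file a second time is to feed every chunk to the hash as it is copied. The sketch below illustrates that idea only; `copy_with_md5` and `copy_and_verify` are hypothetical helpers, and the expected checksum is assumed to come from some manifest that the spec does not name.

```python
import hashlib


def copy_with_md5(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path and return the MD5 hex digest of the data read."""
    digest = hashlib.md5()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)  # hash the chunk that is already in memory
            dst.write(chunk)
    return digest.hexdigest()


def copy_and_verify(src_path, dst_path, expected_md5):
    """Return True if the data that was copied matches expected_md5.

    expected_md5 is assumed to be a lowercase hex string from a manifest.
    """
    return copy_with_md5(src_path, dst_path) == expected_md5
```

A caller would offer "retry" or "abort" when `copy_and_verify` returns False. Note that this detects bad reads from the source medium; confirming what actually reached the target disk would need a second read of the destination.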

  • Allow the user to retry the bootloader installation, and possibly use EDD.

Ubiquity will be modified to validate, using regular expressions, the grub install location string that the user enters in the advanced dialog. The text box will become a combo box offering a list of choices built from the available Linux device names and from os-prober output. For example:

- Seagate 30GB
- Partition 1 (Windows)
- Partition 2 (Ubuntu)

Known problem devices, such as those with XFS filesystems, will be excluded from the list.
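The validation and filtering described above could be sketched roughly as follows. The patterns are assumptions made for illustration: they accept only simple device nodes such as /dev/sda1 and GRUB legacy names such as (hd0,1), and would need extending for other naming schemes. The function names, the tuple layout, and the excluded set are likewise hypothetical.

```python
import re

# Assumed accepted forms (simplified): /dev/sda, /dev/sda1, (hd0), (hd0,1).
LINUX_DEVICE_RE = re.compile(r"/dev/[a-z]+[0-9]*")
GRUB_DEVICE_RE = re.compile(r"\(hd[0-9]+(,[0-9]+)?\)")

# Filesystems left out of the list; the exact set is an assumption.
EXCLUDED_FILESYSTEMS = {"xfs", "swap"}


def is_valid_location(text, known_devices):
    """Return True if text looks like a usable bootloader location.

    Linux device names must also be present in known_devices, so a
    well-formed name for a device that does not exist is rejected.
    """
    text = text.strip()
    if GRUB_DEVICE_RE.fullmatch(text):
        return True
    return bool(LINUX_DEVICE_RE.fullmatch(text)) and text in known_devices


def bootloader_choices(devices):
    """Build combo box entries from (device, label, filesystem) tuples.

    filesystem is None for whole disks; excluded filesystems are dropped.
    """
    return [(dev, label) for dev, label, fs in devices
            if fs not in EXCLUDED_FILESYSTEMS]
```

Checking the typed value against the known device list, and not just against the pattern, is what would catch the /dev/sda case from the use cases.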

We will also evaluate using EDD to better handle the Linux and grub device mapping issues that have occurred in the past. When EDD has been enabled in the kernel in the past, it has led to boot delays on some hardware, so we may need to attempt to do this from userspace using libx86.

  • Insufficient file space check.

The partitioner will be modified to check whether the combined size of the partitions, excluding /home, is less than the size of the read-only filesystem plus a safety margin, and to warn the user if it is.
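As a rough illustration of that comparison, the sketch below sums the sizes of every mount point other than /home and compares the total against the read-only filesystem size plus a margin. The function name, the mapping of mount points to sizes, and the 256 MiB default margin are all assumptions, not values taken from the partitioner.

```python
def has_enough_space(partitions, rofs_size, margin_bytes=256 * 1024 ** 2):
    """Return True if the target partitions can hold the installed system.

    partitions maps mount point to size in bytes; /home is ignored because
    the system files are not copied there. margin_bytes is a hypothetical
    allowance on top of the read-only filesystem size.
    """
    available = sum(size for mount, size in partitions.items()
                    if mount != "/home")
    return available >= rofs_size + margin_bytes
```

For example, a 2 GiB root partition alongside a large /home would fail the check against a 2 GiB read-only filesystem, because /home does not count toward the total.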

  • Possibly port the d-i lowmem check to Ubiquity.

We will evaluate porting the d-i lowmem component check to Ubiquity.

  • Do a pass over the bugs with the most duplicates.

Using Brian Murray's report of the bugs with the most duplicates, along with other generated reports, we will make a pass over the highest profile bugs in Ubiquity (that are not already covered here) and work on fixing them or mitigating the risks they pose.

Test/Demo Plan

This will be tested by trying to break the installer in the ways that commonly reported bugs have. For example, one test would be to enter garbage data in the grub install location field.

BoF agenda and discussion

repeating steps or skipping them on failure.
- report, retry, skip

validation in bootloader installer
- menu of choices based on the available linux device names
  - Seagate 30GB
  - Partition 1 (Windows)
  - Partition 2 (Ubuntu)
  - gives us the opportunity to exclude swap
  - exclude known problem devices (XFS, etc)

ma-ask
 - should be audited for error handling

OLD_IFS="$IFS"
IFS=new
for ...; do
        IFS="$OLD_IFS"
        ...
        IFS=new
done
IFS="$OLD_IFS"

checking that we copied files correctly (md5sum)
 - retrying file copy
 - parallel md5sum checking
 - should be preseedable away

bootloader installation failure
 - retry option
 - Colin to talk to kernel team about EDD (saving the drive list from the BIOS)

< 3gb check, warning if the sum of the partitions excluding /home is less than the size of the /rofs + a bit.
d-i lowmem check for ubiquity?

pass on bugs with the most duplicates

unrelated, but still important:
- automatically launching the report bug interface when in failsafe X.
- d-i post-installation report-bug can be hooked into apport properly.


CategorySpec

HardyUbiquityReliability (last edited 2008-08-06 16:14:58 by localhost)