2012-06-05-grub-in-cloud-images

Owner: DaveWalker

Incident Description

The update for https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/978464 appears to have exposed a bug in either grub or apt that is breaking scripts based on the Amazon images from cloud.ubuntu.com. dpkg is giving the upgraded config file prompt upon configuring grub-pc ("A new version of configuration file /etc/default/grub is available, but the version installed currently has been locally modified.")

Crisis Response Team

  • Scott Ritchie
  • Christopher James Halse Rogers
  • Kate Stewart
  • Martin Pitt
  • Steve Langasek
  • Dave Walker

Events

All times are in UTC.

  • 2012-06-06 02:50 - On #ubuntu-devel, Scott Ritchie (YokoZar1) reported a regression with grub on the AMI images.

  • 2012-06-06 03:05 - Scott Kitterman flagged !regression-alert (note: the ubottu notify list is out of date and needs a scrub)
  • 2012-06-06 03:08 - Chris started investigation.
  • 2012-06-06 03:08 - Kate starts the incident report and pings cloud team members (smoser, utlemming)
  • 2012-06-06 03:09 - Scott R opens Bug 1009294 to track issue with https://launchpad.net/ubuntu/+source/grub2/1.99-21ubuntu3.1

  • 2012-06-06 03:18 - Chris determines that the issue is with the ucf-managed configuration file. The workaround is to set the environment variable UCF_FORCE_CONFFNEW (or UCF_FORCE_CONFFOLD), which should bypass the prompt and silently install the new file (or keep the old one).
  • 2012-06-06 03:29 - Chris adds workaround to http://askubuntu.com/questions/146921/how-do-i-apt-get-y-dist-upgrade-without-a-grub-config-prompt

  • 2012-06-06 03:55 - Chris pulls Martin into brainstorming on what reasonable solutions might be.
  • 2012-06-06 04:04 - Martin recommends reverting the grub SRU and replacing it with an SRU that only modifies the grub configuration if it is not locally modified.
  • 2012-06-06 04:24 - Kate hands off tracking to Martin.
  • 2012-06-06 04:32 - Chris uploads a reversion of the changes in grub 1.99-21ubuntu3.1 into -proposed

  • 2012-06-06 05:21 - Steve Langasek points out that this is not a bug from 1.99-21ubuntu3.1 in particular, but would have happened with any grub update. Either the AMI has a corrupted ucf database, or there is a bug in ucf; it might be related to the rather unusual invocation of ucf with --sum-file=/usr/share/grub/default/grub.md5sum

  • 2012-06-06 05:30 - Martin removes grub 1.99-21ubuntu3.2 source from precise-proposed, and the built binaries from the accepted queue (i.e. the binaries of 3.2 never got published)
  • 2012-06-06 05:43 - Neither Steve nor Chris can reproduce the prompt by manually modifying /etc/default/grub. They suspect that the AMI build process changes the ucf database.

  • 2012-06-06 05:58 - Steve sends updated analysis, reproducer, and workaround to the bug (https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1009294/comments/4)

  • 2012-06-06 06:20 - Steve confirms a bug in the AMI build scripts and points out how to fix them. (https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1009294/comments/5)

  • 2012-06-06 08:21 - Dave Walker comments that he doesn't believe this is a new incident, and that it is already tracked via LP #759545.

  • 2012-06-06 08:23 - Martin hands off tracking to Dave Walker.
  • 2012-06-06 10:06 - Dave determines that the image building process should do something to update ucf.
  • 2012-06-06 10:38 - Dave discovers that the potential duplicate is in fact a different (but similar) issue, although those affected consider it to be the long-standing tracking bug.
  • 2012-06-06 13:21 - Discussions within the server team on how to resolve this.
  • 2012-06-06 21:54 - Discussions between Steve and Dave determine that running "DEBIAN_FRONTEND=noninteractive dpkg-reconfigure grub-pc" towards the end of the image build process would avoid this issue.
  • 2012-06-06 22:03 - Original reporter is asked to try this command on an instance prior to upgrade, for cross-validation.
  • 2012-06-06 22:45 - Original reporter confirms that the workaround (and trial bug fix) resolves the issue. It is scheduled to be added to the next spin of images, pending further testing.
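
The two remedies recorded in the timeline can be sketched as shell functions. This is a minimal sketch, not the script used by the image build: the function names and the APT_GET / DPKG_RECONFIGURE override variables are hypothetical and exist only so the commands can be dry-run with a stub. Setting the ucf variables to "1" assumes that any non-empty value is enough for ucf to honour them.

```shell
#!/bin/sh
# Sketch only: function names and the APT_GET / DPKG_RECONFIGURE overrides
# are hypothetical helpers, not part of any Ubuntu tooling.

# User-side workaround (03:18 entry): run the upgrade with one of the ucf
# override variables set, so ucf keeps the old file or installs the new one
# instead of prompting.
# Assumption: a value of "1" (any non-empty value) is sufficient.
upgrade_without_grub_prompt() {
    case "${1:-old}" in
        old) UCF_FORCE_CONFFOLD=1 "${APT_GET:-apt-get}" -y dist-upgrade ;;
        new) UCF_FORCE_CONFFNEW=1 "${APT_GET:-apt-get}" -y dist-upgrade ;;
        *)   echo "usage: upgrade_without_grub_prompt [old|new]" >&2
             return 2 ;;
    esac
}

# Image-build fix (21:54 entry): reconfigure grub-pc non-interactively
# towards the end of the image build.
reconfigure_grub_pc() {
    DEBIAN_FRONTEND=noninteractive "${DPKG_RECONFIGURE:-dpkg-reconfigure}" grub-pc
}
```

On an affected instance, "upgrade_without_grub_prompt old" (run as root) would keep the existing /etc/default/grub, while an image build script would call "reconfigure_grub_pc" as one of its last steps.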

Successes

  • Caught early.
  • Highlighted wider issues in the escalation process.
  • As this isn't really an SRU regression, the IncidentReport wasn't strictly required; however, it exposed some issues with the current process, which proved useful. Acting on the recommendations below will allow ongoing quality to improve.

Problems

  • ubottu notify list for regression-alert is out of date
  • This was a known issue tracked throughout the Precise cycle, but was being tracked under an inaccurate bug description causing confusion.
  • The person assigned to work on it could not reproduce the issue as described. Frequent discussions were attempted but didn't prove fruitful; the suspected causes are poor communication of the bug and a lack of manpower.
  • The original bug had an importance of Medium, which is accurate for the package in question (https://wiki.ubuntu.com/Bugs/Importance) but too low for its impact on the Server flavour; as a result it was treated with less importance than it warranted.

Recommendations

  • Determine who should be highlighted with the !regression-alert factoid.
  • Raise the importance of the original bug.
  • Dedicated discussions between the Server and Foundations teams to discuss and escalate issues that are pertinent to release.
  • Improved validation of long-standing bugs to ensure that the description remains accurate.

IncidentReports/2012-06-05-grub-in-cloud-images (last edited 2012-06-06 22:50:24 by the)