Revision 5 as of 2009-11-27 14:05:57

Summary



Ubuntu Cloud images are generic "fresh install" disk images that are started up on various clouds and configured at boot time. Previous releases allowed customization only via a user-provided script, which ran at S99 (similar to the traditional rc.local).

Instead of requiring the user to write a script to customize the image, we would like to provide a configuration file syntax to support common customization, such as installing additional packages, installing all updates ...

Release Note

Ubuntu Cloud Images can now be configured via a human-readable config file syntax. Previously, customization could only be done via user-provided scripts.

Rationale


Previous Ubuntu releases only supported customization via a script file running late in the boot process. This had many problems:

  • writing shell scripts is complex, easy to screw up
  • only 16k of user data can be passed in an EC2 user-data script
  • script based customization means that the script-provider must maintain their scripts themselves.

By providing a Config syntax with a limited number of options, we can make sure that a given config file behaves the same on different versions of the OS. For some users the config syntax may provide all the customization they need.

User stories

  • As a system administrator, I want to deploy an AMI based on the latest Ubuntu Server Edition in EC2. I run the example commands provided, and my instance is started with the latest Ubuntu updates automatically applied. I can confidently allow public access to the instance, knowing that it is always up to date, and I do not accrue any data transfer charges on account of the update process.
  • As a casual user of EC2, I would like to easily make small changes to my instance on its first boot, without having to know the details of how the boot process works. I'd like to install some extra packages from the archive and some from my own PPA.

Design



On first boot, the boot hooks implementation (server-lucid-ec2-boothooks, ServerLucidCloudBoothooks) will pass any data that is identified as configuration data to a config parser.

The following is a list of configuration items that we would like to support:

  • update the instance (apt-get upgrade)
  • install additional packages (apt-get install)
  • add repositories (with shortcuts for a PPA) and install packages from these repositories
  • mount EBS volumes at specified location
  • configure ephemeral storage usage
    • RAID 0 or LVM to give improved speed and size
    • define mount locations (/dev/sda2 on /var/log, /dev/sdb on /mnt ...)
  • 'runurl' support
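
As a rough illustration of how the package-related items above might translate into actions, the sketch below builds the command lines for a set of options. The function name build_apt_commands and its parameters are assumptions made for this example and are not defined by this specification; the commands are only constructed, never executed.

```python
def build_apt_commands(upgrade=False, packages=(), ppas=()):
    """Return argv lists for the requested customizations, in run order.

    All parameter names are hypothetical; the spec does not define them.
    """
    commands = []
    # Repositories are added first so the index refresh below sees them.
    for ppa in ppas:
        commands.append(["add-apt-repository", ppa])
    # Refresh the package index only if something will actually use it.
    if upgrade or packages or ppas:
        commands.append(["apt-get", "update"])
    if upgrade:
        commands.append(["apt-get", "--assume-yes", "upgrade"])
    if packages:
        commands.append(["apt-get", "--assume-yes", "install", *packages])
    return commands


for argv in build_apt_commands(upgrade=True, packages=["apache2"]):
    print(" ".join(argv))
```

Returning argv lists instead of running the commands keeps the translation step easy to test separately from the boot environment.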


Implementation

This config syntax parser will be implemented as a plugin to the generic boot hooks implementation. It will be installed as part of the image.

UI Changes

  • Newly developed programs under the server-lucid-xc2 (ServerLuciXc2) spec may expose supported config options as command line options when creating an instance.

Code Changes

A plugin to the boot hooks implementation will be implemented. It will be passed data that is determined to be intended for the plugin.

This new plugin will need to parse the configuration and act on the supported configuration types. Most likely:

  • the plugin will be implemented in Python
  • the config file format will be read by Python's ConfigParser
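
To make the ConfigParser idea concrete, here is a minimal sketch of reading an INI-style config. The section names, key names, and sample values are purely illustrative assumptions, since this specification does not yet define a syntax. The example uses the Python 3 module name, configparser.

```python
import configparser

# Hypothetical config text: none of these sections or keys are defined
# by the specification; they only illustrate the INI-style format.
SAMPLE_CONFIG = """\
[apt]
upgrade = true
packages = apache2 postgresql
ppa = ppa:example-user/example-archive

[mounts]
ephemeral0 = /mnt
"""


def parse_cloud_config(text):
    """Parse INI-style text into a plain dict of {section: {key: value}}."""
    # Interpolation is disabled so that '%' in values is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names case-sensitive instead of lower-casing them.
    parser.optionxform = str
    parser.read_string(text)
    return {name: dict(parser.items(name)) for name in parser.sections()}


config = parse_cloud_config(SAMPLE_CONFIG)
print(config["apt"]["packages"].split())
```

All values come back as strings, so a real plugin would still need to decide how to interpret booleans and lists.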


Migration

These changes will be made in a backwards-compatible manner. Users who were creating scripts to customize their images and passing those scripts to EC2 in user-data will see no changes.

Documentation will be provided on the wiki to point users to the simpler, more user-friendly configuration file syntax.

Test/Demo Plan


This need not be added or completed until the specification is nearing beta.

Unresolved issues

This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.

BoF agenda and discussion

UDS discussion notes

The current method by which users can configure UEC/EC2 images is via 'user-data'. User-data that begins with #! is executed by the appropriate interpreter at S99 (late in the boot sequence). This is a very effective method for customization, but one that requires a fair amount of expertise to use.
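
The "#!" rule described above suggests one way to keep existing scripts working while also accepting configuration data. The following is a minimal sketch under one assumption not settled by these notes: any non-empty user-data that does not start with "#!" is treated as configuration. The function name classify_user_data is hypothetical.

```python
def classify_user_data(data):
    """Classify raw user-data bytes as 'empty', 'script', or 'config'."""
    if not data.strip():
        return "empty"
    # Existing behaviour: a leading "#!" marks an executable script.
    if data.startswith(b"#!"):
        return "script"
    # Assumption for this sketch: everything else is configuration data.
    return "config"


print(classify_user_data(b"#!/bin/sh\necho hello\n"))
print(classify_user_data(b"[apt]\nupgrade = true\n"))
```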

It would be nice to provide a config-file-like syntax that would allow specifying:

  • what mirrors to use / what additional repos to add
  • whether or not updates are installed on first boot
  • ssh private keys to use (allowing the user to dictate them, rather than polling for random keys from ec2-console)
  • extra packages to install

Note: this differs from server-lucid-ec2-boothooks as this would provide only for specific/static function. [1] would provide a hook into the images that could allow for building this blueprint.

Problems with the status quo

  • writing shell scripts is complex, easy to screw up -- simple things should be simple
  • only 16k of user data can be passed
  • maintenance issues

Use cases

  • package installation (Ubuntu repositories) *simple*
    • including tasks
    • add specific apt repositories
  • package installation (custom repositories) *simple*
  • install latest updates *simple*
  • Pass more than 16k of user data
    • landscape (registration data)
    • puppet (bootstrap (certificate/private key, autoregistration))
    • rightscale (rightscripts)
    • other config mgmt system: cfengine, bcfg, spline, capistrano
  • Advertise specific features [don't know]
  • dynamic DNS
  • EBS mounting and snapshots *simple*
    • udev script mounting of volume based on metadata in the volume
  • ephemeral storage mounting
    • RAID *simple*
  • asynchronous notification that the instance is up and running:
    • email
    • jabber/XMPP
    • simpledb, sqs
    • rabbitmq (message queue)
    • submission to a url *simple-low priority*
  • run custom scripts
    • at what point in the boot process?
      • pass an upstart job
  • pass any type of credentials through user-data (AWS, certificates, keytabs, ssh keys)
    • security issues
    • all data should be kept as safe as possible
      • config option to set perms on said data
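
One possible reading of "set perms on said data" is that credentials are written to disk with a restrictive mode from the moment the file is created. The sketch below illustrates that idea; the function name write_secret, the default mode, and the file name are assumptions for this example only.

```python
import os
import stat
import tempfile


def write_secret(path, data, mode=0o600):
    """Write bytes to path, creating the file with restrictive permissions."""
    # Passing the mode to os.open means a new file is never briefly
    # created with wider permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    with os.fdopen(fd, "wb") as handle:
        # fchmod enforces the mode even if the file already existed
        # or the process umask would have altered it.
        os.fchmod(handle.fileno(), mode)
        handle.write(data)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.key")
    write_secret(target, b"not-a-real-key\n")
    print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```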


  • as early as possible: parses various sources of user data and acts on it

Implementation ideas

  • share syntax, semantics with d-i preseeding
  • use puppet - not a generic solution to the issue, requires infrastructure
  • Simple key/value pairs format, read by plugins
  • Core features: bootstrapping process -- package install, repository add
  • Section / key-value pair format where each section would get read by a specific plugin
  • Credentials transfer: use S3-backend "safe" storage
  • store scripts, data in S3, access via URL
    • runurl
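
The "section per plugin" idea above could look roughly like the following sketch, in which each section of an already-parsed config is handed to the plugin registered for it. The section names, handler functions, and return values are all hypothetical and serve only to show the dispatch pattern.

```python
def handle_apt(options):
    """Hypothetical plugin for an 'apt' section."""
    return "apt: install " + options.get("packages", "")


def handle_mounts(options):
    """Hypothetical plugin for a 'mounts' section."""
    pairs = sorted(options.items())
    return "mounts: " + ", ".join(f"{dev} on {path}" for dev, path in pairs)


# Registry mapping section names to the plugin that reads them.
SECTION_PLUGINS = {"apt": handle_apt, "mounts": handle_mounts}


def dispatch(config):
    """Send each section to its plugin; report sections nobody handles."""
    handled, skipped = [], []
    for section, options in config.items():
        plugin = SECTION_PLUGINS.get(section)
        if plugin is None:
            skipped.append(section)
        else:
            handled.append(plugin(options))
    return handled, skipped


print(dispatch({"apt": {"packages": "apache2"}, "dns": {}}))
```

Reporting unknown sections, instead of failing on them, is one way a config file could keep working across images with different sets of plugins.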