Revision 8 as of 2009-12-21 16:20:50


Ubuntu Cloud images are generic "fresh install" disk images that are started on various clouds and configured at boot time. Previous releases allowed customization only via a user-provided script, which ran at S99 (similar to the traditional rc.local).

Instead of requiring the user to write a script to customize the image, we would like to provide a configuration file syntax that supports common customizations, such as installing additional packages or installing all updates.

Release Note

Ubuntu Cloud Images can now be configured via a human-readable config file syntax. Previously, customization could only be done via user-provided scripts.


Previous Ubuntu releases only supported customization via a script file running late in the boot process. This had many problems:

  • writing shell scripts is complex and error-prone
  • only 16k of user data can be passed in an EC2 user-data script
  • script-based customization means that script providers must maintain their scripts themselves.

By providing a Config syntax with a limited number of options, we can make sure that a given config file behaves the same on different versions of the OS. For some users the config syntax may provide all the customization they need.

User stories

  • As a system administrator, I want to deploy an AMI based on the latest Ubuntu Server Edition in EC2. I run the example commands provided, and my instance is started with the latest Ubuntu updates automatically applied. I can confidently allow public access to the instance, knowing that it is always up to date, and I do not accrue any data transfer charges on account of the update process.
  • As a casual user of EC2, I would like to easily make small changes to my instance on its first boot, without having to know the details of how the boot process works. I'd like to install some extra packages from the archive and some from my own PPA.



On first boot, the boot hooks implementation (server-lucid-ec2-boothooks, ServerLucidCloudBoothooks) dumps the user data to a local file and then emits a user-data-yaml-config event. Upstart jobs then read that file to configure the running instance.
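The first-boot flow described above could be sketched roughly as follows. This is only an illustration, not the boot hooks code: the function names, the `path` parameter, and the injectable `emit` callable are assumptions made for this example. Only the event name user-data-yaml-config comes from this spec. On a real system, `emit` would wrap Upstart's `initctl emit` command.

```python
import subprocess


def initctl_emit(event):
    """Emit an Upstart event (only works on a system running Upstart)."""
    subprocess.check_call(["initctl", "emit", event])


def handle_user_data(user_data, path, emit=initctl_emit):
    """Dump user data to a local file, then announce it with an event.

    `path` and `emit` are hypothetical parameters for this sketch.
    """
    # Step 1: dump the raw user data (bytes) to a local file.
    with open(path, "wb") as fh:
        fh.write(user_data)
    # Step 2: emit the event that the Upstart jobs are waiting for.
    emit("user-data-yaml-config")
    return path
```

Passing a different `emit` callable (for example a list's `append` method) makes the flow easy to exercise without Upstart.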

The following is a list of configuration items that we would like to support:

  • update the instance (apt-get upgrade)
  • add additional packages (apt-get install)
  • add repositories (with a shortcut for PPAs) and install packages from these repositories
  • mount EBS volumes at specified location
  • configure ephemeral storage usage
    • RAID 0 or LVM to give improved speed and size
    • define mount locations (/dev/sda2 on /var/log, /dev/sdb on /mnt ...)
  • 'runurl' support

Upstart jobs

Jobs depend on at least the user-data-yaml-config event:

 start on user-data-yaml-config

The following upstart jobs are available:

  • apt_conf: configures apt
  • pkg_install: installs packages after apt_conf

Sample YAML configuration file

# Update apt database on first boot
# (ie run apt-get update)
# Default: true
apt_update: false

# Upgrade the instance on first boot
# (ie run apt-get upgrade)
# Default: false
apt_upgrade: true

# Add apt repositories
# Default: none

 # PPA shortcut:
 #  * Setup correct apt sources.list line
 #  * Import the signing key from LP
 #  See for more information
 - source: "ppa:user/ppa"    # Quote the string

 # Custom apt repository:
 #  * Creates a file in /etc/apt/sources.list.d/ for the sources list entry
 #  * [optional] Import the apt signing key from the keyserver 
 #  * Defaults:
 #    + keyserver:
 #    + filename: 00-boot-sources.list
 #    See sources.list man page for more information about the format
 - source: "deb lucid main restricted" # Quote the string
   keyid: 12345678 # GPG key ID published on a key server

 # Custom apt repository:
 #  * The apt signing key can also be specified 
 #    by providing a pgp public key block
 #  The apt repository will be added to the default sources.list file:
 #  /etc/apt/sources.list.d/00-boot-sources.list
 - source: "deb ./" # Quote the string
   key: | # The value needs to start with -----BEGIN PGP PUBLIC KEY BLOCK-----
      Version: SKS 1.0.10

      -----END PGP PUBLIC KEY BLOCK-----

# Add apt configuration files
#  Add an apt.conf.d/ file with the relevant content
#  See apt.conf man page for more information.
#  Defaults:
#   + filename: 00-boot-conf

 # Creates an apt proxy configuration in /etc/apt/apt.conf.d/01-proxy
 - filename: "01-proxy"
   content: |
    Acquire::http::Proxy "";

 # Add the following line to /etc/apt/apt.conf.d/00-boot-conf
 #  (run debconf at a critical priority)
 - content: |
    DPkg::Pre-Install-Pkgs:: "/usr/sbin/dpkg-preconfigure --apt -p critical || true";

# Provide debconf answers
# See debconf-set-selections man page.
# Default: none
debconf_selections: |     # Need to preserve newlines
        # Force debconf priority to critical.
        debconf debconf/priority select critical

        # Override default frontend to readline, but allow user to select.
        debconf debconf/frontend select readline
        debconf debconf/frontend seen false

# Install additional packages on first boot
# Default: none
 - openssh-server
 - postfix
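
To make the intended semantics of the sample more concrete, here is a rough Python sketch that maps already-parsed settings to the apt actions they imply. It is an illustration under stated assumptions, not the planned implementation: the function names are invented, the package list is passed as a separate argument because the sample above does not name its key, and the PPA archive URL layout is the conventional Launchpad one rather than something this spec defines. The defaults for apt_update (true) and apt_upgrade (false) are taken from the sample comments.

```python
def ppa_to_sources_line(source, release):
    """Expand a "ppa:user/name" shortcut into a sources.list line.

    Assumption: PPAs are served from ppa.launchpad.net/<user>/<name>/ubuntu.
    Anything that is not a PPA shortcut is returned unchanged.
    """
    if not source.startswith("ppa:"):
        return source  # already a full "deb ..." line
    owner, _, name = source[len("ppa:"):].partition("/")
    if not owner or not name:
        raise ValueError("expected ppa:<user>/<name>, got %r" % source)
    return "deb http://ppa.launchpad.net/%s/%s/ubuntu %s main" % (
        owner, name, release)


def plan_apt_commands(config, packages=()):
    """Return the apt-get command lines implied by the parsed config."""
    commands = []
    if config.get("apt_update", True):       # Default: true
        commands.append(["apt-get", "update"])
    if config.get("apt_upgrade", False):     # Default: false
        commands.append(["apt-get", "--assume-yes", "upgrade"])
    if packages:
        commands.append(
            ["apt-get", "--assume-yes", "install"] + list(packages))
    return commands
```

Returning the commands as lists instead of running them keeps the sketch side-effect free; a caller could hand each list to `subprocess.check_call`.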


This config syntax parser will be implemented as a plugin to the generic boot hooks implementation. It will be installed as part of the image.

UI Changes

  • Newly developed programs under the server-lucid-xc2 (ServerLucidXc2) spec may expose supported config options as command-line options when creating an instance.

Code Changes

A plugin for the boot hooks implementation will be written. The boot hooks will pass it the user data that is determined to be intended for it.

This new plugin will need to parse the configuration and act on the supported configuration types. Most likely:

  • the plugin will be implemented in Python
  • the config file format will be read by Python's ConfigParser
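
As a rough illustration of the ConfigParser option, the sketch below reads a section/key-value file with Python's standard configparser module (named ConfigParser in Python 2). The section and option names (`apt`, `update`, `upgrade`, `packages`, `install`) and the function `read_config` are invented for this example and are not part of the spec.

```python
import configparser

# Hypothetical INI-style equivalent of a small cloud config.
SAMPLE = """\
[apt]
update = false
upgrade = true

[packages]
install = openssh-server postfix
"""


def read_config(text):
    """Parse INI-style text into a plain dict of settings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        # Fallbacks apply when the section or option is missing.
        "apt_update": parser.getboolean("apt", "update", fallback=True),
        "apt_upgrade": parser.getboolean("apt", "upgrade", fallback=False),
        "packages": parser.get("packages", "install", fallback="").split(),
    }
```

Calling `read_config(SAMPLE)` yields a dict with the update flag off, the upgrade flag on, and the two package names as a list.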


These changes will be made in a backwards-compatible manner. Users who were creating scripts to customize their images and passing those scripts to EC2 in user-data will see no changes.

Documentation will be provided on the wiki to point users to the simpler and more user-friendly configuration file syntax.

Test/Demo Plan


Unresolved issues


BoF agenda and discussion

UDS discussion notes

The current method by which users can configure UEC/EC2 images is via 'user-data'. User-data that begins with #! is executed by the appropriate interpreter at the S99 runlevel. This is a very effective method for customization, but one that requires a fair amount of expertise to use.
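
The "#!" convention makes it straightforward to keep existing scripts working while routing everything else to a config parser. A minimal sketch, assuming user data arrives as bytes; the function names and the "script"/"config" labels are hypothetical:

```python
def classify_user_data(user_data):
    """Return "script" for data starting with "#!", otherwise "config"."""
    if user_data.startswith(b"#!"):
        return "script"
    return "config"


def interpreter_of(user_data):
    """Return the interpreter line of a "#!" script, without the "#!"."""
    if not user_data.startswith(b"#!"):
        raise ValueError("user data does not start with #!")
    first_line = user_data.split(b"\n", 1)[0]
    return first_line[2:].strip().decode("utf-8")
```

For example, data beginning with `#!/bin/sh` would be classified as a script with interpreter `/bin/sh`, while a config document would fall through to the config path.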

It would be nice to provide a config-file-like syntax that would allow the user to specify:

  • which mirrors to use and which additional repos to add
  • whether or not updates are installed on first boot
  • ssh private keys to use (allowing the user to dictate them, rather than polling for random keys from ec2-console)
  • extra packages to install

Note: this differs from server-lucid-ec2-boothooks in that it would provide only specific, static functions. [1] would provide a hook into the images on which this blueprint could be built.

Problems with the status quo

  • writing shell scripts is complex, easy to screw up -- simple things should be simple
  • only 16k of user data can be passed
  • maintenance issues

Use cases

  • package installation (Ubuntu repositories) *simple*
    • including tasks
    • add specific apt repositories
  • package installation (custom repositories) *simple*
  • install latest updates *simple*
  • Pass more than 16k of user data
    • landscape (registration data)
    • puppet (bootstrap (certificate/private key, autoregistration))
    • rightscale (rightscripts)
    • other config mgmt system: cfengine, bcfg, spline, capistrano
  • Advertise specific features [don't know]
  • dynamic DNS
  • EBS mounting and snapshots *simple*
    • udev script mounting of volume based on metadata in the volume
  • ephemeral storage mounting
    • RAID *simple*
  • asynchronous notification that the instance is up and running:
    • email
    • jabber/XMPP
    • simpledb, sqs
    • rabbitmq (message queue)
    • submission to a url *simple-low priority*
  • run custom scripts
    • at what point in the boot process?
      • pass an upstart job
  • pass any type of credentials through user-data (AWS, certificates, keytabs, ssh keys)
    • security issues
    • all data should be kept as safe as possible
      • config option to set perms on said data
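
For the permissions point above, one way to keep credential data as safe as possible is to create the file with restrictive permissions before any content is written. The following is a minimal sketch for POSIX systems; the function name `write_secret` and the default mode of 0600 are assumptions, and a real config option would supply the mode.

```python
import os


def write_secret(path, data, mode=0o600):
    """Write bytes to `path` so that the file ends up with exactly `mode`."""
    # Create the file with the restrictive mode from the start, so the
    # content is never readable by other users, even briefly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    with os.fdopen(fd, "wb") as fh:
        # Enforce the mode explicitly: the umask may have altered it, and
        # a pre-existing file keeps its old permissions on open.
        os.fchmod(fh.fileno(), mode)
        fh.write(data)
    return path
```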


  • as early as possible, parses the various sources of user data
  • acts on it

Implementation ideas

  • share syntax, semantics with d-i preseeding
  • use puppet - not a generic solution to the issue, requires infrastructure
  • Simple key/value pairs format, read by plugins
  • Core features: bootstrapping process -- package install, repository add
  • Section / key-value pair format where each section would get read by a specific plugin
  • Credentials transfer: use S3-backend "safe" storage
  • store scripts, data in S3, access via URL
    • runurl