Revision 1 as of 2009-11-20 16:37:30

The current method by which users can configure UEC/EC2 images is via 'user-data'. User-data that begins with #! is executed by the appropriate interpreter at S99 in the runlevel sequence. This is a very effective method for customization, but one that requires a fair amount of expertise to use.

It would be nice to provide a config-file-like syntax that would allow specifying:

  • what mirrors to use / what additional repos to add
  • whether or not updates are installed on first boot
  • ssh private keys to use (allowing the user to dictate them, rather than polling for random keys from ec2-console)
  • extra packages to install
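
As a rough illustration of the options listed above, a flat key/value syntax could be parsed as in the sketch below. The key names (mirror, extra_repos, update_on_first_boot, packages), the example values and both function names are assumptions made for this example only; nothing here is an agreed format. SSH keys are left out because multi-line values would need a quoting rule that has not been discussed.

```python
# Hypothetical user-data in a flat "key = value" syntax (illustrative only).
EXAMPLE_USER_DATA = """\
mirror = http://archive.ubuntu.com/ubuntu
# additional repositories, separated by whitespace
extra_repos = ppa:example/ppa
update_on_first_boot = yes
packages = apache2 postgresql
"""


def parse_key_values(text: str) -> dict:
    """Split 'key = value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"expected 'key = value', got: {raw_line!r}")
        settings[key.strip()] = value.strip()
    return settings


def interpret(settings: dict) -> dict:
    """Turn the raw strings into typed options, with conservative defaults."""
    truthy = ("yes", "true", "1")
    return {
        "mirror": settings.get("mirror", ""),
        "extra_repos": settings.get("extra_repos", "").split(),
        "update_on_first_boot":
            settings.get("update_on_first_boot", "no").lower() in truthy,
        "packages": settings.get("packages", "").split(),
    }


if __name__ == "__main__":
    print(interpret(parse_key_values(EXAMPLE_USER_DATA)))
```

Defaults are deliberately conservative here: an option that is absent changes nothing on the instance.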

Note: this differs from server-lucid-ec2-boothooks [1] in that this blueprint would provide only specific, static functions, whereas [1] would provide a hook into the images on which this blueprint could be built.


Problems with the status quo

  • writing shell scripts is complex, easy to screw up -- simple things should be simple
  • only 64k of user data can be passed
  • maintenance issues

Use cases

  • package installation (Ubuntu repositories) *simple*
    • including tasks
    • add specific apt repositories
  • package installation (custom repositories) *simple*
  • install latest updates *simple*
  • Pass more than 64k of user data
    • landscape (registration data)
    • puppet (bootstrap (certificate/private key, autoregistration))
    • rightscale (rightscripts)
    • other config mgmt system: cfengine, bcfg, spline, capistrano
  • Advertise specific features [don't know]
  • dynamic DNS
  • EBS mounting and snapshots *simple*
    • udev script mounting of volume based on metadata in the volume
  • ephemeral storage mounting
    • RAID *simple*
  • asynchronous notification that the instance is up and running:
    • email
    • jabber/XMPP
    • simpledb, sqs
    • rabbitmq (message queue)
    • submission to a url *simple-low priority*
  • run custom scripts
    • at what point in the boot process?
      • pass an upstart job
  • pass any type of credentials through user-data (AWS, certificates, keytabs, ssh keys)
    • security issues
    • all data should be kept as safe as possible
      • config option to set perms on said data


  • as early as possible: parse the various sources of user data, then act on them
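
For the "config option to set perms on said data" item above, one possible approach is to create the file with its final mode already applied, so that a credential is never readable by other users, even briefly. This is only a sketch: write_secret and its default mode of 0600 are hypothetical and not part of any existing tool, and it assumes a POSIX system.

```python
import os
import stat
import tempfile


def write_secret(path: str, data: bytes, mode: int = 0o600) -> None:
    """Write data to a new file at path that is created with restrictive permissions."""
    # O_EXCL refuses to reuse an existing file, and the mode is applied at
    # creation time, so there is no window with looser permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    with os.fdopen(fd, "wb") as handle:
        # The umask can only remove bits; fchmod makes the final mode explicit.
        os.fchmod(handle.fileno(), mode)
        handle.write(data)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "credentials")
        write_secret(target, b"placeholder-secret")
        print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```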

Implementation ideas

  • share syntax, semantics with d-i preseeding
  • use puppet - not a generic solution to the issue, requires infrastructure
  • Simple key/value pairs format, read by plugins
  • Core features: bootstrapping process -- package install, repository add
  • Section / key-value pair format where each section would get read by a specific plugin
  • Credentials transfer: use S3-backend "safe" storage
  • store scripts, data in S3, access via URL
    • runurl
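
The "section / key-value pair format where each section would get read by a specific plugin" idea above could look roughly like the sketch below. It uses Python's standard configparser module for the section syntax. The section names, option names and action tuples are invented for illustration, and each plugin only returns a description of what it would do instead of touching the system.

```python
import configparser
from typing import Callable, Dict, List, Mapping, Tuple

Action = Tuple[str, str]
Plugin = Callable[[Mapping[str, str]], List[Action]]

# Registry mapping a section name to the plugin that reads it.
PLUGINS: Dict[str, Plugin] = {}


def plugin(section: str) -> Callable[[Plugin], Plugin]:
    """Decorator that registers a function as the reader for one section."""
    def register(func: Plugin) -> Plugin:
        PLUGINS[section] = func
        return func
    return register


@plugin("repositories")
def repositories_plugin(options: Mapping[str, str]) -> List[Action]:
    # Hypothetical option name "add": whitespace-separated repositories.
    return [("add-repository", repo) for repo in options.get("add", "").split()]


@plugin("packages")
def packages_plugin(options: Mapping[str, str]) -> List[Action]:
    # Hypothetical option name "install": whitespace-separated package names.
    return [("install-package", name) for name in options.get("install", "").split()]


def plan(user_data: str) -> List[Action]:
    """Hand each section to its plugin, in file order, and collect the actions."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(user_data)
    actions: List[Action] = []
    for section in parser.sections():
        handler = PLUGINS.get(section)
        if handler is None:
            # Unknown sections are recorded and skipped, not treated as fatal.
            actions.append(("skip-section", section))
            continue
        actions.extend(handler(parser[section]))
    return actions


if __name__ == "__main__":
    example = "[repositories]\nadd = ppa:example/ppa\n\n[packages]\ninstall = apache2 puppet\n"
    for action in plan(example):
        print(action)
```

Keeping the core to a parser plus a registry means that features such as notification or EBS mounting could be added as further plugins without changing the format itself.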