About EC2
EC2 is Amazon's cloud compute service, built on a Xen dom0 environment. It uses a modified, early version of Xen. For local testing, the closest environment is CentOS 5.0.
Preparations
Signing up
You need an Amazon account in order to use Amazon EC2 and related services. Go to the Amazon Web Services page, click sign up, and follow the prompts. They will want your credit card (yes, your own personal card, for billing) and will require a valid phone number and email address. The phone number is verified either by a call back (a robot calls you and asks for the 4 digit PIN currently on your screen) or by a text message containing a PIN that you must enter to complete the sign up.
Security Groups -- firewalling for your instances
Before you start your first instance you will need to set up your security group. The default security group does not allow your new instances to accept SSH connections. In the main web interface https://console.aws.amazon.com/ec2/home select Security Groups, then tick the tickbox next to the default group. This displays a new pane at the bottom of the window (often so small that it just looks like a title); pull the separator bar up to see which ports are allowed. Select 'SSH' and 'TCP' in the bottom row and click Add. All new instances that use the default group will now have SSH enabled.
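The same rule can also be added without the web console. The following is only a sketch of one way to do it from Python. It assumes the third-party boto3 library is installed and that credentials are already configured, neither of which is covered on this page. The helper names ssh_ingress_rule and allow_ssh and the cidr parameter are made up for illustration.

```python
def ssh_ingress_rule(group_name="default", cidr="0.0.0.0/0"):
    """Build the arguments for a rule that lets TCP port 22 (SSH) in.

    cidr is an assumption: 0.0.0.0/0 means "from anywhere"; narrow it
    to your own address range where possible.
    """
    return {
        "GroupName": group_name,
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "CidrIp": cidr,
    }


def allow_ssh(region, **kwargs):
    """Apply the rule in one region (security groups are per region)."""
    import boto3  # imported lazily so the helper above has no dependencies

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.authorize_security_group_ingress(**ssh_ingress_rule(**kwargs))
```

Because security groups are set up per region, allow_ssh would have to be called once for each region in which instances are started.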
How much will it Cost?
That varies with region, zone, instance size, number of instances, compute cycles used, time left running, amount of data transferred (and whether it is between regions, between zones, or to the internet in general), and ebs storage used. Basically, go read Amazon's documentation.
It isn't "cheap". It will only cost a few cents to boot and play around with an image, but it can add up fairly fast. The general rule of thumb is that real hardware is cheaper as long as it is being utilized. Where EC2 wins is flexibility: you can scale up as needed, which can be cheaper than purchasing and maintaining hardware for peak loads.
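The rule of thumb above can be turned into a rough break-even calculation. The sketch below is illustrative only: the prices are hypothetical placeholders, not Amazon's rates, and it ignores data transfer and storage charges entirely.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def cloud_monthly_cost(hourly_rate, hours_running, instances=1):
    """Cost of instances that are only billed while they run."""
    return hourly_rate * hours_running * instances


def break_even_utilization(hardware_monthly_cost, hourly_rate):
    """Fraction of the month above which owned hardware is cheaper."""
    return hardware_monthly_cost / (hourly_rate * HOURS_PER_MONTH)


# Hypothetical figures, NOT Amazon's prices: 0.10 per instance-hour in the
# cloud versus 40.00 per month to own and maintain a comparable machine.
print(cloud_monthly_cost(0.10, hours_running=100))
print(round(break_even_utilization(40.00, 0.10), 2))
```

With these made-up numbers, a machine that is busy for more than roughly half of the month would be cheaper to own, while a machine needed only for occasional peaks would be cheaper to rent.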
Basic Terminology
- paravirtualized (pv-ops) - A specially modified kernel that makes calls to the hypervisor for certain privileged operations and IO instead of relying on full virtualization. For the purposes of EC2 we use pv-ops to refer specifically to the newer upstream based pv-ops kernels, and Xen to refer to the older full Xen patchset based kernels.
- HVM - Hardware based virtual machine. Full virtualization, not paravirtualization.
- Instance - A running system image
- Spot instances - a cheaper way to run AMIs if you don't need them to run immediately. Basically you name the price and wait; Amazon starts your image when load drops low enough that they are willing to run it for that price.
- AMI - Amazon Machine Image. An operating system image (has a default AKI and ARI).
- AKI - Amazon Kernel Image. A registered kernel image; only select partners can register new AKIs.
- ARI - Amazon Ramdisk Image. A registered ram disk; only select partners can register new ARIs.
- Region - A geographic region containing several zones (transferring data between instances running in different regions is the most expensive type of transfer)
- Zone - A more local grouping of data centers. Transferring data between instances running in different zones is cheaper than between regions, but transferring data between instances in the same zone is the cheapest.
- ebs - elastic block storage. A storage service provided by Amazon.
- Key Pair - A key that is set up so you can securely interact with your instances. NOTE: these are set up per region.
- security group - A set of rules about what network traffic can make it through to your instance (think Amazon firewall, which is separate from any firewall that your image might run). NOTE: these are set up per region.
- user data - Data that can be specified by the user to "customize" an instance. Basically this data is made available by Amazon on a special web server that is NATed to your image, and when the image boots it pulls in the user data using wget, and then executes it.
- pv-on-HVM - a special set of paravirtualized drivers that have been designed to run on an HVM hypervisor. This is a hybrid of HVM and paravirtualized: IO is paravirtualized (for performance reasons) and everything else is HVM.
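To make the user data entry above more concrete, here is a minimal sketch of what an image might do at boot. It uses Python's urllib rather than wget. The address is the link-local instance metadata endpoint, which is only reachable from inside a running instance. Treating data that starts with "#!" as a script to execute is an assumption about the image's boot scripts, and the function names are made up for illustration.

```python
import urllib.request

# Link-local address of the instance metadata service; only reachable
# from inside a running instance.
USER_DATA_URL = "http://169.254.169.254/latest/user-data"


def fetch_user_data(url=USER_DATA_URL, timeout=5):
    """Return the raw user data bytes supplied at launch.

    urlopen raises an HTTPError if no user data was supplied.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def is_script(user_data):
    """True if the data starts with '#!' and so looks like a script."""
    return user_data.startswith(b"#!")
```

A boot script along these lines would call fetch_user_data(), and only write the result to a file and run it when is_script() returns True.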
About AMIs
AMIs are like ISO images that are booted to bring up a virtual machine in the cloud. They can be remixed to create new AMIs (several people take the base Ubuntu AMI and modify it).
There are two basic types of AMIs:
- instance store images - no permanent storage.
- ebs (elastic block storage) images - an AMI that is backed by Amazon's ebs so that it is persistent and can be shut down and rebooted, as long as the virtual machine is not terminated.
About the EC2 kernels
Amazon's services make use of both pv-ops kernels and HVM kernels with pv-on-HVM drivers. The HVM kernels are used by the Cluster Compute cloud, while pv-ops is used by the regular cloud.
There are two different types of Ubuntu paravirtualized kernels.
- Hardy through Lucid are based on the full Xen patchset.
- Maverick is based on a patched regular upstream pv-ops kernel.
A third kernel is needed for the compute cloud; this is based on a standard non-paravirtualized kernel plus the pv-on-HVM drivers.
Bootable kernel images
EC2 kernels that are being registered as an AKI will not work if they are bzImages. This is because the hypervisor that is used to load the kernel does not support bzImage. A bootable image can be made from a standard kernel build in either of two ways:
- start with the vmlinux and do: strip vmlinux ; gzip vmlinux
- start with the bzImage and strip off the boot header, leaving only the gzipped portion of the kernel.
Once the gzipped kernel image is obtained it can be bundled and uploaded.
Note: the -EC2 topic branch kernels are built correctly to be registered as an AKI.
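The second method above (stripping the boot header from a bzImage) can be sketched as a search for the gzip signature inside the image. This is only an illustration, and it assumes the kernel was built with gzip compression. The function name gzipped_portion is made up for this example.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip signature followed by the deflate method byte


def gzipped_portion(image):
    """Return the first complete gzip stream found inside image.

    Everything before the stream (the boot header) and anything after
    it is dropped, leaving only the gzipped part.
    """
    offset = image.find(GZIP_MAGIC)
    while offset != -1:
        inflater = zlib.decompressobj(wbits=31)  # 31 selects gzip framing
        try:
            inflater.decompress(image[offset:])
        except zlib.error:
            inflater = None  # the three bytes were a coincidence
        if inflater is not None and inflater.eof:
            end = len(image) - len(inflater.unused_data)
            return image[offset:end]
        offset = image.find(GZIP_MAGIC, offset + 1)
    raise ValueError("no complete gzip stream found")
```

In use, the bytes of the bzImage would be read from disk, passed through gzipped_portion(), and written out as the file to bundle. Decompressing a candidate stream before accepting it guards against the signature bytes appearing by chance inside the boot header.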
Kernel images for use within a pv-grub instance
If the kernel is to be used with an instance that has been set up for pv-grub (Maverick 10.10 and later) then a bzImage can be used. No stripping or bundling is required unless the kernel will be registered as an AKI.
Booting AMIs
When booting an AMI, Xen Dom0 acts as the boot loader, so the kernel (AKI) and ramdisk (ARI) are specified separately from the actual operating system (AMI). Each AMI has a default AKI and ARI that are specified when the AMI is registered. An AMI can be booted with a different kernel by specifying the AKI and ARI to use at boot.
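As an illustration of overriding the default kernel and ramdisk at boot, the sketch below builds the arguments for a launch request. It assumes the third-party boto3 library, which this page does not mention. The function names and the m1.small default instance type are assumptions chosen for the example.

```python
def run_request(ami, aki=None, ari=None, instance_type="m1.small"):
    """Build launch arguments, overriding the kernel/ramdisk only if given."""
    request = {
        "ImageId": ami,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
    }
    if aki:
        request["KernelId"] = aki    # otherwise the AMI's default AKI is used
    if ari:
        request["RamdiskId"] = ari   # otherwise the AMI's default ARI is used
    return request


def launch(region, **kwargs):
    """Start one instance in the given region using boto3."""
    import boto3  # imported lazily so run_request has no dependencies

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.run_instances(**run_request(**kwargs))
```

Leaving aki and ari unset produces a request that boots with the defaults registered for the AMI; supplying them is how a newly registered test kernel would be tried against an existing image.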
PV-Grub AMIs
Support for pv-grub images was added by Amazon during the Maverick development cycle, and all Maverick and later kernels should support it. The default AKI and ARI for these images are provided by Amazon. They boot into a pv-grub environment, which then looks at the AMI disk to find the real kernel.
It does this just as grub would. It:
- looks at /boot/grub/menu.lst to find the default kernel to load
- loads the default kernel and boots it via kexec
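The lookup described above can be sketched as a small parser for a grub legacy style menu.lst. This is a simplified illustration, not pv-grub's actual code: it only understands the default, title and kernel keywords, and the file names in the sample are hypothetical.

```python
import re

# keyword, then "=" or whitespace, then the rest of the line
DIRECTIVE = re.compile(r"(\w+)\s*[=\s]\s*(.*)")


def default_kernel(menu_text):
    """Return the kernel line arguments of the default menu entry."""
    default = 0    # entry 0 is booted when no default is set
    kernels = []   # kernel line of each title entry, in file order
    for raw in menu_text.splitlines():
        match = DIRECTIVE.match(raw.strip())
        if not match:
            continue  # blank lines and "#" comments do not match
        keyword, value = match.group(1), match.group(2).strip()
        if keyword == "title":
            kernels.append(None)  # a new boot entry starts here
        elif keyword == "kernel" and kernels:
            kernels[-1] = value
        elif keyword == "default" and not kernels and value.isdigit():
            default = int(value)
    return kernels[default]


SAMPLE = """\
default 1
timeout 0

title  old kernel
root   (hd0)
kernel /boot/vmlinuz-old root=/dev/sda1 ro

title  new kernel
root   (hd0)
kernel /boot/vmlinuz-new root=/dev/sda1 ro
initrd /boot/initrd.img-new
"""

print(default_kernel(SAMPLE))
```

Entries are numbered from zero in file order, so "default 1" in the sample selects the second title block. A default that points past the last entry raises an IndexError in this sketch.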
In Maverick both instance store and ebs backed AMIs are set up to use pv-grub. Choosing an ebs image allows you to persist your instance and to upgrade kernels much as on a regular machine.
Testing Kernels
To test new kernels pre-Maverick you must bundle, upload and register the kernel before it can be tested.
- On Maverick (pv-grub), using an ebs backed AMI, you can start an instance, install a kernel and reboot into it as you would on a regular machine. Once done testing this way you should do a regular bundle and upload to ensure the kernel can work as a default AKI.
- pv-grub does not allow console interaction, so it only boots the default kernel. There is a way to specify a fallback kernel for when the default kernel fails. However, if the default kernel boots part way and then dies, the fallback will not be executed.
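In grub legacy syntax a fallback entry is normally named with a fallback line next to the default line in menu.lst. Whether a given pv-grub setup honours it is an assumption here. Since there is no console to recover with, it can be worth checking the file before rebooting. The sketch below, with made-up function names, reads both settings and reports whether a distinct fallback is configured.

```python
import re


def boot_order(menu_text):
    """Return (default, fallback) entry numbers; fallback is None if unset."""
    settings = {"default": 0, "fallback": None}
    for line in menu_text.splitlines():
        match = re.match(r"\s*(default|fallback)[=\s]+(\d+)", line)
        if match:
            settings[match.group(1)] = int(match.group(2))
        elif line.strip().startswith("title"):
            break  # global settings end at the first boot entry
    return settings["default"], settings["fallback"]


def has_usable_fallback(menu_text):
    """True if a fallback entry is set and differs from the default."""
    default, fallback = boot_order(menu_text)
    return fallback is not None and fallback != default
```

For example, a file with "default 0" and "fallback 1" passes the check, while a file with no fallback line, or one whose fallback points at the same entry as the default, does not.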
Using the Web interface
Using the AMI tools
Useful links
Working with the EC2 topic branch
Getting the EC2 topic branch
Get a copy of the Ubuntu kernel git tree.
- git clone git://kernel.ubuntu.com/ubuntu/ubuntu-lucid.git
Check out the ec2 branch
- git checkout --track origin/ec2 -b ec2
The branch will have a debian.ec2 directory which is used for building the kernel.
Building the EC2 kernel
The EC2 kernel is built using the same infrastructure as the standard Ubuntu kernel, except that the ec2 name is used in the build target (fdr here is the usual shorthand for fakeroot debian/rules).
- fdr binary-ec2
Uploading a test kernel
- Install the kernel to test on a test machine or into a chroot environment, to gain access to the kernel image and initrd.
- Bundle and upload the kernel and initrd that were installed.
Getting the Xen patchset
The ec2 topic branch is based on the Xen patchset that SuSE is using. It is currently pulled directly from the SuSE kernel of the day.
Visit http://ftp.suse.com/pub/projects/kernel/kotd/HEAD/src/
Download the kernel-source-<version>.src.rpm file where <version> is a kernel version string
- eg. kernel-source-2.6.32.3-0.0.44.81788a2.src.rpm
- unpack the rpm
  - mkdir tmp; cd tmp
  - rpm2cpio kernel-source-<version>.src.rpm | cpio -idmv --no-absolute-filenames
- unpack patches.xen.tar.bz2
  - tar -xvjf patches.xen.tar.bz2
- compare the unpacked xen patches to the patches in the ec2 git tree in debian.ec2/patches/
- For released kernels any update will need to be done as new patches to the EC2 tree. For development kernels update the corresponding git commit with a rebase.
- update the debian.ec2/patches dir to ensure it is kept up to date, so base patches are maintained.
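The comparison step in the list above can be done by eye with a diff tool. As a rough sketch, the following uses Python's filecmp module to summarise the differences between the two patch directories. The function name and the result keys are made up for illustration, and only the top level of each directory is examined.

```python
import filecmp


def compare_patch_dirs(upstream_dir, tree_dir):
    """Summarise how two patch directories differ (top level only)."""
    result = filecmp.dircmp(upstream_dir, tree_dir)
    return {
        "only_upstream": sorted(result.left_only),  # candidates to import
        "only_tree": sorted(result.right_only),     # no longer in the new set
        "changed": sorted(result.diff_files),       # same name, reported as different
    }
```

A call might look like compare_patch_dirs("tmp/patches.xen", "debian.ec2/patches"), where the first path is an assumption about where the tarball unpacks. Note that dircmp checks file size and modification time before contents, so the actual patch text of anything listed under "changed" should still be reviewed by hand.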
Kernel Release Notes
Release specific information for Ubuntu EC2 kernels
- 8.04 (Hardy) - see below
- 8.10 (Intrepid) - see below
- 9.04 (Jaunty) - no EC2 kernel
- 9.10 (Karmic) - Xen patchset in ec2 topic branch
- 10.04 LTS (Lucid) - Xen patchset in ec2 topic branch
- 10.10 (Maverick) - paravirt-ops (pv-ops) in the -virtual kernel
Hardy
The Hardy kernel is not maintained as described above. The EC2 kernel for Hardy is the Xen Dom0 kernel, which is handled as a custom binary.
The Xen patchset for the Hardy kernel is obtained as described above for the topic branch EC2 kernels.
Intrepid
The Intrepid kernel for EC2 is based on the Hardy kernel patch carried forward to Intrepid.
Lucid
Does not require a ram disk to boot.
Maverick
Both instance store and ebs backed images use pv-grub to boot by default.