CloudInfrastructure


This page is here to collect together proceedings from sessions as part of the 'Cloud Infrastructure' track at the Natty UDS in Orlando, Florida.

Please add proceedings by doing the following:

  • Add a new section with the name of the session.
  • Add the outcomes of the session as a collection of bullet points. The goal of the proceedings is to really focus on decided outcomes, so please try to keep them crisp.

Thanks!

Proceedings

Bootstrap puppet from deployment service (for UEC and more)

cloud-server-n-install-bootstrap-puppet

  • write cron job to poll installation service DB for expected agent requests (a rough sketch follows below).
  • upstream to support csr extensions (ie puppet token) for puppet agent.
  • extend puppet ca to validate csr extensions (ie puppet token).
  • extend installation service to include puppet token in installation procedure.
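
A rough sketch of the polling job discussed above. It is illustrative only: the sqlite file and "expected_hosts" table stand in for whatever schema the installation service ends up with, the "puppet cert" invocations are just one way to list and sign pending CSRs, and the CSR-extension/token validation still depends on the upstream work listed above.

{{{
#!/usr/bin/env python
# Poll the installation service's DB for hosts that are expected to request
# a puppet certificate, and sign any pending CSR whose certname matches.
import sqlite3
import subprocess

DB_PATH = "/var/lib/install-service/hosts.db"  # hypothetical location


def expected_hosts():
    conn = sqlite3.connect(DB_PATH)
    try:
        rows = conn.execute("SELECT hostname FROM expected_hosts").fetchall()
    finally:
        conn.close()
    return set(row[0] for row in rows)


def pending_requests():
    # "puppet cert --list" prints unsigned certnames; parsing here is loose.
    out = subprocess.check_output(["puppet", "cert", "--list"]).decode()
    return [line.split()[0].strip('"') for line in out.splitlines() if line.strip()]


def main():
    expected = expected_hosts()
    for certname in pending_requests():
        if certname in expected:
            subprocess.call(["puppet", "cert", "--sign", certname])


if __name__ == "__main__":
    main()
}}}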

Eucalyptus next steps

natty commitments:

  • ebs root
  • disable api termination
  • instance-initiated shutdown behavior
  • import keypair
  • some user management similar to Identity and Access Management (IAM)
    • improved command line tools for create delete ...
  • tagging (not yet committing to full aws tagging support)
  • snapshot sharing (user sharing snapshots, not instances)
  • default instance types
    • knowledge of m1.2xlarge ...
    • ephemeral devices for ephemeral1..4
  • CreateImage

  • loader support - up to eucalyptus from ubuntu

Ubuntu / Eucalyptus TODOs for natty cycle
  • review of legacy operating systems and virtio disk/network support
  • review of ubuntu patches
  • review of java dependencies
  • euca2ools 2.0 / boto updates (without this most of the features above are not exposed)
  • Eucalyptus / Ubuntu to collaborate / share test suites
  • somehow need to get to gwt 2.1 (or 2.0)

How can Ubuntu use awstrial

Improvements to the Ubuntu Cloud Images

cloud-server-n-cloud-images

  • Scott will document how to seed cloud-init from the kernel command line (a sketch of one possible approach follows below)
  • Scott and John will get Ubuntu cluster compute types onto EC2 during the Natty cycle
  • image root fs size for natty will be resized to 10G
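
As an illustration of what the documentation action above will cover, a minimal sketch of extracting an embedded cloud-config payload from the kernel command line. The "cc: ... end_cc" delimiters and the "\n" escaping are assumptions about the convention; the documented format is authoritative.

{{{
def read_cc_from_cmdline(cmdline):
    # Assumed convention: "cc: <escaped cloud-config> end_cc" on the cmdline.
    begin, end = "cc:", "end_cc"
    if begin not in cmdline:
        return None
    payload = cmdline.split(begin, 1)[1]
    if end in payload:
        payload = payload.split(end, 1)[0]
    # Kernel command lines are a single line, so newlines are escaped.
    return payload.replace("\\n", "\n").strip()


if __name__ == "__main__":
    with open("/proc/cmdline") as fp:
        print(read_cc_from_cmdline(fp.read()))
}}}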

Installation service for physical nodes deployments (UEC and more)

cloud-server-n-install-service

  • Overall architecture outlined.
  • 4 potential projects identified:
    • FAI
    • Cobbler
    • openQRM
    • lp:uec-provisioning
  • More discussion required to gather additional requirements.

On-going maintenance with puppet (for UEC and more)

cloud-server-n-config-mgmt-with-puppet

  • deploy puppet master alongside the installation service.
  • package mcollective to get scalable distributed command infrastructure (eg trigger puppet runs).
  • write puppet modules to configure UEC components (CLC, CC, NC).
    • available in /etc/puppet/modules/.
  • link puppet master to installation service via external_nodes to get classes assigned to each system (a sketch of such a classifier follows below).
  • tie installation service to puppet master to gather list of available classes.
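
A minimal sketch of an external_nodes classifier: puppet master invokes the script with the node's certname as its argument and expects YAML describing the node's classes on stdout. The static mapping and the uec::* class names are placeholders for whatever lookup the installation service eventually exposes.

{{{
#!/usr/bin/env python
# External node classifier sketch: argv[1] is the certname, stdout is YAML.
import sys

# Hypothetical assignment of puppet classes per host.
CLASSES_BY_HOST = {
    "clc.example.com": ["uec::clc"],
    "cc1.example.com": ["uec::cc"],
    "node1.example.com": ["uec::nc"],
}


def main():
    certname = sys.argv[1]
    classes = CLASSES_BY_HOST.get(certname, [])
    print("---")
    if classes:
        print("classes:")
        for cls in classes:
            print("  - %s" % cls)
    else:
        print("classes: {}")


if __name__ == "__main__":
    main()
}}}

The master would point at such a script via node_terminus = exec and external_nodes = /path/to/script in puppet.conf.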

Openstack packaging

  • Meta packaging to ensure toolkit approach
    • nova-*-mysql, -sqlite, and (pre?)-depend on sane defaults.
  • Packaging branch to be moved to a new team, and Ubuntu ultimately owns the packages.
  • Commits undergo merge proposals for peer review.
  • Upstartify
  • Apport hook
  • Monitoring scripts (collectd / munin)
  • Release cycle alignment
    • Natty freezes February 24; releases April 28.
    • Branch Nova code by Ubuntu feature freeze (Feb 24), backport fixes until FinalFreeze (Apr 14)

  • Current state of Nova and Swift packaging: it works. It's in pretty good shape. NASA is using it.

Components of OpenStack NOVA:

  • Compute
  • API
  • Volume
  • objectstore (simple implementation of S3; it may go away and be replaced by Swift)
  • instancemonitor (monitors loads put on server by virtual machines and graphs them, storing them in the objectstore)
  • scheduler
  • network
  • python-nova
  • nova-common
  • There is no packaging dependency on rabbitmq-server, but there is a runtime dependency. For a DB, sqlite, postgres or mysql will work. Rick considers which DB is packaged and configured to be an Ubuntu decision.
  • SWIFT has a different set of components.
  • Dave: Current default -- sqlite; could we consider mysql?
  • Soren: Wants to make sure install is no-questions; sqlite was "dead simple"
  • Packaging targets single-node use case
  • More complex deployments should be handled through the install service

Issues to handle off line:

  • How to package the databases, authentication packages
  • preseed-ability

Rebundling and other cloud utilities

UEC EC2 compatibility

  • Testing framework to ensure EC2 Compatibility (an illustrative connection sketch follows below)
    • Possibly use txAWS.
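
Purely as an illustration (the session leaned towards txAWS), a boto-based smoke test that a UEC front end answers basic EC2 API calls. The endpoint, credentials, port and path are placeholders for a local UEC install.

{{{
from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo

# Placeholder UEC cloud controller endpoint and credentials.
region = RegionInfo(name="eucalyptus", endpoint="clc.example.com")
conn = EC2Connection(
    aws_access_key_id="EC2_ACCESS_KEY",
    aws_secret_access_key="EC2_SECRET_KEY",
    is_secure=False,
    region=region,
    port=8773,
    path="/services/Eucalyptus",
)

# A compatible cloud should answer these the same way EC2 does.
print(conn.get_all_images())
print(conn.get_all_instances())
print(conn.get_all_key_pairs())
}}}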

Monitoring probes and alerting service (for UEC and more)

cloud-server-n-monitoring-alerting

Actions:

  • move collectd to main.
  • should munin go to universe (probably not yet)
  • find a graphing solution (munin, graphite, Reconnoiter (OmniTI - not packaged), visage).

Web scale enhancements

Hadoop packaging

CDH3 will be used as the foundation for Ubuntu. Cloudera packages will be reviewed and tested.

  • review other CDH packages for improvements in the user experience and Ubuntu integration:
    1. hbase
    2. pig
    3. hive
    4. hue
    5. oozie
    6. sqoop
  • review zookeeper patches and identify which should be integrated into the Debian/Ubuntu packages.
  • file bugs, write patches and have them integrated into CDH3.
    • publish hadoop packages into a PPA and point Cloudera to it for integration.
  • integration with installation service (whenever that is ready).

Ubuntu desktop cloud images

In order to get fully supportable "Ubuntu" images into main, we'll need the following actions.

  • [action] get the freenx server into the main archive (currently in a ppa)
  • [action] get an open nx client into the archive (actually, qtnx is already in the archive)
  • [action] create desktop images with freenx server from main

In order to have a very slick user demo, we have to:

  • [action] have image offer open nx client via web connection
  • [action] Canonical services could perhaps contact NoMachine about an optimal OSS nx client

  • [action] must have a windows client, perhaps mac client too (proprietary one might be acceptable here?)
  • [action] we would need the unity-qt implementation since we don't have 3d acceleration in the cloud

Handle virtual networking in the cloud

Openstack gap analysis

Distributed logging

Use rsyslog as the foundation for building distributed logging.

  • Support relp in main (MIR librelp)
  • write puppet recipes to automatically configure rsyslog
  • Integrate in UEC:
    • configure central rsyslog on the CLC
    • configure aggregator rsyslog on the CC
    • configure central logging via rsyslog on the NC, SC, Walrus
    • use syslog for all UEC components
    • write a script (grep++) to automatically track messages related to an InstanceId (a rough sketch follows below)

  • look at packaging Reconnoiter in Ubuntu to use it for reporting and presentation
  • look at packaging a log analyzer
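
A rough first cut of the "grep++" helper mentioned above: walk the centrally aggregated syslog files and print every line mentioning the requested instance id. The log directory is an assumption about where the central rsyslog instance writes its files.

{{{
#!/usr/bin/env python
# Print every log line that mentions a given instance id, prefixed with the
# file it came from.
import glob
import sys

LOG_GLOB = "/var/log/uec/*.log"  # hypothetical central rsyslog destination


def track(instance_id):
    for path in sorted(glob.glob(LOG_GLOB)):
        with open(path) as fp:
            for line in fp:
                if instance_id in line:
                    sys.stdout.write("%s: %s" % (path, line))


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: %s <instance-id>" % sys.argv[0])
    track(sys.argv[1])
}}}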

Application checkpoint/restart

Application checkpoint/restart in linux (linux-cr.org) provides the ability to checkpoint, restart, and migrate application and system containers, offering a very lightweight mechanism for load-balancing in the cloud.

Actions:

  • Create a ppa with the kernel and userspace packages needed to experiment with c/r
    • Create a project in lp
    • When ppa is up, Gustavo will blog about how to use it
  • Create bindings for libvirt to use lxc.sf.net
    • Then libvirt can handle
      • auto-start of containers on boot
      • creation of a bridge for containers

UEC Web interface

Make LXC ready for production

Conclusions:

  • Some kernel patches (setns, ipvs, ns-cgroup-removal) are heading upstream
    • kernel team may backport those into natty
    • ns cgroup is being deprecated - should be turned off
      • MUST be associated with taking the clone-children control file patch to replace ns cgroup functionality
  • For more forward-looking and experimental lxc patches,
    • Create a kernel based on natty hosted on kernel.ubuntu.com
    • Create a ppa with both custom kernel and lxc package to exploit it
    • Examples of functionality:
      • user namespace
      • containerized syslog
      • tinyproc (see below)
  • Investigate solutions for /proc and /sys containerization
    • One attractive solution was to split proc into full proc and a container-safe tinyproc
      • could be a mount option
        • a CAP_HOST_PROC capability is required for mounting full proc
        • tinyproc does not provide /proc/sysrq-trigger, for instance
  • Networking:
    • We should let libvirt handle creation of bridge
    • Someone should investigate getting netcf working in debian+ubuntu
      • To play nice with networkmanager
  • Container auto-start on boot
    • Let libvirt handle it
  • Meeting scheduled for Friday to investigate a libvirt binding for liblxc
    • The libvirt-lxc binding is deemed insufficient
  • Upstart script for lxc
    • We should see if we can let libvirt handle it all
  • Action: find someone willing to work on a script on top of lxc for easing container creation (a rough sketch follows below)
  • Action: find someone to push top/ps/netstat/etc containerization patches upstream
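
As a starting point for the container-creation helper in the action above, a minimal sketch wrapping the lxc userspace tools. The "ubuntu" template name and the lxc-create/lxc-start invocations reflect the lxc tools of the day and are illustrative rather than a settled interface.

{{{
#!/usr/bin/env python
# Thin wrapper: create a container from a template, then start it detached.
import subprocess
import sys


def create_and_start(name, template="ubuntu"):
    # Create the container root filesystem from a template...
    subprocess.check_call(["lxc-create", "-n", name, "-t", template])
    # ...and start it in the background.
    subprocess.check_call(["lxc-start", "-n", name, "-d"])


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: %s <container-name> [template]" % sys.argv[0])
    create_and_start(*sys.argv[1:3])
}}}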

Containerize ptrace/kill

The security team has an interest in smarter ptrace controls; however, these do not mesh with this work. They want to mostly prevent ptrace, but allow ptrace_traceme (ab)use by/for debuggers, tracers, and fault handlers. Containers will prevent tasks inside the container from allowing ptrace by a task outside the container. User namespaces would likely be too coarse-grained, lumping together an entire KDE or wine session and allowing all tasks in one such session to ptrace each other.

However, the containerization of kill and ptrace is deemed 'a good thing.' Kees recommends pushing the patchset.
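
To make the ptrace_traceme case concrete, a minimal sketch (not from the session) of the opt-in pattern debuggers rely on: the child volunteers to be traced by its parent, which the proposed controls would keep working inside a container while still blocking ptrace from outside it.

{{{
import ctypes
import os
import signal
import sys

PTRACE_TRACEME = 0   # Linux ptrace request numbers
PTRACE_DETACH = 17

libc = ctypes.CDLL("libc.so.6", use_errno=True)

pid = os.fork()
if pid == 0:
    # Child: volunteer to be traced by the parent, then stop itself.
    if libc.ptrace(PTRACE_TRACEME, 0, None, None) != 0:
        sys.exit(1)
    os.kill(os.getpid(), signal.SIGSTOP)
    sys.exit(0)
else:
    # Parent (now the tracer): wait for the stop, detach, then reap the child.
    _, status = os.waitpid(pid, 0)
    if os.WIFSTOPPED(status):
        libc.ptrace(PTRACE_DETACH, pid, None, None)
    os.waitpid(pid, 0)
}}}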

UEC QA for Natty

Containers in UEC

KVM/Libvirt hypervisor work

Cloud-init / cloud-config improvements

Automated server testing
