OpenStackHA

Revision 24 as of 2013-05-01 08:46:19


WORK IN PROGRESS

Overview

The Ubuntu OpenStack HA reference architecture is a current, best practice deployment of OpenStack on Ubuntu 12.04 using a combination of tools and HA techniques to deliver high availability across an OpenStack deployment.

The Ubuntu OpenStack HA reference architecture has been developed on Ubuntu 12.04 LTS, using the Ubuntu Cloud Archive for OpenStack Grizzly.

Juju Deployment

Before you start

Juju + MAAS

The majority of OpenStack deployments are implemented on physical hardware; Juju uses MAAS (Metal-as-a-Service) to deploy Charms onto physical service infrastructure.

It's worth reading up on how to set up MAAS and Juju for your physical server environment before trying to deploy the Ubuntu OpenStack HA reference architecture using Juju.

How many servers?

For HA (N+1 resilience where possible) across all OpenStack services, you will need 28 servers. If you are just trying this out, Virtual-MAAS might be a good option; it should be announced as a beta preview soon.

Configuration

All configuration options should be placed in a file named 'config.yaml'; this is the default file that juju will use from the current working directory.

Note that all deployed services need to use the Ubuntu Cloud Archive for Grizzly as the source of packages.

Charms

Although all of the charms to support deployment of OpenStack are available from the Juju Charm Store, it's worth branching the bzr branches that back them locally; if you need to tweak a charm for your specific deployment, this makes it much easier.

mkdir precise
(
  cd precise
  bzr branch lp:charms/ceph
  bzr branch lp:charms/ceph-osd
  bzr branch lp:~openstack-charmers/charms/precise/mysql/ha-support mysql
  bzr branch lp:~openstack-charmers/charms/precise/rabbitmq-server/ha-support rabbitmq-server
  bzr branch lp:~openstack-charmers/charms/precise/hacluster/trunk hacluster
  bzr branch lp:~openstack-charmers/charms/precise/keystone/ha-support keystone
  bzr branch lp:~openstack-charmers/charms/precise/nova-cloud-controller/ha-support nova-cloud-controller
  bzr branch lp:~openstack-charmers/charms/precise/cinder/ha-support cinder
  bzr branch lp:~openstack-charmers/charms/precise/glance/ha-support glance
  bzr branch lp:~openstack-charmers/charms/precise/quantum-gateway/ha-support quantum-gateway
  bzr branch lp:~openstack-charmers/charms/precise/swift-proxy/ha-support swift-proxy
  bzr branch lp:~openstack-charmers/charms/precise/swift-storage/ha-support swift-storage
  bzr branch lp:~openstack-charmers/charms/precise/nova-compute/ha-support nova-compute
  bzr branch lp:~openstack-charmers/charms/precise/openstack-dashboard/ha-support openstack-dashboard
)
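The 'local:' charm URLs used in the deploy commands on this page are resolved against a local charm repository. The following is a minimal sketch, assuming the 'precise' directory created above sits in the current working directory and that your Juju release honours the JUJU_REPOSITORY environment variable (worth checking against your version). 'missing_charms' is a hypothetical helper, not part of Juju.

```shell
# Point juju at the directory that contains the 'precise' series directory.
# Assumption: this is run from the directory where 'mkdir precise' was run.
export JUJU_REPOSITORY="${JUJU_REPOSITORY:-$PWD}"

# Hypothetical helper: print every expected charm that has not been
# branched into <repo>/precise yet.
# usage: missing_charms REPO CHARM...
missing_charms() {
  local repo charm
  repo="$1"
  shift
  for charm in "$@"; do
    [ -d "${repo}/precise/${charm}" ] || echo "${charm}"
  done
}
```

For example, 'missing_charms "$JUJU_REPOSITORY" ceph ceph-osd mysql hacluster' prints nothing once all four branches are in place.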

Base Services

Ceph

Overview

Ceph is a key infrastructure component of the Ubuntu OpenStack HA reference architecture; it provides network-accessible, resilient block storage to MySQL and RabbitMQ to support HA, as well as providing a natively resilient back-end for block storage (through Cinder) and for image storage (through Glance).

Configuration

A Ceph deployment will typically consist of both Ceph Monitor (MON) nodes (responsible for mapping the topology of a Ceph storage cluster) and Ceph Object Storage Device (OSD) nodes (responsible for storing data on devices). Some basic configuration is required to support deployment of Ceph using the Juju Charms for Ceph:

ceph:
  fsid: '6547bd3e-1397-11e2-82e5-53567c8d32dc'
  monitor-count: 3
  monitor-secret: 'AQCXrnZQwI7KGBAAiPofmKEXKxu5bUzoYLVkbQ=='
  osd-devices: '/dev/vdb'
  osd-reformat: 'yes'
  source: 'cloud:precise-updates/grizzly'
ceph-osd:
  osd-devices: '/dev/vdb'
  osd-reformat: 'yes'
  source: 'cloud:precise-updates/grizzly'

In this example, Ceph is configured with the provided fsid and secret (these should be unique for your environment) and will use the '/dev/vdb' block device if found for object storage. Ceph is being sourced ('source') from the Ubuntu Cloud Archive for Grizzly to ensure we get the latest features.
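Because the fsid and monitor secret should be unique per environment, the following is one possible way to generate them. It is a sketch under assumptions: 'gen_fsid' and 'gen_monitor_secret' are hypothetical helper names, the fsid is assumed to be an ordinary UUID, and the secret helper needs the 'ceph-authtool' binary from the Ceph packages on the machine where it runs.

```shell
# Generate a fresh cluster fsid (a UUID). Falls back to the Linux kernel's
# UUID source when the uuidgen tool is not installed.
gen_fsid() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen
  else
    cat /proc/sys/kernel/random/uuid
  fi
}

# Generate a monitor key. Requires ceph-authtool; it prints a keyring
# section, and the value after 'key =' is what goes into 'monitor-secret'.
gen_monitor_secret() {
  ceph-authtool /dev/stdout --name=mon. --gen-key
}
```

Paste the generated values into the 'fsid' and 'monitor-secret' options in place of the example values shown above.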

The Ceph MON function is provided by the 'ceph' charm; as the monitor-count is set to '3', Ceph will not bootstrap itself and start responding to requests from clients until at least 3 service units have joined the ceph service. Note that the ceph charm will also slurp up and run OSDs on any available storage; for large deployments you might not want to do this, but for proof-of-concept work it's OK to just run with storage provided directly via the ceph service.

Additional storage is provided by the 'ceph-osd' charm; this allows additional service units to be spun up which purely provide object storage. This is recommended for larger deployments.

Deployment

First, deploy the ceph charm with a unit count of 3 to build the Ceph MON cluster:

juju deploy -n 3 local:ceph

Then deploy some additional object storage nodes using the ceph-osd charm and relate them to the cluster:

juju deploy -n 3 local:ceph-osd
juju add-relation ceph ceph-osd

All of the above commands can be run in series with no pauses; the charms are clever enough to figure things out in the correct order.

BOOTNOTES

By default, the CRUSH map (which tells Ceph where blocks should be stored for resilience etc.) is OSD-centric; if you run multiple OSDs on a single server, Ceph will be resilient to device failure but not to server failure, as the default 3 replicas may be mapped onto OSDs on a single host.

Read the upstream documentation on how to tune the CRUSH map for your deployment requirements; this might land as a feature in the charm later on, but for now this bit requires manual tuning.
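As a rough illustration of that manual tuning, the sketch below rewrites the placement step of a decompiled CRUSH map so that replicas are chosen across hosts rather than across individual OSDs. It assumes the decompiled rule contains a 'step choose ... type osd' (or 'step chooseleaf ... type osd') line; inspect the text file before injecting anything into a live cluster. The function names and the crush.* file names are arbitrary choices for this example.

```shell
# Rewrite the placement step of a decompiled CRUSH map (read from stdin)
# so replicas are spread across hosts instead of individual OSDs.
crush_spread_across_hosts() {
  sed -E 's/step choose(leaf)? firstn ([0-9]+) type osd/step chooseleaf firstn \2 type host/'
}

# Round trip against a live cluster. Defined only; call it yourself once
# you have reviewed what the rewrite does to your map.
retune_crush_map() {
  ceph osd getcrushmap -o crush.bin            # fetch the compiled map
  crushtool -d crush.bin -o crush.txt          # decompile it to text
  crush_spread_across_hosts < crush.txt > crush.host.txt
  crushtool -c crush.host.txt -o crush.new     # recompile the edited map
  ceph osd setcrushmap -i crush.new            # inject it into the cluster
}
```

Changing the CRUSH map causes data to move between OSDs, so expect rebalancing traffic after the new map is injected.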

MySQL

Overview

MySQL provides persistent data storage for all OpenStack services; to provide MySQL in a highly-available configuration, it is deployed with Pacemaker and Corosync (HA tools) in an Active/Passive configuration. Shared block storage is provided by Ceph.

NOTE: For 12.04, it's worth running with the Quantal LTS kernel (3.5) to pick up improvements in the Ceph rbd kernel driver.

Configuration

The only additional configuration required by the MySQL charm is a VIP and subnet mask which will be used as the access point for other services to access the MySQL cluster:

mysql:
  vip: '192.168.77.8'
  vip_cidr: 19

Deployment

The MySQL charm is deployed in conjunction with the HACluster subordinate charm:

juju deploy -n 2 local:mysql
juju deploy local:hacluster mysql-hacluster
juju add-relation mysql ceph
juju add-relation mysql mysql-hacluster

After a period of time (it takes a while for all the relations to settle and for the cluster to configure and start), you should have a MySQL cluster listening on 192.168.77.8.

BOOTNOTES

Various Active/Active MySQL derivatives exist which could be used in place of MySQL; however for the Raring/Grizzly release cycle, only MySQL is in Ubuntu and fully supported by Canonical. Future releases of this architecture may use alternative MySQL solutions.

RabbitMQ

Overview

RabbitMQ provides a centralized message broker which the majority of OpenStack components use to communicate control plane requests around an OpenStack deployment. RabbitMQ does provide a native Active/Active architecture; however, this is not yet well supported, so for the Raring/Grizzly cycle RabbitMQ is deployed in an Active/Passive configuration using Pacemaker and Corosync, with Ceph providing shared block storage.

NOTE: For 12.04, it's worth running with the Quantal LTS kernel (3.5) to pick up improvements in the Ceph rbd kernel driver.

Configuration

The only additional configuration required by the RabbitMQ charm is a VIP and subnet mask which will be used as the access point for other services to access the RabbitMQ cluster:

rabbitmq-server:
  vip: '192.168.77.11'
  vip_cidr: 19

Deployment

The RabbitMQ charm is deployed in conjunction with the HACluster subordinate charm:

juju deploy -n 2 local:rabbitmq-server rabbitmq-server
juju deploy local:hacluster rabbitmq-hacluster
juju add-relation rabbitmq-server ceph
juju add-relation rabbitmq-server rabbitmq-hacluster

RabbitMQ will be accessible using the vip provided during configuration.

OpenStack Services

Keystone

Overview

Keystone provides central authentication and authorization services for all OpenStack services. Keystone is generally stateless; in the reference architecture it can be scaled horizontally, with requests load balanced across all available service units.

Configuration

The keystone charm requires basic configuration to be deployed in HA mode:

keystone:
  openstack-origin: 'cloud:precise-grizzly'
  admin-user: 'admin'
  admin-password: 'openstack'
  admin-token: 'ubuntutesting'
  vip: '192.168.77.1'
  vip_cidr: 19

The user, password and token should be specific to your deployment; the VIP and subnet mask are in line with other charms and will form the access point for keystone requests. Keystone requests will be load balanced across all available service units.
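One way to avoid reusing the example credentials is to generate random values for 'admin-password' and 'admin-token'. This is a minimal sketch; 'random_hex' is a hypothetical helper and it assumes /dev/urandom is available on the machine where you prepare config.yaml.

```shell
# Print N random bytes as lowercase hex (2*N characters, no newline).
# usage: random_hex N
random_hex() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
}
```

For example, 'random_hex 16' yields a 32-character string suitable for pasting into the keystone section of config.yaml.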

Deployment

The Keystone charm is deployed in conjunction with the HACluster subordinate charm:

juju deploy -n 2 local:keystone
juju deploy local:hacluster keystone-hacluster
juju add-relation keystone keystone-hacluster
juju add-relation keystone mysql

BOOTNOTES

The keystone charm uses the stateless API HA model (see below). Some state is stored on local disk (specifically service usernames and passwords). These are synced between service units during hook execution using SSH + unison.

Cloud Controller

Overview

The Cloud Controller provides the API endpoints for Nova (Compute) and Quantum (Networking) services. The APIs are stateless; in the reference architecture this service can be scaled horizontally with API requests load balanced across all available service units.

Configuration

The nova-cloud-controller charm has a large number of configuration options; in line with other HA services, a VIP and subnet mask must be provided to host the API endpoints. In addition, configuration options for Quantum networking are also provided.

nova-cloud-controller:
  openstack-origin: 'cloud:precise-grizzly'
  vip: '192.168.77.2'
  vip_cidr: 19
  network-manager: 'Quantum'
  conf-ext-net: 'no'
  ext-net-cidr: '192.168.64.0/19'
  ext-net-gateway: '192.168.64.1'
  pool-floating-start: '192.168.90.1'
  pool-floating-end: '192.168.95.254'

Note that the conf-ext-net option is currently disabled; unfortunately, configuring this during service build proved a bit racy, but the external (public) network can be configured after the charms have been deployed:

juju set nova-cloud-controller conf-ext-net=yes

Deployment

The nova-cloud-controller charm is deployed in conjunction with the HACluster subordinate charm:

juju deploy -n 2 local:nova-cloud-controller
juju deploy local:hacluster ncc-hacluster
juju add-relation nova-cloud-controller ncc-hacluster
juju add-relation nova-cloud-controller mysql
juju add-relation nova-cloud-controller keystone
juju add-relation nova-cloud-controller rabbitmq-server

BOOTNOTES

The nova-cloud-controller charm uses the stateless API HA model (see below).

Image Storage (Glance)

Overview

Glance provides multi-tenant image storage services for an OpenStack deployment. By default, Glance uses local storage to store uploaded images. The HA reference architecture uses Ceph in conjunction with Glance to provide highly-available object storage; the design relegates Glance to being a stateless API and image registry service.

Configuration

In line with other OpenStack charms, Glance simply requires a VIP and subnet mask to host the Glance HA API endpoint:

glance:
  openstack-origin: 'cloud:precise-grizzly'
  vip: '192.168.77.4'
  vip_cidr: 19

Deployment

juju deploy -n 2 local:glance
juju deploy local:hacluster glance-hacluster
juju add-relation glance glance-hacluster
juju add-relation glance mysql
juju add-relation glance nova-cloud-controller
juju add-relation glance ceph
juju add-relation glance keystone

BOOTNOTES

The glance charm uses the stateless API HA model (see below).

Block Storage (Cinder)

Overview

Cinder provides block storage to tenant instances running within an OpenStack cloud. By default, Cinder uses local storage exposed via iSCSI, which is inherently not highly-available. The HA reference architecture uses Ceph in conjunction with Cinder to provide highly-available, massively scalable block storage for tenant instances. Ceph block devices are accessed directly from compute nodes; this design relegates Cinder to being a stateless API and storage allocation service.

Configuration

In line with other OpenStack charms, Cinder requires a VIP and subnet mask to host the HA API endpoint. In addition, Cinder itself is explicitly configured not to use local block storage:

cinder:
  openstack-origin: 'cloud:precise-grizzly'
  block-device: 'None'
  vip: '192.168.77.3'
  vip_cidr: 19

Deployment

juju deploy -n 2 local:cinder
juju deploy local:hacluster cinder-hacluster
juju add-relation cinder cinder-hacluster
juju add-relation cinder mysql
juju add-relation cinder keystone
juju add-relation cinder nova-cloud-controller
juju add-relation cinder rabbitmq-server
juju add-relation cinder ceph
juju add-relation cinder glance

BOOTNOTES

The cinder charm uses the stateless API HA model (see below).

Networking (Quantum)

Overview

Quantum provides the virtualized network infrastructure within an OpenStack deployment. Currently it is provided as an alternative to Nova networking, as Quantum does not yet have feature parity. Quantum in HA mode is only supported in Grizzly, due to the provision of an agent/scheduler infrastructure in this release.

Some aspects of Quantum (the API server for example) are integrated into other OpenStack charms; to complete the networking topology a Quantum Gateway is required to provide Layer 3 network routing and DHCP services for Layer 2 networks.

Configuration

The quantum-gateway charm only requires configuration for the external network port that will be used for Layer 3 routing connectivity. This must not be the primary network interface on the server, otherwise you will lose connectivity to the gateway server units!

quantum-gateway:
  openstack-origin: 'cloud:precise-grizzly'
  ext-port: 'eth1'

Deployment

juju deploy -n 2 local:quantum-gateway
juju add-relation quantum-gateway mysql
juju add-relation quantum-gateway rabbitmq-server
juju add-relation quantum-gateway nova-cloud-controller

BOOTNOTES

Quantum was due to have native HA support for Grizzly; however, this feature did not land in full. Currently HA is implemented by re-allocating network resources from a failed service unit to healthy service units; this is orchestrated using the cluster-relation-departed hook in the quantum-gateway charm. Fail-over of services can take between 10 and 30 seconds.

Compute (Nova)

Overview

Compute services are provided by Nova in an OpenStack deployment; specifically the nova-compute charm is used to deploy the required OpenStack services onto service units.

Full HA is not possible on Nova Compute service units; however, the nova-compute charm can be configured to support secure live migration of running instances between compute service units, supporting a managed, minimal-disruption approach to underlying OS upgrades.

Configuration

nova-compute:
  openstack-origin: 'cloud:precise-grizzly'
  enable-live-migration: 'True'
  migration-auth-type: 'ssh'

Deployment

juju deploy -n 3 local:nova-compute
juju add-relation nova-compute nova-cloud-controller
juju add-relation nova-compute mysql
juju add-relation nova-compute rabbitmq-server
juju add-relation nova-compute glance
juju add-relation nova-compute ceph

BOOTNOTES

Live migration is facilitated using libvirt and qemu over an SSH connection. This includes block migration. A shared filesystem provided by Ceph was considered; however, this approach is not truly scalable and CephFS does not have 'stable' status yet.

Object Storage (Swift)

Overview

The Swift service provides multi-tenant object storage within an OpenStack deployment. It is analogous to Amazon's S3 service. Objects are distributed across underlying Swift storage nodes for both resilience and scalability.

Configuration

Swift is actually split into two charms: swift-proxy and swift-storage. For the HA reference architecture we configure Swift with three storage zones:

swift-proxy:
  openstack-origin: 'cloud:precise-grizzly'
  zone-assignment: 'manual'
  replicas: 3
  swift-hash: 'fdfef9d4-8b06-11e2-8ac0-531c923c8fae'
  vip: '192.168.77.12'
  vip_cidr: 19
swift-storage-z1:
  openstack-origin: 'cloud:precise-grizzly'
  zone: 1
  block-device: 'vdb'
swift-storage-z2:
  openstack-origin: 'cloud:precise-grizzly'
  zone: 2
  block-device: 'vdb'
swift-storage-z3:
  openstack-origin: 'cloud:precise-grizzly'
  zone: 3
  block-device: 'vdb'

In line with other OpenStack charms, a VIP and subnet mask are provided to host the Swift HA API endpoint.

Deployment

juju deploy -n 2 local:swift-proxy
juju deploy local:swift-storage swift-storage-z1
juju deploy local:swift-storage swift-storage-z2
juju deploy local:swift-storage swift-storage-z3
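The commands above deploy the Swift services but do not relate them. The sketch below prints a plausible set of follow-up commands, modelled on the pattern used for the other services on this page; none of these lines come from the original instructions, and the 'swift-hacluster' service name is a hypothetical choice made to match names such as 'glance-hacluster'. The commands are printed rather than executed so they can be reviewed first.

```shell
# Print (rather than run) relation commands for the proxy, the three
# storage zones, keystone and an hacluster subordinate.
# Assumption: the relation layout mirrors the other HA services here.
swift_relation_commands() {
  local zone
  for zone in 1 2 3; do
    echo "juju add-relation swift-proxy swift-storage-z${zone}"
  done
  echo "juju add-relation swift-proxy keystone"
  echo "juju deploy local:hacluster swift-hacluster"
  echo "juju add-relation swift-proxy swift-hacluster"
}
```

Once you are happy with the output, 'swift_relation_commands | sh' would run the commands in order.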

BOOTNOTES

Need some notes about ring rebalancing and how the swift-proxy charm builds the rings without replicating data between nodes.

Dashboard (Horizon)

Overview

The Horizon service provides an end-user and administrator web portal within an OpenStack deployment. This service is completely stateless and can be scaled horizontally, with requests being load-balanced across all available service units.

Configuration

openstack-dashboard:
  openstack-origin: 'cloud:precise-grizzly'
  vip: '192.168.77.5'
  vip_cidr: 19

Deployment

juju deploy -n 2 local:openstack-dashboard
juju add-relation openstack-dashboard keystone

BOOTNOTES

Although this service is not an API service, it uses the same model for HA.

Access

Credentials

Keystone will always be listening on its VIP; create a 'novarc' file as follows and source it before using the OpenStack command line clients:

cat > novarc << EOF
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_TENANT_NAME=admin
export OS_AUTH_URL=http://192.168.77.1:5000/v2.0
export OS_REGION_NAME=RegionOne
alias nova="nova --no-cache"
EOF

Endpoints

Assuming you have deployed all services, keystone should provide an endpoint listing as detailed below:

keystone endpoint-list
+----------------------------------+-----------+--------------------------------------------------+--------------------------------------------------+---------------------------------------------+
|                id                |   region  |                    publicurl                     |                   internalurl                    |                   adminurl                  |
+----------------------------------+-----------+--------------------------------------------------+--------------------------------------------------+---------------------------------------------+
| 1ac5142878a34d0cb9e2290f23c916c6 | RegionOne |   http://192.168.77.2:8774/v1.1/$(tenant_id)s    |   http://192.168.77.2:8774/v1.1/$(tenant_id)s    | http://192.168.77.2:8774/v1.1/$(tenant_id)s |
| 3836f45f29bb46b0a6709338f9dfc720 | RegionOne |             http://192.168.77.2:3333             |             http://192.168.77.2:3333             |           http://192.168.77.2:3333          |
| 4526045cbada4a7fa388b5154c32a626 | RegionOne |    http://192.168.77.3:8776/v1/$(tenant_id)s     |    http://192.168.77.3:8776/v1/$(tenant_id)s     |  http://192.168.77.3:8776/v1/$(tenant_id)s  |
| 4cdbfb34997646c9abb552f03221d5be | RegionOne |             http://192.168.77.4:9292             |             http://192.168.77.4:9292             |           http://192.168.77.4:9292          |
| 6fef2877df7d4bc3a25ad04629c37abc | RegionOne |          http://192.168.77.1:5000/v2.0           |          http://192.168.77.1:5000/v2.0           |        http://192.168.77.1:35357/v2.0       |
| 9a1bad74efee4e5abfb4bce76847defb | RegionOne |     http://192.168.77.2:8773/services/Cloud      |     http://192.168.77.2:8773/services/Cloud      |   http://192.168.77.2:8773/services/Cloud   |
| b382813b93064c6796ba8d13e51d5902 | RegionOne |             http://192.168.77.2:9696             |             http://192.168.77.2:9696             |           http://192.168.77.2:9696          |
| f21918422c664a399a25483d67078c6a | RegionOne | https://192.168.77.12:8080/v1/AUTH_$(tenant_id)s | https://192.168.77.12:8080/v1/AUTH_$(tenant_id)s |          https://192.168.77.12:8080         |
+----------------------------------+-----------+--------------------------------------------------+--------------------------------------------------+---------------------------------------------+

Design

HA Models

Stateless API Server

For stateless API services, the OpenStack service is reconfigured to listen on [default port - 10], haproxy is installed and configured to listen on the default service port and to load balance across all service units within the service, and a Virtual IP is floated on top of the primary service unit.

This ensures that the full capacity of all service units in the service is used to service incoming API requests.
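To make the port arrangement concrete, the sketch below computes the shifted port and prints an illustrative haproxy section for a set of service units. It is not the configuration the charms actually generate: the section name, the 0.0.0.0 bind address, the round-robin policy and the unit addresses are all assumptions chosen for the example.

```shell
# Port the OpenStack service itself listens on, given its default port.
backend_port() {
  echo $(( $1 - 10 ))
}

# Print an illustrative haproxy 'listen' section: haproxy takes the default
# port and forwards to each unit's shifted port.
# usage: haproxy_section NAME DEFAULT_PORT UNIT_IP...
haproxy_section() {
  local name port ip i
  name="$1"
  port="$2"
  i=0
  shift 2
  echo "listen ${name}"
  echo "    bind 0.0.0.0:${port}"
  echo "    balance roundrobin"
  for ip in "$@"; do
    echo "    server ${name}-${i} ${ip}:$(backend_port "$port") check"
    i=$(( $i + 1 ))
  done
}
```

For the Nova API (default port 8774), 'backend_port 8774' prints 8764, so haproxy would accept requests on 8774 and pass them to nova-api on 8764 on each unit.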

Leadership Election

Pre-clustering

Leaders are elected by selecting the oldest peer within a given service deployment. This service unit will undertake activities such as creating underlying databases, issuing usernames and passwords and configuring HA services prior to full clustering.
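A minimal sketch of such a selection, assuming that age is judged by the Juju unit number (units are numbered in the order they are created); how the charms actually compare peers is not shown on this page. 'oldest_peer' is a hypothetical helper name.

```shell
# Print the unit with the lowest unit number from a list of unit names
# such as keystone/0 keystone/1.
# usage: oldest_peer UNIT...
oldest_peer() {
  printf '%s\n' "$@" | sort -t/ -k2,2n | head -n 1
}
```

The numeric sort matters once a service has ten or more units: keystone/9 is older than keystone/10, which a plain text sort would get wrong.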

Post-clustering

Once a set of service units has been clustered using Corosync and Pacemaker, leader election is determined by which service unit holds the VIP through which the service is accessed. This service unit will then take ownership of singleton activity within the cluster.
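One simple way to see which unit currently holds a VIP is to look for the address in the unit's interface listing. This is only a sketch: the charms themselves may query Pacemaker instead, and 'holds_vip' is a hypothetical helper that reads the output of 'ip -o -4 addr show' on standard input.

```shell
# Succeed when the address listing on stdin contains the given VIP.
# The leading ' inet ' and trailing '/' stop 192.168.77.1 from matching
# longer addresses such as 192.168.77.10.
# usage: ip -o -4 addr show | holds_vip VIP
holds_vip() {
  grep -qF " inet ${1}/"
}

# Example use on a service unit:
#   if ip -o -4 addr show | holds_vip 192.168.77.1; then
#     echo "this unit currently holds the keystone VIP"
#   fi
```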