Warning: this is a work-in-progress (WIP) document.


Summary

Oneiric will provide advanced power management features for servers and infrastructures.

Release Note

This section should include a paragraph describing the end-user impact of this change. It is meant to be included in the release notes of the first release in which it is implemented. (Not all of these will actually be included in the release notes, at the release manager's discretion; but writing them is a useful exercise.)

It is mandatory.

Rationale

Ubuntu provides various power management software, dedicated to protection (NUT), management (NUT, fence-agents) or efficiency (PowerWake and PowerNap).

But there are several remaining issues:

In the diagram below, suppose PSU #1 is faulty. If UPS #2 then reaches low battery and can no longer supply the only working PSU, the CLC will crash!

ServerOneiricInfraPower-examplePSU.png

Moreover, as cloud systems are deployed more and more widely, infrastructure tools are needed to ease deployment and management of the power infrastructure. Complete visibility of the power chain would also enable more advanced features.

The following illustration represents a small cloud infrastructure made of 3 servers: 1 Cloud Controller (CLC) and 2 Node Controllers (NC).

Each server has 2 PSUs; each PSU is connected to an outlet of a PDU, and each PDU is in turn powered by a UPS.

ServerOneiricInfraPower-exampleRack.png

An example PowerChain for the CLC would be:

    PSU1 ==> PDU1:outlet1 ==> UPS1 ==> Main Power
    PSU2 ==> PDU2:outlet1 ==> UPS2 ==> Main Power
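The redundancy problem described earlier (a faulty PSU combined with a UPS on low battery) can be checked mechanically once each chain is written down. The following is only a minimal Python sketch under stated assumptions: the `CHAINS` table and the `is_powered` helper are hypothetical names introduced for illustration and are not part of NUT.

```python
# Hypothetical model of the two CLC power chains listed above.
# Each chain lists the devices between one PSU and main power.
CHAINS = {
    "PSU1": ["PSU1", "PDU1:outlet1", "UPS1"],
    "PSU2": ["PSU2", "PDU2:outlet1", "UPS2"],
}


def is_powered(chains, failed):
    """Return True if at least one chain contains no failed device."""
    failed = set(failed)
    return any(failed.isdisjoint(chain) for chain in chains.values())


if __name__ == "__main__":
    # One faulty PSU: the second chain still feeds the server.
    print(is_powered(CHAINS, {"PSU1"}))          # True
    # Faulty PSU1 plus UPS2 out of battery: no intact chain remains.
    print(is_powered(CHAINS, {"PSU1", "UPS2"}))  # False
```

A monitoring system with this view could raise an alarm as soon as only one intact chain remains, rather than waiting for the last one to fail.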

This specification defines the remaining tooling and missing pieces required to provide complete infrastructure power management on Ubuntu systems, easily and by default.

User stories

Design

You can have subsections that better describe specific parts of the issue.

Devices discovery

NUT's build infrastructure allows automated extraction of device support information, directly from the drivers.

This currently generates support files for USB-related systems such as hotplug and HAL (both obsolete), udev and UPower, at distribution time (i.e. 'make dist').

This mechanism will be extended to SNMP information, and will generate new headers and files, to create:

The following devices and systems will be supported:

Note: local PowerNap instance may also be discovered and considered.

Avahi publication script

The base idea is to publish NUT services on the network, through mDNS, using Avahi.

This will allow:

There has been a discussion on the NUT mailing list.

The idea is to start avahi-publish alongside upsd and/or upsmon, and have it publish the essential NUT configuration information.
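As an illustration of the publication step, the sketch below builds an `avahi-publish -s` command line from a list of local NUT devices. It is a sketch only: the `_nut._tcp` service type and the `device=` TXT records are assumptions made for this example (the list of published information is not yet defined), and `build_publish_command` is a hypothetical helper. Port 3493 is the default upsd port.

```python
import shlex

NUT_PORT = 3493  # default upsd TCP port


def build_publish_command(name, devices, service_type="_nut._tcp", port=NUT_PORT):
    """Build an avahi-publish argument list.

    The service type and the TXT record layout are assumptions,
    not an agreed NUT convention.
    """
    txt_records = ["device=%s" % device for device in devices]
    return ["avahi-publish", "-s", name, service_type, str(port)] + txt_records


if __name__ == "__main__":
    command = build_publish_command("NUT on clc", ["ups1", "pdu1"])
    # Print a shell-safe rendering; an init script would exec this instead.
    print(" ".join(shlex.quote(part) for part in command))
```

Building the argument list separately from running it keeps the same logic usable from a sysV function, an upstart job or a systemd unit.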

Configuration library and tools

This point is still under discussion: Augeas lenses for NUT, along with the various Augeas bindings and tools may be sufficient.

A high level Python wrapper may however be considered.

PowerChain design

The idea behind the PowerChain is simple: there are many links in a power-chain:

Having a consolidated view on the whole allows real HA and SmartGrid features.

This feature requires data acquisition from the above devices, and a powerchain-aware monitoring system.

Each link will be singly linked to its parent.
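Since each link only knows its parent, a full chain can be rebuilt by following parent references until a device without a parent is reached. The following Python sketch shows one way to do this; the `PARENTS` mapping and the `chain_from` function are hypothetical, and the device names are purely illustrative.

```python
def chain_from(device, parents):
    """Follow parent links from `device` up to the root and return the path."""
    path = [device]
    seen = {device}
    while device in parents:
        device = parents[device]
        if device in seen:
            # A misconfigured parent could otherwise loop forever.
            raise ValueError("loop in power chain at %r" % device)
        seen.add(device)
        path.append(device)
    return path


# Illustrative child -> parent mapping (hypothetical device names).
PARENTS = {
    "psu1": "pdu1:outlet1",
    "pdu1:outlet1": "ups1",
    "ups1": "main",
}

if __name__ == "__main__":
    print(" -> ".join(chain_from("psu1", PARENTS)))
```

The loop check matters because a single link per device makes traversal simple, but gives no protection against a configuration error that points two devices at each other.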

NUT PSU / native IPMI driver

As shown above, there is real interest in a NUT PSU driver.

To achieve this, a 'nut-psu' driver built on OpenIPMI would be created. It would expose, for example, the following NUT data:

device.mfr: DELL
device.mfr.date: 01/05/11 - 08:51:00
device.model: PWR SPLY,717W,RDNT            
device.part: 0RN442A01
device.serial: CN179721130031
device.type: psu
driver.name: nut-ipmipsu
driver.parameter.pollinterval: 2
driver.parameter.port: id2
driver.version: 2.6.1-3231M
driver.version.data: IPMI PSU driver
driver.version.internal: 0.05
input.current: 0.28
input.frequency.high: 63
input.frequency.low: 47
input.voltage: 242.00
input.voltage.maximum: 264
input.voltage.minimum: 90
ups.id: 2
ups.realpower.nominal: 717
ups.status: OL
ups.voltage: 12

Measurement data from the PSU are also being considered. Commands such as on, off and reboot will also be supported by NUT.
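A listing such as the one above uses plain `name: value` lines, so a powerchain-aware tool can load it into a dictionary with very little code. The sketch below assumes that format; `parse_upsc` is a hypothetical helper, and `SAMPLE` reuses a few lines of the listing.

```python
def parse_upsc(text):
    """Parse 'name: value' lines into a dict of strings."""
    variables = {}
    for line in text.splitlines():
        # Split on the first colon only, since values may contain colons.
        name, separator, value = line.partition(":")
        if not separator:
            continue  # skip blank or malformed lines
        variables[name.strip()] = value.strip()
    return variables


SAMPLE = """\
device.mfr: DELL
device.mfr.date: 01/05/11 - 08:51:00
device.type: psu
input.voltage: 242.00
ups.realpower.nominal: 717
ups.status: OL
"""

if __name__ == "__main__":
    data = parse_upsc(SAMPLE)
    print(data["device.type"], float(data["input.voltage"]))
```

Values are kept as strings; the caller converts the ones it needs, because NUT variables mix numbers, dates and free text.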

Improved fence-agents

NUT provides an automatic mechanism that lets hardware support information, such as USB VendorID:ProductID pairs and SNMP identifiers, be declared only once, in the NUT driver. This data is then extracted at dist time (i.e. the 'make dist' operation, used by maintainers to generate a .tar.gz or similar) to create various support files for HAL and hotplug (deprecated), udev, UPower and the upcoming nut-scanner.

An interesting point is that NUT is already serving as a knowledge base for an external project (UPower).

Considering this, a generic fence-snmp-pdu could be created, using automatically extracted SNMP information from NUT drivers, to deal with the many NUT supported SNMP devices.

Creating a fence-nut agent would also make it possible to control UPSes that provide outlet group management.

Implementation

This section should describe a plan of action (the "how") to implement the changes discussed. Could include subsections like:

Devices discovery

This feature is part of the NUT roadmap for 2.8.0.

Implementation has already started, and can be tracked on:

USB scan is mostly functional, along with local NUT devices detection. To test it:

$ svn co svn://svn.debian.org/nut/branches/nut-scanner
$ cd nut-scanner
$ ./autogen.sh
$ ./configure
$ cd tools/nut-scanner
$ make nut-scanner
$ ./nut-scanner
Scanning USB bus:
[nutdev1]
        driver=usbhid-ups
        port=auto
        vendorid=0x463
        productid=0x0001
        serial=AV2G3300L
Scanning XML/HTTP bus:
Scanning NUT bus (old connect method):
        xcp-usb@localhost
        hid-usb@localhost
        snmp1@localhost
        simu@localhost
        nmc-eaton@localhost
        powernap@localhost
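The `[section]` blocks printed by the scanner follow the ups.conf layout, so they can be collected by a small parser. The following is a sketch assuming the output format shown in the session above, which may change before release; `parse_scanner_output` is a hypothetical helper.

```python
def parse_scanner_output(text):
    """Collect [section] blocks made of key=value lines; ignore other lines."""
    devices = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("[") and line.endswith("]"):
            current = devices.setdefault(line[1:-1], {})
        elif line.startswith("Scanning "):
            current = None  # a new bus header ends the current section
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return devices


SAMPLE = """\
Scanning USB bus:
[nutdev1]
        driver=usbhid-ups
        port=auto
        vendorid=0x463
        productid=0x0001
        serial=AV2G3300L
Scanning XML/HTTP bus:
Scanning NUT bus (old connect method):
        xcp-usb@localhost
"""

if __name__ == "__main__":
    for name, options in parse_scanner_output(SAMPLE).items():
        print(name, options["driver"], options["port"])
```

Entries from the NUT bus (such as `xcp-usb@localhost`) are skipped here, since they name devices that are already configured rather than new ones to declare.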

Avahi publication script

The exact implementation will vary according to the init system used (sysV, upstart, systemd). For sysV, add a function to the NUT initscript(s),

Detailed implementation is still to be completed (list of published info, upstart/systemd implementation).

Configuration library and tool

Implementation of a Python module should be done in 'scripts/python/module/PyNUT.py'. The class should be named 'PyNUTClient'.

PowerChain implementation

Implementation details:

    device[:outlet][@hostname[:port]]

    Example:
      ups.conf
        [psu1]
            driver = nut-psu
            port = psu1
            parent = pdu1:outlet1@localhost
        [pdu1]
            driver = snmp-ups
            port = <ip address>
            parent = ups1@localhost
        [ups1]
            driver = usbhid-ups
            port = auto
            parent = main

      $ upsc psu1 device.parent
      pdu1:outlet1@localhost

      $ upsc -P localhost
      psu1 -> pdu1:outlet1 -> ups1

      $ upsc -Pl localhost
      ups1
       |-> pdu1:outlet1
            |-> psu1
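The `device[:outlet][@hostname[:port]]` form used for `parent` can be split into its parts with a single regular expression. This is a minimal Python sketch, not the NUT implementation: `ParentRef` and `parse_parent` are hypothetical names, and the pattern assumes that device, outlet and host names contain no colon, at-sign or whitespace (so IPv6 literals are not handled).

```python
import re
from collections import namedtuple

# Hypothetical container for the parts of a parent reference.
ParentRef = namedtuple("ParentRef", "device outlet host port")

_PARENT_RE = re.compile(
    r"^(?P<device>[^:@\s]+)"                         # device name (required)
    r"(?::(?P<outlet>[^:@\s]+))?"                    # optional :outlet
    r"(?:@(?P<host>[^:@\s]+)(?::(?P<port>\d+))?)?$"  # optional @host[:port]
)


def parse_parent(value):
    """Split a 'device[:outlet][@hostname[:port]]' string into its parts."""
    match = _PARENT_RE.match(value.strip())
    if match is None:
        raise ValueError("invalid parent reference: %r" % value)
    port = match.group("port")
    return ParentRef(
        match.group("device"),
        match.group("outlet"),
        match.group("host"),
        int(port) if port else None,
    )


if __name__ == "__main__":
    print(parse_parent("pdu1:outlet1@localhost"))
    print(parse_parent("ups1@localhost"))
    print(parse_parent("main"))
```

Missing parts come back as `None`, which lets the caller decide on defaults (for example the local host and the default upsd port) instead of hard-coding them in the parser.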

Improved fence-agents

Migration

Include:

Packaging

Packaging will happen in Debian, and will then be synchronized in Ubuntu.

The current short run TODO list is:

With the NUT 2.8.0 release (around September), the following things are also scheduled:

Test/Demo Plan

It's important that we are able to test new features, and demonstrate them to users. Use this section to describe a short plan that anybody can follow that demonstrates the feature is working. This can then be used during testing, and to show off after release. Please add an entry to http://testcases.qa.ubuntu.com/Coverage/NewFeatures for tracking test coverage.

This need not be added or completed until the specification is nearing beta.

Unresolved issues

This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.



ServerOneiricInfraPower (last edited 2012-06-08 07:51:19 by 195)