ServerMonitoring

Summary

Release Note

Oneiric includes improved monitoring support integrated into Orchestra.

Rationale

A systems administrator who deploys a large number of Ubuntu servers needs to manage monitoring effectively, which means automating the monitoring setup rather than configuring each machine by hand.

User stories

* Corey is a systems administrator who deployed a 12-node Ubuntu Cloud. He wants to know when the Nova API stops responding to requests, and he wants to be notified by SMS.
* Nigel is a systems administrator who wants to know how much CPU a particular instance uses over time. Nigel will use the collectd-nagios plugin to monitor the usage of the instance.

Assumptions

Design

Implementation

Nagios

The central component of the monitoring part of Orchestra is Nagios. Nagios will be configured to trigger a restart every time Orchestra adds a new machine, so that the new machine's configuration is picked up.
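The restart trigger described above can be sketched as follows. This is only an illustration under stated assumptions, not the planned implementation: the function name install_config, the one-file-per-host layout, and the restart callback are all hypothetical. The sketch restarts only when the configuration for a machine is new or has changed.

```python
import os
import tempfile


def install_config(conf_dir, host_name, content, restart):
    """Write <conf_dir>/<host_name>.cfg and call restart() only if it changed.

    All names here are hypothetical; the spec does not define this interface.
    Returns True if the file was written and restart() was called.
    """
    path = os.path.join(conf_dir, host_name + ".cfg")
    try:
        with open(path) as existing:
            if existing.read() == content:
                return False  # unchanged, so no restart is needed
    except FileNotFoundError:
        pass  # first time this machine is seen

    # Write to a temporary file and rename it into place, so that Nagios
    # never reads a half-written configuration file.
    fd, tmp_path = tempfile.mkstemp(dir=conf_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        tmp.write(content)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file as 0600
    os.replace(tmp_path, path)

    restart()
    return True
```

In practice the restart callback would wrap whatever mechanism restarts Nagios on the monitoring server, for example a call to its init script. Passing it in as a parameter keeps the sketch independent of that choice.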

Mcollective

The machines that are to be monitored are queried through an mcollective plugin. The plugin will query each server that is to be monitored and will write a Nagios configuration file for that server.
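As a rough illustration of the configuration file the plugin would write, the sketch below renders a Nagios host definition from a host name and an address. It is written in Python for brevity, although an mcollective agent itself would be written in Ruby. The function name render_host is hypothetical, and the generic-host template name is an assumption about the Nagios installation on the monitoring server.

```python
def render_host(host_name, address, template="generic-host"):
    """Return a Nagios 'define host' block for one monitored server.

    The template name is an assumption; it must match a host template
    that already exists in the Nagios configuration.
    """
    directives = [
        ("use", template),
        ("host_name", host_name),
        ("alias", host_name),
        ("address", address),
    ]
    # Pad the directive names so the values line up in the output.
    body = "\n".join(f"    {key:<12}{value}" for key, value in directives)
    return "define host {\n" + body + "\n}\n"


if __name__ == "__main__":
    print(render_host("node01", "10.0.0.5"))
```

A real plugin would also need to emit service definitions for whatever is checked on the host. Those are left out here because the list of services to monitor is still an open action item.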

Collectd

Collectd is a tool for collecting system statistics. There are two pieces to this part of the implementation: make collectd-nagios work with mcollective, and provide a small CGI program that displays graphs showing the history of an interface.
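To make the collectd-nagios piece more concrete, the sketch below builds the command line for one check. The helper name collectd_check_argv and the example values are hypothetical. The socket path is an assumption about where collectd's unixsock plugin is configured to listen, and that plugin must be enabled for collectd-nagios to work at all.

```python
def collectd_check_argv(host, value_spec, warning, critical,
                        socket_path="/var/run/collectd-unixsock"):
    """Build the argument list for one collectd-nagios check.

    socket_path is an assumption; use the path set for collectd's
    unixsock plugin. warning and critical are Nagios-style ranges.
    """
    return [
        "collectd-nagios",
        "-s", socket_path,    # UNIX socket opened by collectd
        "-H", host,           # host name as known to collectd
        "-n", value_spec,     # plugin and type, e.g. "cpu-0/cpu-user"
        "-w", str(warning),   # outside this range: WARNING
        "-c", str(critical),  # outside this range: CRITICAL
    ]


if __name__ == "__main__":
    print(" ".join(collectd_check_argv("node01", "cpu-0/cpu-user", "0:80", "0:95")))
```

Like any Nagios plugin, collectd-nagios reports its result through the exit status (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN), so the resulting command can be used directly in a Nagios command definition.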

BoF agenda and discussion

    The default right now is Nagios; this might change eventually.

    An mcollective plugin will talk to Nagios.

Action Items
[] Write an mcollective plugin for Nagios: TODO
[] Change Nagios to do the triggering: TODO
[] Determine which services to monitor: TODO


CategorySpec

ServerTeam/Spec/ServerMonitoring (last edited 2011-06-10 01:00:17 by ip67-152-3-162)