ServerMonitoring
Launchpad Entry: server-o-improve-monitoring
Created: 2011-05-20
Contributors: ChuckShort
Packages affected: nagios, mcollective, collectd
Summary
Release Note
Oneiric includes improved monitoring support integrated into Orchestra.
Rationale
A systems administrator who deploys a large number of Ubuntu servers wants to manage monitoring effectively by automating the monitoring setup.
User stories
* Corey is a systems administrator who deployed a 12-node Ubuntu Cloud and wants to be notified by SMS when the Nova API stops responding to requests.
* Nigel is a systems administrator who wants to know how much CPU a particular instance uses over time. Nigel will use the collectd-nagios plugin to monitor the usage of the instance.
Assumptions
Design
Implementation
Nagios
The central component of the monitoring part of Orchestra is Nagios. Nagios will be configured to trigger a restart every time a new machine is added by Orchestra.
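One way to decide when a restart is needed is to fingerprint the directory of generated configuration files and restart only when the fingerprint changes. The following is a minimal sketch under that assumption, not the mechanism this spec commits to. The names `config_fingerprint` and `restart_if_changed` are hypothetical, and the restart action is passed in as a callable so that no particular init command is assumed.

```python
import hashlib
from pathlib import Path


def config_fingerprint(conf_dir):
    """Return a digest covering the name and contents of every *.cfg file."""
    digest = hashlib.sha256()
    for path in sorted(Path(conf_dir).glob("*.cfg")):
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def restart_if_changed(conf_dir, previous_fingerprint, restart):
    """Call restart() only if the configuration directory has changed.

    `restart` is a caller-supplied callable (hypothetical); how Nagios is
    actually restarted is left to the deployment.
    Returns the current fingerprint so the caller can store it.
    """
    current = config_fingerprint(conf_dir)
    if current != previous_fingerprint:
        restart()
    return current
```

A caller would keep the returned fingerprint and pass it back in on the next run, so that adding a machine (and therefore a new `.cfg` file) triggers exactly one restart.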
Mcollective
The machines to be monitored are queried through an mcollective plugin. The plugin queries each server that is to be monitored and writes a Nagios configuration file for it.
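The configuration-writing step could look roughly like the sketch below. Real mcollective agents are written in Ruby; Python is used here only to illustrate the shape of the generated file. The function name `render_host_definition`, the example host data, and the use of the `generic-host` and `generic-service` templates are assumptions and are not taken from this spec.

```python
def render_host_definition(host_name, address, services):
    """Render Nagios object definitions for one monitored server.

    `services` is a list of (service_description, check_command) pairs.
    The generic-host / generic-service templates are assumed to exist in
    the local Nagios configuration.
    """
    lines = [
        "define host {",
        "    use        generic-host",
        f"    host_name  {host_name}",
        f"    address    {address}",
        "}",
    ]
    for description, check_command in services:
        lines += [
            "",
            "define service {",
            "    use                  generic-service",
            f"    host_name            {host_name}",
            f"    service_description  {description}",
            f"    check_command        {check_command}",
            "}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical data, as the plugin might report it for one server.
    print(render_host_definition("node01", "10.0.0.11", [("SSH", "check_ssh")]))
```

Writing one file per server keeps adding or removing a machine down to creating or deleting a single file.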
Collectd
Collectd is a daemon that collects system statistics. There are two pieces to this part of the implementation: making collectd-nagios work with mcollective, and providing a small CGI program that displays graphs showing the history of an interface.
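For the Nagios side, a check built on collectd-nagios could be assembled as in the sketch below. The helper name `collectd_nagios_argv`, the socket path, the host name, the value identifier, and the thresholds are all illustrative assumptions. The flags used (`-s`, `-n`, `-H`, `-w`, `-c`) follow the collectd-nagios manual page.

```python
import shlex


def collectd_nagios_argv(socket_path, host, value_spec, warning, critical):
    """Build the argument list for one collectd-nagios check.

    socket_path: path to collectd's unixsock plugin socket
    host:        host name as known to collectd
    value_spec:  value identifier, e.g. "cpu-0/cpu-idle"
    warning, critical: threshold ranges in Nagios plugin range format
    """
    return [
        "collectd-nagios",
        "-s", socket_path,
        "-n", value_spec,
        "-H", host,
        "-w", warning,
        "-c", critical,
    ]


if __name__ == "__main__":
    # Hypothetical values: warn when CPU idle drops below 20, critical below 5.
    argv = collectd_nagios_argv(
        "/var/run/collectd-unixsock", "node01", "cpu-0/cpu-idle", "20:", "5:"
    )
    print(shlex.join(argv))
```

Building the command as a list rather than a string avoids quoting problems if it is later executed directly or embedded in a Nagios command definition.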
BoF agenda and discussion
Default right now is Nagios; this might change eventually.
mcollective plugin talking to Nagios.
Action Items
[] Write an mcollective plugin for Nagios: TODO
[] Change Nagios to do the triggering: TODO
[] Check to see the services to monitor: TODO
ServerTeam/Spec/ServerMonitoring (last edited 2011-06-10 01:00:17 by ip67-152-3-162)