UnifiedSystemMonitoring
Launchpad Entry: none yet
Created: Date(2007-04-24)
Contributors: BogdanButnaru
Packages affected:
See also:
Work in progress!
Summary
This specification describes the way Ubuntu should gather and record run-time measurements of itself and the machine it's running on. It also describes a set of tools and libraries to facilitate displaying this information to the user. Think of it as extending top and System Monitor with a long-term history feature.
Rationale
Modern computers have many ways of measuring their run-time properties (e.g. temperature, battery charge, processor voltage and frequency). Most operating systems also allow measuring many of their own parameters while running (e.g. resource usage and availability).
Most operating systems have tools that allow viewing some of these properties. The venerable program top is an example, as is the Gnome System Monitor or the Gnome Power Manager. Most of these tools have been written with a single purpose in mind; this is a noble thing in principle, but in this case it causes several problems.
The first problem is data availability. Most monitoring programs have been written to show the instantaneous value of some parameters, so they have few or no features for recording the data over long periods of time. In particular, a common problem is that monitoring is done only when requested, not continuously. (The Gnome Power Manager is a notable exception. However, it only remembers information until shutdown.)
The other problem is data accessibility: each program generates data separately, usually in its own format, so it is difficult for users to view the data, let alone analyze it. In particular, tools like the Gnome System Monitor's resource usage graphs are only useful for rough estimates.
Bottom line: a single (modular) "monitoring daemon" could run continuously and accumulate information in a consistent way. The process could be managed through a unified interface, and a library with a few well-chosen utilities and widgets would make tools like the Gnome System Monitor much more useful without much effort.
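As a rough illustration of the idea, here is a minimal Python sketch of a modular sampler that stores every measurement in one consistent format. Everything in it is an assumption made for illustration, not part of this specification: the `Monitor` class, the `register`, `sample_once` and `history` methods, the `samples` table layout, and the choice of SQLite as the store are all hypothetical.

```python
import sqlite3
import time
from typing import Callable, Dict, List, Optional, Tuple

# A probe is any callable that returns one numeric measurement (hypothetical type).
Probe = Callable[[], float]


class Monitor:
    """Hypothetical sampler: many probes, one shared storage format."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.db = sqlite3.connect(db_path)
        # Every metric is stored as (timestamp, metric name, value).
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS samples (ts REAL, metric TEXT, value REAL)"
        )
        self.probes: Dict[str, Probe] = {}

    def register(self, metric: str, probe: Probe) -> None:
        """Add a probe module under a metric name."""
        self.probes[metric] = probe

    def sample_once(self, now: Optional[float] = None) -> None:
        """Read every registered probe and record the values with one timestamp."""
        ts = time.time() if now is None else now
        rows = [(ts, name, float(probe())) for name, probe in self.probes.items()]
        self.db.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)
        self.db.commit()

    def history(self, metric: str, since: float) -> List[Tuple[float, float]]:
        """Return (timestamp, value) pairs for a metric, oldest first."""
        cur = self.db.execute(
            "SELECT ts, value FROM samples WHERE metric = ? AND ts >= ? ORDER BY ts",
            (metric, since),
        )
        return cur.fetchall()


if __name__ == "__main__":
    import os

    mon = Monitor()
    # os.getloadavg() is available on Unix-like systems only.
    mon.register("load1", lambda: os.getloadavg()[0])
    for _ in range(3):
        mon.sample_once()
        time.sleep(1)
    print(mon.history("load1", since=0))
```

A real daemon would call something like `sample_once` on a timer and expose `history` to viewers such as the System Monitor; because all probes share one table, a "rewind" feature only needs a single query per metric.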
Use Cases
Bogdan is intermittently annoyed by sound stuttering, lack of responsiveness and other performance issues. By the time he opens a console and runs top, or opens the System Monitor, the problem has disappeared. (These operations naturally take more time when the system is loaded.) With unified monitoring, he can simply rewind the history a bit, because the System Monitor has access to detailed logs of what happened during the last few hours.
George recently noticed the boot process takes longer, but he doesn't know exactly why. He installs bootchart, which has access to already recorded, medium-detail logs of the last few weeks, and uses bootchart's tools (customized versions of generic widgets) to investigate. He notices that his boot got slower by 20 seconds on a certain date. He then uses Synaptic's new history panel to check what updates happened before that date, and quickly has a likely suspect.
Linus wants to know if the new IO scheduler added to the kernel is really better. He can easily use the performance logs from a large number of users (properly anonymized, gathered by a [http://popcon.ubuntu.com/ Popularity Contest]-like tool) from a couple of months before and after the update, and can run more meaningful statistics than on a limited test system.
On a whim, Dexter takes a look at his computer usage logs, and notices interesting patterns in the delays when he leaves his computer unattended. He gathers already-accumulated data from other willing users, writes his doctoral thesis on the subject, and develops a smarter algorithm for turning off the display and for locking the screen. World energy usage drops by a few hundred gigawatts. The global cost of security breaches is lowered by many millions of dollars. (Note: By this time, Ubuntu is the dominant OS, in no small part due to its excellent monitoring facilities. [https://bugs.beta.launchpad.net/ubuntu/+bug/1 Bug #1] has been closed for some time.)