DiskMonitoring

Problem

From the original mail to ubuntu-devel:

  • IDE disks die, a lot.
  • Sometimes they start returning read errors long before they die completely.
  • Joe User probably doesn't have backups.

Relevant threads on ubuntu-devel:

Solution Proposal

S.M.A.R.T. appears to be a reasonable solution, and it offers two opportunities:

  • Non-interactive monitoring: Monitor the drive and notify the user when there is a problem.
  • Interactive checking: Respond to a user query about drive health.

Non-interactive monitoring

Objective

Notify the user graphically when SMART detects a problem. This should be set up without user intervention on standard installs.

Proposed Method

Create a new package, 'smart-notifier', which displays the output of the smartmontools package to the user. The package will work as follows:

  • It installs a notifier script which is triggered by smartmontools.
  • The notifier script sends a message (probably via dbus or the filesystem) to a per-user-login-session daemon.
  • The daemon then presents a graphical warning to the user.
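
If the filesystem route is chosen, the notifier script from the list above might look roughly like this. It is a sketch only, not the prototype's code: the spool directory, the JSON layout and the function names are assumptions made for illustration. It relies on smartd exporting SMARTD_DEVICE, SMARTD_FAILTYPE and SMARTD_MESSAGE to the script it runs.

```python
import json
import os
import tempfile
import time

# Assumed spool location; this page does not name one.
SPOOL_DIR = "/var/spool/smart-notifier"


def build_warning(environ):
    """Collect the fields smartd exports to the script it executes."""
    return {
        "device": environ.get("SMARTD_DEVICE", "unknown"),
        "failtype": environ.get("SMARTD_FAILTYPE", "unknown"),
        "message": environ.get("SMARTD_MESSAGE", ""),
        "timestamp": time.time(),
    }


def spool_warning(warning, spool_dir=SPOOL_DIR):
    """Write the warning atomically so a reader never sees a partial file."""
    os.makedirs(spool_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(warning, handle)
    # Assumption: world-readable so an unprivileged session daemon can
    # read it. This choice would need the security review.
    os.chmod(tmp_path, 0o644)
    final_path = tmp_path[:-len(".tmp")] + ".json"
    os.rename(tmp_path, final_path)
    return final_path


def main():
    """Entry point a real hook script would call when smartd runs it."""
    return spool_warning(build_warning(os.environ))
```

smartd could run such a script through the -M exec directive in smartd.conf. The per-user-session daemon would then watch the spool directory and display each new file.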

Completed Work

  • A prototype package is available at http://www.pakistanopensource.org/projects/cassandra/ along with a modified smartmontools package

  • The modifications to smartmontools have been discussed and refined with the Debian maintainer and are probably pending until after the sarge release
  • Sponsorship of the smart-notifier package for Debian has been discussed with ToddTroxell

Future Work

  • The extensions discussed in http://udu.wiki.ubuntu.com/SMARTMonitoring either need to be implemented within smartd or will require a design radically more complex than the prototype. There should be some discussion about this.

  • Prototype code must be re-implemented using test-driven development (hooray for Python doctests).
  • Implement spooling to store warnings if the user is offline (no good ideas yet on how to do this).
  • Security review of dbus configuration (involves communication between a root and user process)
  • KDE/Qt interface
  • I18N
  • Integrate into the default gnome/kde session (per-user-session daemon)
  • UI review + changes (GNOME HIG)
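
As one possible starting point for the doctest and spooling items above, the following sketch shows a doctested helper that picks out the warnings a user has not yet seen. The function name, the id format and the idea of tracking seen ids per user are all assumptions for illustration, not a settled design.

```python
def unseen_warnings(spooled, seen_ids):
    """Return the spooled warnings this user has not been shown yet.

    `spooled` maps a warning id to its text; `seen_ids` is the set of ids
    already displayed to this user. Results follow the sorted order of
    the ids, so timestamp-prefixed ids give oldest first.

    >>> spool = {"1109803692-hda": "hda: failing", "1109804385-hdb": "hdb: 3 errors"}
    >>> unseen_warnings(spool, {"1109803692-hda"})
    ['hdb: 3 errors']
    >>> unseen_warnings(spool, set())
    ['hda: failing', 'hdb: 3 errors']
    >>> unseen_warnings({}, set())
    []
    """
    return [spooled[key] for key in sorted(spooled) if key not in seen_ids]
```

Saved to a file, the examples in the docstring can be checked with `python -m doctest <file>`. A session daemon could call such a helper at login to show warnings that arrived while the user was offline.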

Interactive checking

A GUI app that uses smartctl as a backend, as discussed in http://udu.wiki.ubuntu.com/SMARTMonitoring, but probably simpler so that mere mortals can understand it.
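
A minimal sketch of the backend side, assuming the GUI only needs a three-way answer per drive: it runs `smartctl -H` and looks for the overall-health line that smartctl prints for ATA drives. The function names and the 'ok' / 'failing' / 'unknown' values are hypothetical, and other drive types may report health in a different form.

```python
import subprocess


def parse_health(smartctl_output):
    """Reduce `smartctl -H` output to 'ok', 'failing', or 'unknown'."""
    for line in smartctl_output.splitlines():
        if "overall-health self-assessment test result" in line:
            return "ok" if line.strip().endswith("PASSED") else "failing"
    return "unknown"


def check_drive(device):
    """Run smartctl on one device; this usually requires root."""
    # smartctl's exit status is a bitmask, so a non-zero status is not
    # treated as an error here.
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True, text=True, check=False,
    )
    return parse_health(result.stdout)
```

Keeping the parsing separate from the subprocess call means the parsing can be tested without a real drive or root access.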

Issues with http://udu.wiki.ubuntu.com/SMARTMonitoring

  • Providing historical data will probably require either a new daemon or support within smartd.
  • What are the advantages of "Captive" mode tests if exactly the same tests can be run in "non-captive" mode? With non-captive tests only, the GUI will be simpler and there will be no need to check which partitions are mounted.

Perhaps this can be done in the same package as the non-interactive monitoring.

From ToddTroxell Wed Mar 2 22:48:12 +0000 2005
From: Todd Troxell
Date: Wed, 02 Mar 2005 22:48:12 +0000
Subject: thoughts
Message-ID: <20050302224812+0000@www.ubuntulinux.org>

I think the biggest question is: do we want some generic "system alerts" applet/mechanism type of deal with a well defined way for applications to send alerts, or just a SMART error reporter? The simplest/easiest middle-ground is something that simply watches logs and reports on a list of pre-known regexes.
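
For comparison, the log-watching middle ground could be sketched as below. The patterns and labels are illustrative guesses only (real smartd and kernel messages vary by version), and the function name is hypothetical.

```python
import re

# Illustrative patterns only; a real list would need to be collected
# from actual smartd and kernel log output.
KNOWN_PATTERNS = [
    ("pending sectors", re.compile(r"Currently unreadable \(pending\) sectors")),
    ("health check failed", re.compile(r"FAILED SMART self-check", re.IGNORECASE)),
]


def match_alerts(lines, patterns=KNOWN_PATTERNS):
    """Yield (label, line) for every log line matching a known pattern."""
    for line in lines:
        for label, pattern in patterns:
            if pattern.search(line):
                yield label, line.rstrip("\n")
                break  # report each line at most once
```

Such a watcher is simple, but it only catches messages someone has already written a pattern for.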

From ToddTroxell Wed Mar 2 22:59:45 +0000 2005
From: Todd Troxell
Date: Wed, 02 Mar 2005 22:59:45 +0000
Subject: more thoughts
Message-ID: <20050302225945+0000@https://www.ubuntulinux.org>

Perhaps instead of log monitoring, we could use the -M exec directive in smartd.conf and make DBUS calls!

DiskMonitoring (last edited 2010-10-18 23:56:07 by mortar)