SystemCleanUpTool

Differences between revisions 46 and 47
Revision 46 as of 2006-07-06 20:04:13
Size: 12095
Editor: 85
Comment:
Revision 47 as of 2006-07-06 20:24:31
Size: 12234
Editor: 85
Comment: more editing, clarifiying some more items in design
Summary

This specification describes a tool that suggests to a user several ways to keep his system from becoming too full, cluttered and confusing to use over time. The tool will attempt to require as little user intervention as possible. The result should be a running system that is always kept tidy, easy and enjoyable to use.

Rationale

In due course, a once-fresh system can become cluttered with all sorts of residual content, such as leftover dotfiles that are no longer used cluttering home directories, too many installed kernels, package dependencies that are no longer needed (since the package that depended on them is no longer installed) or log files taking up precious disk space. These tend to accumulate over time, confusing the user or even eventually leaving the computer unusable, and forcing users to actively put effort into cleaning it up. There should be a solution that warns users beforehand, and suggests and performs purge operations for common leftover cruft.

Use cases

  • George has been receiving lots of audiovisual content recently from relatives overseas. He has been burning these files to DVDs, putting each file in the garbage bin after writing it to DVD. After a while, the free space in his home filesystem has dropped to the minimum allowed, but he does not notice. He is also downloading a big ISO image of the edgy desktop CD for testing. The system clean up tool detects that there is not enough free space, and pops up a desktop notification for George, suggesting that there is a large amount of data in .Trash that can be purged to make more room. George acknowledges; space is freed and the download is saved.
  • High Priority: John is a Dapper user. A kernel upgrade has recently been released to fix a security bug. After the new kernel has been installed, and before he reboots to use it, the system informs him that he has some leftover kernel packages that could be removed and asks him if he wants to remove them. If he acknowledges, all of the kernels that he no longer needs are removed, leaving him with a clean /boot containing only the currently running kernel and the newly installed one.

  • Brian is a Launchpad developer. He has several Zope instances installed for developing Launchpad, and runs the PostgreSQL database server. These applications produce a lot of log file data, especially when used for heavy development and experimentation; in this case they are usually set to maximum verbosity for debugging purposes. After a week of heavy work, the free space on his root filesystem reaches the allowed minimum. The system clean up tool detects the problem and, before Brian runs into operational problems, offers to delete some old logs and some files in /tmp that occupy most of the currently used space. He confirms the removal, and resumes his work. The tool performs the removal in the background, hunting for the biggest files first, in order to keep the system operational and free up space in the shortest possible time. The interaction with Brian is done through the desktop notification infrastructure.

Scope

  • Kernel left overs:
    1. Due to security upgrades.
    2. General bug fixes and version upgrades.
    3. Make sure never to touch a kernel package created by the user.
      • -- How can we know if it's such a package?
  • Residual packaging related content:
    1. Conffiles.
    2. Init scripts.
  • Packaging system leftovers:
    1. Contents of /var/cache/apt/archives.
    2. Orphaned files.
  • General left over content:
    1. Browser caches.
    2. Aged audiovisual content.
    3. Aged and/or large log files.
    4. Orphaned dotfiles.
    5. Large ISO files.
    6. Content of /tmp
    7. Content of /var/log

Design

Aging: The time period that has passed between the last time a file was accessed and the recorded time reference point. The reference point can be the current date and time reported by the system, or an earlier time in order to enable a more accurate aging calculation when a system has not been used over a long period of time.
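
The definition above can be illustrated with a minimal Python sketch. The function name `file_age` and the example dates are assumptions made for illustration only; the specification does not prescribe an API. Treating a file accessed after the reference point as having age zero is also an assumption.

```python
from datetime import datetime, timedelta


def file_age(last_access: datetime, reference: datetime) -> timedelta:
    """Return the age of a file relative to the reference point.

    Assumption: a file accessed after the reference point has age zero.
    """
    return max(reference - last_access, timedelta(0))


# Hypothetical example: the user was away for 30 days, so the reference
# point is the previous login rather than the current time.
now = datetime(2006, 7, 6, 20, 0)
previous_login = now - timedelta(days=30)
accessed = previous_login - timedelta(days=2)

age_vs_now = file_age(accessed, now)                   # inflated by the absence
age_vs_reference = file_age(accessed, previous_login)  # reflects actual usage
```

Measured against the current time, the file looks 32 days old; measured against the earlier reference point, it is only 2 days old.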

1. Dealing with general left over content:

A weighing algorithm needs to be developed to enable the tool to identify targets of opportunity. The following factors need to be taken into consideration when producing the weight for each file:

  1. A relative time reference point should be used for measuring the age of all files. This is to overcome the "vacation problem": when a user has not used his system for a long period of time, using the current time at his first login after the vacation would distort the weighing to include files that he accessed just before he went on vacation. This means that we need to measure actual usage time. To do so, we will record the last access time of files that are accessed at every login (for example, gdm files) and use the time stamp of this_login-1 as our new reference point.
  2. How much time passed since last access time of the file.
  3. MIME / file type.
  4. Size of file
  5. Capacity of the holding volume or the user set quota.
  6. The tool should automatically ignore specific ~/.?* directories so as not to break applications in ubuntu-desktop or their dependent packages. This should be implemented using a blacklist that excludes potentially problematic files that should never be removed. The blacklist should be easy for users to configure themselves, in case a power user wants to add more files to it to permanently prevent their removal.
  7. In order not to affect system performance too obtrusively, consideration should be given to attaching the aging measurement code to the periodic updatedb process. updatedb already affects system performance a great deal when it runs, but is still supported in Ubuntu. Either as a standalone approach or combined with the previous one, we should hold the calculation and scanning process until the system becomes idle, and build it so that it does its processing in incremental chunks, e.g. progressing a bit further each time the system is idle until all files and folders in the file system designated for clean up are covered. We should also make sure to use the fastest system call available to retrieve the file data we need for the aging and opportunity measurement. If that can only be done in C, then we would rather code it in C and provide Python bindings to access it.
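
Items 6 and 7 above could be sketched roughly as follows. This is only an illustration under stated assumptions: the blacklist entries, the function names and the chunk size are hypothetical, and idle detection is left out entirely.

```python
import fnmatch
import itertools
import os
from typing import Iterable, Iterator, List

# Hypothetical default blacklist; the specification does not define the entries.
DEFAULT_BLACKLIST = [".gnome2", ".gconf", ".ssh"]


def is_blacklisted(path: str, home: str, patterns: Iterable[str]) -> bool:
    """True if the first path component below `home` matches a blacklist pattern."""
    relative = os.path.relpath(path, home)
    top_level = relative.split(os.sep, 1)[0]
    return any(fnmatch.fnmatch(top_level, pattern) for pattern in patterns)


def iter_candidates(home: str, patterns: Iterable[str]) -> Iterator[str]:
    """Lazily yield files under `home`, skipping blacklisted top-level entries."""
    patterns = list(patterns)
    for dirpath, dirnames, filenames in os.walk(home):
        # Prune blacklisted directories in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if not is_blacklisted(os.path.join(dirpath, d), home, patterns)
        ]
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not is_blacklisted(full_path, home, patterns):
                yield full_path


def next_chunk(candidates: Iterator[str], size: int) -> List[str]:
    """Consume at most `size` files, e.g. once each time the system is idle."""
    return list(itertools.islice(candidates, size))
```

Because `iter_candidates` is a generator, the scan keeps its position between calls to `next_chunk`, so the caller can process a small batch whenever the system is idle and stop when an empty list comes back.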

2. Package left over house keeping:

  1. Offer to remove orphaned files that no longer belong to any of the packages installed on the system. Certain system configuration files that are created during installation and must not be removed should also be automatically detected and added to the blacklist.
  2. Offer to remove packages that are rarely or never used anymore and consume a substantial amount of disk space.

3. Unused left over kernels:

  1. When a new kernel is being installed by the high level packaging tools (apt, synaptic), the installing packaging tool will call the system clean up tool with a command line that instructs it to deal only with kernel clean up on that specific invocation. We should try to make the callback intelligent enough to detect whether it can use an X GUI or a text UI, to cater for people using this tool on systems that do not have X/GNOME installed.
  2. If the system clean up tool package is not installed, then instead of the actual executable script we should have a dummy executable script that returns success, so that the system clean up tool works only if installed.
  3. If installed, the system clean up tool will mark the following kernel packages for keeping:
    1. The currently running kernel. This kernel is used as the reference point: older, previously installed kernels will be marked for removal, excluding those installed manually (e.g. using dpkg -i ..), the reference point itself, and the newly installed kernel version.
    2. If the currently running kernel was in fact installed manually, this logic still holds.
  4. Check whether the user has any kernels installed other than those detected for keeping in the previous items.
  5. If he does not, do nothing.
  6. If he does, gather a list of all those kernel packages.
  7. Pop up a desktop notification to the user: "You have unused kernels installed on the system. Would you like me to clean them up?".
  8. If the user confirms, present a dialog displaying the list of kernels that was constructed in steps 3-6, in a window containing a columned table in which each row represents a kernel package:
    • Column 1: Kernel Version. (e.g. "2.6.15-25-686")
    • Column 2: Kernel Package Name. (e.g. "linux-image-2.6.15-25-686").
    • Column 3: A check box indicating if this kernel package is to be removed, or left installed. (checked->keep, unchecked->remove)

  9. All items in the list are by default unchecked (meaning that the tool will remove all kernels on the remove-list)
  10. The user can choose to keep any of the kernels on the list by checking the checkbox next to the name/version.
  11. Pressing "Commit" will determine from the check list which kernel packages are to be removed, remove them, and notify the user of success, or of an error if any issues were encountered during removal.
  12. Removal of leftover kernels should also be proposed when a user is running low on free space in his /boot file system, or in / when /boot is part of it. (This should be the last priority if /boot and / are on the same fs; we should first check for other, bigger files that can be removed.)
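
The selection logic in steps 3 to 6 above might look like the following sketch. The function name is hypothetical, and the set of manually installed packages is assumed to be supplied by the caller, since how to detect such packages is still an open question in this specification. Package names other than "linux-image-2.6.15-25-686" are made up for the example.

```python
from typing import Iterable, List


def kernels_to_offer_for_removal(
    installed: Iterable[str],
    running: str,
    newly_installed: str,
    manually_installed: Iterable[str],
) -> List[str]:
    """Return the kernel packages that are not marked for keeping.

    Kept: the running kernel (the reference point), the newly installed
    kernel, and anything the caller reports as manually installed.
    """
    keep = {running, newly_installed} | set(manually_installed)
    # Plain lexicographic sort for stable display; this is not version ordering.
    return sorted(pkg for pkg in set(installed) if pkg not in keep)


removal_list = kernels_to_offer_for_removal(
    installed=[
        "linux-image-2.6.15-23-686",
        "linux-image-2.6.15-25-686",
        "linux-image-2.6.15-26-686",
        "linux-image-2.6.15-custom",
    ],
    running="linux-image-2.6.15-25-686",
    newly_installed="linux-image-2.6.15-26-686",
    manually_installed=["linux-image-2.6.15-custom"],
)
```

An empty result corresponds to step 5 (do nothing); a non-empty result is the list shown to the user in the dialog, with every entry unchecked by default.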

4.Modes of operation:

  1. When executed as an unprivileged user, the tool should touch only the running user's home directory.
  2. When executed as a privileged user (via sudo), the tool should take care of system-wide cleanup (kernels, system folders, etc.).
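
A minimal sketch of how the two modes could be distinguished, assuming the effective user ID is the deciding factor. The function name and the returned dictionary layout are hypothetical; the system paths are the ones listed under Scope.

```python
import os
from typing import Dict, List, Union


def cleanup_scope(euid: int, home: str) -> Dict[str, Union[str, List[str]]]:
    """Choose which locations the tool may touch for the given effective UID."""
    if euid == 0:
        # Privileged run: system-wide locations named in the Scope section.
        return {
            "mode": "system",
            "roots": ["/boot", "/tmp", "/var/log", "/var/cache/apt/archives"],
        }
    # Unprivileged run: only the invoking user's home directory.
    return {"mode": "user", "roots": [home]}


# os.geteuid() is available on Unix-like systems only.
scope = cleanup_scope(os.geteuid(), os.path.expanduser("~"))
```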

5. Catching disk space events:

  1. The tool should use gnome-volume-manager to catch low disk space events. If g-v-m does not offer the flexibility to dispatch notifications according to the user's quota or a custom minimum free space setting, we should implement our own daemon for monitoring the amount of free space on a given file system.
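
If a custom monitor turns out to be necessary, its core check could be as small as the sketch below. The function names and the threshold parameter are assumptions; user quotas are not handled here, and a real daemon would also need a polling or notification loop.

```python
import shutil


def free_space_is_low(free_bytes: int, min_free_bytes: int) -> bool:
    """True when the free space has dropped below the configured minimum."""
    return free_bytes < min_free_bytes


def check_filesystem(path: str, min_free_bytes: int) -> bool:
    """Check the file system that holds `path` against a custom minimum."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return free_space_is_low(usage.free, min_free_bytes)
```

For example, `check_filesystem("/boot", 20 * 1024 * 1024)` would report whether less than 20 MiB remain on the file system holding /boot; the threshold value is purely illustrative.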

6. User Interface:

  1. After gathering all required information and building the opportunity list, the tool will present a table view to the user consisting of all the files it has identified for removal, allowing the user to confirm or reject the suggestion. Confirming removes the files; rejecting adds them to the whitelist so that they do not come up again on the opportunity list.

    ScottJamesRemnant: I was surprised, given the end-user focus of this tool, that the UI is so under-specified. Could this section be expanded somewhat? In particular

    • A list of files seems somewhat harsh; I would have expected to see something more along the lines of "31MB of historical log files, last accessed: NEVER" rather than a dump of /var/log.
    • Whitelisting by default seems wrong; why not allow the choice whether to "not do those now" and move on to another opportunity, or "never delete these"?

Implementation

  • PackageDependencyManagement will be used to achieve design item #2.

  • The weight of a file for the opportunity calculation will be weight = file size + aging factor.

  • Kernel clean up will use the packaging interface to remove the old kernels. A bash or Python script will be used to record the currently running kernel and the newly installed one.
  • PyGTK and Glade will be used for the UI development.

  • The desktop notification framework will be used to deliver the first interaction with the user prior to launching the clean up application (we will have to replace / patch the current low disk space notification available from gnome-vfs).
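
The weight formula given above (weight = file size + aging factor) leaves the units open. The following sketch assumes, purely for illustration, that size is counted in MiB and the aging factor in days; the `Candidate` type, the function names and the example paths are hypothetical.

```python
from typing import List, NamedTuple

MIB = 1024 * 1024


class Candidate(NamedTuple):
    path: str
    size_bytes: int
    age_days: float


def weight(candidate: Candidate) -> float:
    """weight = file size + aging factor, with assumed units (MiB and days)."""
    return candidate.size_bytes / MIB + candidate.age_days


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates so that the heaviest (best opportunity) comes first."""
    return sorted(candidates, key=weight, reverse=True)


ranked = rank([
    Candidate("/var/log/example.log", 30 * MIB, 200.0),
    Candidate("/home/user/example.iso", 700 * MIB, 10.0),
    Candidate("/home/user/notes.txt", 0, 3.0),
])
```

With these units the large ISO (weight 710) outranks the old log file (weight 230). Different scaling of either term would change the ordering, so the choice of units is a design decision the specification still has to make.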


CategorySpec

SystemCleanUpTool (last edited 2008-08-06 16:16:56 by localhost)