HomeUserBackup

Revision 44 as of 2006-07-06 19:34:05

Summary

This specification discusses what is needed to take the simple, concise backup application introduced in Dapper and make it a robust, finished solution to ship in Edgy Eft. The system will use desktop notifications to make sure users either back up their data or are aware of the consequences of not doing so. The application described here can form the foundation for a more advanced and complete light-dependency disaster recovery system.

Rationale

Providing an easy-to-use backup solution that's suitable for non-expert users is important. Expert users should install and/or use a more sophisticated backup system. Part of the aim is to tell the user exactly what she has to do, rather than leave it for her to think up a backup scheme, or take decisions herself about when/if to back up.

Use cases

  • John is a new Ubuntu user. He has been using his system for a week and has set up his favorite theme and desktop behavior. He already has quite a few important email messages and other bits of information stored on his Desktop. However, as a newcomer to Ubuntu he is not aware that he should make periodic backups. After he has used his machine for a week, a pop-up dialog appears telling him: "It has been a week since you installed your computer. In order to be able to restore it to the current state if data loss occurs, it's recommended that you do a backup. Would you like to do that now?". Upon confirmation, he is presented with the home-user-backup tool GUI so that he can carry out the backup procedure.
  • Rita wants to refresh the backup set she previously created. As instructed when she created the backup set, she takes the first backup CD and inserts it into the CD-ROM drive. The system detects the inserted CD and pops up a file browser window showing the contents of the CD. Rita recalls that the instructions given when she created the backup CD set said to double-click the backup file in order to restore or update backup data. She double-clicks the file, and the home-user-backup GUI appears, offering her the option to restore her files or to update the backup with new and changed files.
  • Marilize is an Ubuntu user who is worried about the safety of her data. Having used her machine for 3 days, she wishes to back up her data so that she can restore it if something goes wrong. She goes to "System" --> "Administration" --> "Backup Now". She is then instructed to insert a CD for storing the backup. After she confirms that she has inserted a CD into the drive, all her personal data is backed up to the removable medium. When the backup finishes, a pop-up dialog instructs her: "Please take out the CD, and label it 'Ubuntu Personal Backup data, dated 10-10-2006, 06:00am'". After she confirms that dialog, another dialog notifies her that in order to restore, or to add changed and new files to the backup set, she will have to insert the CD into the drive and double-click the file listed in the file browser window; the program will then guide her through the restore or backup update procedure.

  • Dan has been doing differential backups for some time. To make sure the new work he has done today gets backed up, he inserts his last differential backup CD. The system detects that a CD has been inserted into the drive and opens a file browser window showing the contents of the CD. There is only one file in the list, which the system identifies as a "home-user-backup archive file". Intuitively, Dan double-clicks the file, and the home-user-backup GUI appears, offering to either restore his files from the differential snapshot or update it with new and changed files. Dan confirms an update, which makes home-user-backup scan for new and changed files and add them to the differential backup CD. The program also makes sure there is enough free space on the medium to store the data. If there is not enough room, home-user-backup suggests to Dan that he make a new full backup now.
  • Norman has just replaced a hard disk that went bad. He wants his system settings and personal files restored. After reinstalling his system on the new hard drive, he inserts the first backup CD into the drive. The system recognizes that a CD has been inserted into the CD-ROM drive and opens a file browser window listing the CD contents, including the backup file. After double-clicking the backup file, Norman is presented with the home-user-backup restore UI, which offers to restore his files or update the backup. When Norman chooses to restore, home-user-backup starts restoring files; once it has consumed all data from the first backup CD, it instructs Norman to take out the CD and insert the next one, and likewise for each subsequent volume. The same process applies if Norman inserts a differential archive CD. At the end of the restore process his system is restored, complete with all of his personal settings, bookmarks, and home folder content as they were before the hard disk was lost.
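
The reminder in John's use case could be driven by a check along the following lines. This is only a sketch: the one-week interval comes from the use case text, while the function name, its parameters, and the idea of restarting the interval from the last backup are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# One week, as in the first use case. The spec does not fix this value.
REMINDER_INTERVAL = timedelta(days=7)


def should_remind(installed_at, last_backup_at, now):
    """Return True when a backup reminder should be shown.

    The interval is measured from the last backup if there is one,
    otherwise from the installation time (assumption, not in the spec).
    """
    reference = last_backup_at if last_backup_at is not None else installed_at
    return now - reference >= REMINDER_INTERVAL
```

A notification daemon or session start-up hook could call `should_remind` with the current time and show the dialog only when it returns True.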

Scope

This specification covers only the backup of home directories on a given machine (i.e. data and settings files), each by its owning user. Audiovisual content will also be backed up, unless it exceeds the capacity of a non-optical medium. Since HomeUserBackup already supports multi-volume backups over CDs, audiovisual content will be backed up by default in that scenario.

The following are within the scope of this specification:

  • Backup to a user-specified path on a filesystem. (This could also cater for backups to non-local media, provided they are mounted and visible as part of the filesystem tree.)
  • Multi-volume backups. (This already works with CD-ROMs in HomeUserBackup but needs testing and polish. Specifically, we need to detect the remaining free space on a multi-session CD, either by re-implementing this in Python or by using nautilus-cd-burner's C implementation, and instruct the user to start a fresh full backup if the incremental backup CD has run out of space.)
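
For the user-specified filesystem path case, the "enough room, or suggest a new full backup" decision might look like the sketch below. It assumes the target is a mounted path that `shutil.disk_usage` can query; it does not cover multi-session CDs, where the free-space detection discussed above is still needed. The function names and the `"update"` / `"new_full_backup"` labels are hypothetical.

```python
import shutil


def free_bytes_at(path):
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def suggest_action(pending_bytes, free_bytes, reserve_bytes=0):
    """Decide whether an incremental update fits on the target medium.

    Returns "update" when the pending data (plus an optional safety
    reserve) fits, otherwise "new_full_backup".
    """
    if pending_bytes + reserve_bytes <= free_bytes:
        return "update"
    return "new_full_backup"
```

Keeping the decision separate from the free-space probe means a CD-specific probe could later be swapped in without touching the decision logic.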

Other system changes this spec involves:

  • Desktop notification framework: there are newly created Python bindings for libnotify, which should be synced from Debian once their packaging is done. See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=366863

  • SystemCleanUpTool (low priority): we should integrate this into the backup tool, so that before each backup the system clean-up tool is called to check whether there is any redundant data that can be purged prior to the backup process.

Out of scope and planned future work:

  • Data mirroring.
  • Backups to non-local media via network protocols (e.g. http, sftp). In this context we need to consider the possibility of using gnome-vfs.
  • Backup scheduling algorithm. (For the next version; this will reduce user interaction with the program, which is good)
  • Backing up the list of installed packages.
  • Encryption.
  • System and all users' home backup: the system backup should be low-hanging fruit once the normal backup is implemented. We will provide an option to back up all home folders, files in /etc that differ from the ones in the corresponding package, and a list of manually installed packages (this aptitude function will hopefully be in apt soon). For restore, we extract the config files and try to reinstall the packages that were installed on the old system.
    • Perhaps this can also be extended by a special backup profile for teachers in edubuntu.
  • Offering to restore or do an incremental backup based on an inserted backup CD: when a previously created backup CD is inserted, the system should identify it as a backup CD and offer the appropriate actions for the user to choose (e.g. restore files, or do an incremental backup using the inserted backup CD's data as reference).

Design

The backup functions will be available from the system menu:

  • Applications->Accessories->Backup: allows a user to back up his personal files (e.g. /home/sivan).

The restore and incremental backup functions are based on MIME type detection of the backup file. Moving incremental backup and restore out of the main backup window allows us to tighten the workflow and slim down the main window.

The restore and incremental backup functionality will be accessible to the user by double-clicking a backup file in a file browser window. The appropriate actions for this backup file will then be offered: either restore files, or do an incremental backup using the clicked backup archive as reference. (This indirectly caters for plugged-in USB drives and inserted backup CDs, since the system pops up a file browser window with the folder contents whenever such an event occurs. We should take care to document this clearly, so that users have no doubts about how to restore or refresh a backup they previously created.)

Since a file inclusion/exclusion selector is out of scope for Edgy, we will provide a function to either back up all files in the home directory or skip a predefined set of files: temporary files (browser caches, thumbnails), music (by file extension: mp3, ogg, etc.), and videos (avi, wmv, wmf, etc.). We will never back up the trash folder. A third option will be to back up only a specified folder. We will record these details as a "profile" of the backup, which will be saved as a metadata file and added to the first backup archive.
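
The three options and the always-skipped trash folder could be expressed roughly as follows. This is a sketch under assumptions: the extension lists contain only the examples named above, the directory names for caches, thumbnails and trash are placeholders, and the JSON layout of the profile is invented for illustration, since the spec does not define the metadata format.

```python
import json
import os

# Extensions named in the spec text; the spec leaves the full list open.
MUSIC_EXTENSIONS = {".mp3", ".ogg"}
VIDEO_EXTENSIONS = {".avi", ".wmv", ".wmf"}
# Hypothetical directory names; real cache and trash locations vary.
TEMP_DIRS = {".thumbnails", "Cache"}
TRASH_DIRS = {".Trash"}

PROFILE_MODES = ("all", "skip_predefined", "folder")


def is_excluded(rel_path, skip_predefined):
    """Return True if a path (relative to the home folder) is skipped."""
    parts = rel_path.split("/")
    # The trash folder is never backed up, whatever the profile says.
    if any(part in TRASH_DIRS for part in parts):
        return True
    if not skip_predefined:
        return False
    if any(part in TEMP_DIRS for part in parts):
        return True
    extension = os.path.splitext(rel_path)[1].lower()
    return extension in MUSIC_EXTENSIONS or extension in VIDEO_EXTENSIONS


def make_profile(mode, folder=None):
    """Serialize the backup profile as JSON (assumed metadata format)."""
    if mode not in PROFILE_MODES:
        raise ValueError("unknown profile mode: %r" % (mode,))
    if mode == "folder" and not folder:
        raise ValueError("the 'folder' mode needs a folder")
    return json.dumps({"mode": mode, "folder": folder}, sort_keys=True)
```

Storing the profile with the first archive would let a later update reuse the same selection without asking the user again.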

After the backup is complete, users need to be clearly and explicitly notified that when they later wish to restore the data, or to update the backup with new and changed files, they need to insert the CD, plug in the USB drive, or pick the folder containing the backup files, and then double-click the first backup archive file.

If the user double-clicks a backup file, a dialog will be shown offering the available actions: cancel; update with new and changed files; verify the backup; restore the contents of the backup.

The restore window shows the contents of the archive in a tree view. It also allows the user to select and deselect files and to restore to a chosen root folder.
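
Assuming the dar command-line tool stays as the backend, the backup, update, verify and restore actions described in this section might map onto dar invocations as sketched below. The options used (`-c` create, `-x` extract, `-t` test, `-R` filesystem root, `-s` slice size, `-A` archive of reference) are dar's own; the helper names and all paths are hypothetical, and the sketch only builds argument lists without running anything.

```python
def dar_create(basename, root, slice_size=None, reference=None):
    """Argument list for creating an archive of `root`.

    `slice_size` (e.g. "650M") splits the archive into slices;
    `reference` is the basename of an earlier archive, which makes
    dar store only what changed since that archive.
    """
    command = ["dar", "-c", basename, "-R", root]
    if slice_size:
        command += ["-s", slice_size]
    if reference:
        command += ["-A", reference]
    return command


def dar_restore(basename, target_root):
    """Argument list for restoring an archive under `target_root`."""
    return ["dar", "-x", basename, "-R", target_root]


def dar_verify(basename):
    """Argument list for testing an archive's integrity."""
    return ["dar", "-t", basename]
```

For example, `dar_create("/media/usb/home-full", "/home/sivan", slice_size="650M")` yields a list that could be passed to `subprocess.run`. Note that dar is given the archive basename, without the slice number and extension.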

Implementation

  • dar is used for the backup creation. (Note: dar needs to be promoted to main, as this program is also meant to be included there.)
  • cdrecord is used for burning the CD-ROMs.
    • MattZimmerman: isn't there a library for this?
    • SivanGreenberg: Do you mean nautilus-cd-burner's Python lib? It's currently only usable for non-multi-session CD creation AFAICT, and it might be a good idea not to depend on it, for the sake of derivatives that don't ship this lib. If you have other suggestions, I'd be more than happy to learn about them.
    • MattZimmerman: libburn

  • Add a MIME type handler for the backup file:
    1. For master (full) backup archive: *.bam

    2. For incremental (only changes) archive: *.bai

  • If we are still using the dar command-line tool as the backend, we need to make sure we rename the file back to *.dar for dar to operate successfully, since dar automatically drops its own recognized extension and then follows the slice numbers.
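
The extension handling described in this list could be sketched as follows. It assumes the renamed files keep dar's slice layout, i.e. `<basename>.<slice>.bam` or `<basename>.<slice>.bai` in place of dar's `<basename>.<slice>.dar`; the spec does not state the exact naming, and the helper names are hypothetical.

```python
import os
import re

# Extensions proposed in the list above.
BACKUP_EXTENSIONS = {".bam": "full", ".bai": "incremental"}

# "<basename>.<slice number>", as left after stripping the extension.
_SLICE_STEM = re.compile(r"^(?P<base>.+)\.(?P<slice>\d+)$")


def archive_kind(path):
    """Return "full", "incremental", or None for a non-backup file."""
    extension = os.path.splitext(path)[1].lower()
    return BACKUP_EXTENSIONS.get(extension)


def to_dar_name(path):
    """Map a *.bam or *.bai file name to the *.dar name dar expects."""
    stem, extension = os.path.splitext(path)
    if extension.lower() not in BACKUP_EXTENSIONS:
        raise ValueError("not a backup archive: %r" % (path,))
    return stem + ".dar"


def dar_basename(dar_path):
    """Strip ".<slice>.dar" to get the basename passed to dar."""
    stem, extension = os.path.splitext(dar_path)
    match = _SLICE_STEM.match(stem)
    if extension != ".dar" or match is None:
        raise ValueError("not a dar slice name: %r" % (dar_path,))
    return match.group("base")
```

On read-only media such as a CD the rename cannot happen in place, so a real handler would likely need a symlink or a copy under the `.dar` name; that step is left out here.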

attachment:backup.png

attachment:file.png

attachment:progress.png

attachment:complete.png

attachment:handler.png

attachment:restore.png

attachment:conflict.png

attachment:backup-system.png

Code

attachment:hub-glatzor-UDSParis.glade

Data preservation and migration

  • Since the dar file format remains mostly constant and is always backward compatible, nothing needs to be done here.


CategorySpec