HomeUserBackup

Summary

This specification discusses what is needed to take the simple and concise backup application that was introduced in Dapper and make it a robust, finished solution that will ship in Edgy Eft. The system will use desktop notifications to make sure a user either backs up his data or is otherwise aware of the consequences that may arise if he does not. The application described here can form the foundation for a more advanced and complete disaster recovery system with light dependencies.

An experimental first version: UbuntuHomeBackup

Rationale

Providing an easy-to-use backup solution that's suitable for non-expert users is important. Expert users should install and/or use a more sophisticated backup system. Part of the aim is to tell the user exactly what she has to do, rather than leave it for her to think up a backup scheme, or take decisions herself about when/if to back up.

Use cases

  • John is a new Ubuntu user. He has been using his system for a week now and has managed to set up his favorite theme and desktop behavior. He already has quite a few important email messages and some other bits of information stored on his Desktop. However, as a newcomer to Ubuntu, John is not aware that he should do periodic backups. After he has used his machine for a week, a pop-up dialog appears telling him "It has been a week since you installed your computer. In order to be able to restore it to the current state if data loss occurs, it's recommended that you do a backup. Would you like to do that now?". Upon confirmation, he is presented with the home-user-backup tool GUI, so that he can carry out the backup procedure.
  • Rita wants to refresh the backup set she had previously created. As instructed when creating the backup set, she reaches for the first backup CD and inserts it into the CD-ROM drive. The system detects the inserted CD and pops up a file browser window showing the contents of the CD. Rita recalls that the additional instructions given while creating the backup CD set were to double click the backup file in order to restore or update backup data. She double clicks the file, and as a result the GUI of the home-user-backup program appears, offering her the option to restore her files, or to update the backup with new and changed files.
  • Marilize is an Ubuntu user who is worried about the safety of her data. Having used her machine for 3 days now, she wishes to back up her data in order to be able to restore it in case something goes wrong. She goes to "System" --> "Administration" --> "Backup Now". She is then instructed to insert CD media for storing the data backup. After confirming that she has inserted a CD into the drive, all her personal data is backed up to the removable medium. When finished, a pop-up dialog instructs her "Please take out the CD, and label it 'Ubuntu Personal Backup data, dated, 06:00am'". After she confirms that dialog, another dialog appears and notifies her that in order to restore or add changed and new files to the backup set, she will have to insert the CD into the drive and then double click the file listed in the file browser window, and that the program will guide her from that point through the restoration or backup update procedure.

  • Dan has been doing differential backups for some time now. To make sure the new work he has done today gets backed up, he inserts his last differential backup CD. The system detects that a CD has been inserted into the drive and opens a file browser window showing the contents of the CD. There is only one file on the list, which the system identifies as a "home-user-backup archive file". Intuitively, Dan double clicks the file, and as a result the GUI of home-user-backup appears, offering Dan the choice to either restore his files from the differential snapshot or update it with new and changed files. Dan confirms an update, which in turn makes home-user-backup scan for new and changed files and add those to the differential backup CD. The program also makes sure there is enough free space on the medium to store the data. If not enough room is available, the home user backup system will suggest to Dan that it would probably be a good idea to make a new full backup now.
  • Norman just replaced his HD that went bad, and installed a new one. He wants to have his system settings and personal files restored. After re-installing his system on the new hard drive, he inserts the first backup CD into the drive. The system recognizes that a CD has been inserted into the CD-ROM drive and opens a file browser window listing the CD contents, with the backup file on the list. After double clicking the backup file, Norman is presented with the home-user-backup restore UI, which offers to restore his files or update the backup. When Norman chooses to restore, the home-user-backup program starts to restore files, and when it has consumed all data from the first backup CD it instructs Norman to take out the inserted CD and insert the next, and likewise for each of the subsequent volumes. The same process applies if Norman inserts a differential archive CD. At the end of the restore process his system is restored, complete with all of his personal settings, bookmarks, and any home folder content from before the HD loss.
  • Maria wants to back up all her personal data to a USB hard drive. But she does not want to back up certain directories, such as a temp folder where she keeps stuff that she intends to delete, a vm folder holding her VMware virtual machines, and .wine, because she was just playing around with it and does not want to waste space on her USB drive. She selects directories to exclude simply by unselecting them in a directory view (which optionally shows all hidden directories). She is happy even though the interface only allows deselecting directories at one level, because in most cases this is enough choice for her. And if she has a directory that she only partly wants to back up, she simply deselects it in her general home folder backup and backs it up separately later, with the same one-level choice for that specific directory. For both backup variants she still enjoys the "don't backup temp, music and video data" option, which is available for both the "back up home directory" and "back up a specific directory" variants.

Scope

This specification covers only backing up all the home directories on a given machine (i.e. data and settings files) by each owning user. Audiovisual content will also be backed up, unless it exceeds the capacity of a non-optical medium. As HomeUserBackup already supports multi-volume backups over CDs, audiovisual content will be backed up in that scenario by default.

The following are within the scope of this specification:

  • Backup to a user-specified path on a filesystem. (This could cater for non local medium backup if those are mounted and visible as part of the filesystem tree.)
  • Multi-volume backups. (This already works with CD-ROMs in HomeUserBackup but needs testing and polish. Specifically, we need to either re-implement in Python or reuse nautilus-cd-burner's C implementation for detecting the remaining free space on a multi-session CD, and instruct the user to start a fresh full backup if the incremental backup CD has run out of space.)
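
The free-space decision described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names and the 5% safety margin are hypothetical, and `shutil.disk_usage` only works for a mounted filesystem path, so it does not replace the multi-session CD detection that still has to come from libburn or nautilus-cd-burner.

```python
import os
import shutil


def total_size(paths):
    """Sum the sizes in bytes of the given files, skipping any that vanished."""
    size = 0
    for path in paths:
        try:
            size += os.path.getsize(path)
        except OSError:
            # The file was removed or is unreadable, so it cannot be backed up.
            continue
    return size


def plan_incremental(changed_files, free_bytes, safety_margin=0.05):
    """Return "incremental" if the changes fit, else "new_full_backup".

    safety_margin reserves a fraction of the free space for archive overhead.
    The 5% default is an arbitrary illustrative value, not taken from the spec.
    """
    needed = total_size(changed_files)
    usable = int(free_bytes * (1.0 - safety_margin))
    return "incremental" if needed <= usable else "new_full_backup"


def free_bytes_on(path):
    """Free space of the mounted filesystem holding path (e.g. a USB drive)."""
    return shutil.disk_usage(path).free
```

For a backup to a user-specified path, a caller could pass `free_bytes_on(target_dir)` as `free_bytes`; for a CD, the value would have to come from the burning library instead.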

Other system changes this spec involves:

  • Desktop notification framework - there are newly created Python bindings for libnotify, which should be synced from Debian once their packaging is done. See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=366863 DONE

  • SystemCleanUpTool (Low priority) - We should integrate this into the backup tool, such that before each backup we will call the system clean up tool to check if there is any redundant data that can be purged prior to the backup process.
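
As a rough illustration of the reminder logic that the desktop notification would hang off, here is a minimal sketch. The one-week interval and the message text come from the John use case; the function name is hypothetical, and treating the last backup as a reset of the timer is an assumption, since backup scheduling is out of scope for this spec. Actually displaying the message via the libnotify bindings is deliberately left out.

```python
from datetime import datetime, timedelta

# "A week", as in the John use case.
REMINDER_INTERVAL = timedelta(days=7)

# Dialog text quoted from the John use case.
INSTALL_REMINDER = (
    "It has been a week since you installed your computer. In order to be "
    "able to restore it to the current state if data loss occurs, it's "
    "recommended that you do a backup. Would you like to do that now?"
)


def reminder_due(installed_at, last_backup_at, now):
    """Return True when a week has passed without a backup.

    Assumption: if a backup exists, the week is counted from that backup
    instead of from the installation date.
    """
    reference = last_backup_at if last_backup_at is not None else installed_at
    return now - reference >= REMINDER_INTERVAL
```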

Out of scope and planned future Work:

  • Data mirroring.
  • Doing backups to non-local media via network protocols (e.g. http, sftp, etc. In this context we need to consider the possibility of using gnome-vfs.)
  • Backup scheduling algorithm. (For the next version; this will reduce user interaction with the program, which is good)
  • Backing up the list of installed packages.
  • Encryption.
  • System and all users' home backup: The system backup should be low-hanging fruit once the normal backup is implemented. We will provide an option to back up all home folders, files in /etc that differ from the ones in the corresponding package, and a list of manually installed packages (this aptitude function will hopefully be in apt soon). For restore, we extract the config files and try to reinstall the packages that were installed on the old system.
    • Perhaps this can also be extended by a special backup profile for teachers in edubuntu.
  • Offer to restore or do an incremental backup based on an inserted backup CD: when a previously created backup CD is inserted, the system should identify it as a backup CD and offer the appropriate actions for the user to choose (e.g. restore files or do an incremental backup using the inserted backup CD data as reference).
  • Automated backup to hotpluggable storage upon insertion. See http://ubuntuforums.org/showthread.php?t=412793

Design

The backup functions will be available from the system menu:

  • Applications->Accessories->Backup: Will allow a user to back up his personal files (e.g. /home/sivan).

The restore and incremental backup function is based on backup file mime-type detection. Moving the incremental backups and restore out of the main backup window allows us to tighten the workflow and to slim down the main window.

The restore and incremental backup functionality will be accessible to the user following a double click on a backup file in a file browser window. The appropriate actions for this backup file will be offered to the user: either restore files or do an incremental backup using the clicked backup archive as reference. (It's worth mentioning that this will indirectly cater for plugged-in USB drives or inserted backup CDs, as the system pops up a file browser window with the folder content whenever such an event occurs. We should take care to document this clearly so a user will have no doubts about how to restore or refresh a backup he had previously created.)

Since a file inclusion/exclusion selector is out of scope for Edgy, we will provide a function to back up all files in home or to skip a predefined set of files: temporary files (browser caches, thumbnails), music (by file extension: mp3, ogg, etc.), and videos (avi, wmv, wmf, etc.). We will never back up the trash folder. A third option will be to back up only a specified folder. We will record these details as a "profile" of the backup, which will be saved as a metadata file and added to the first backup archive.
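
A minimal sketch of such a predefined skip set and profile record might look like this. The extension lists are the ones named above (and are not exhaustive, per the "etc."); the directory names `.thumbnails`, `Cache` and `.Trash`, the function names, and the JSON layout of the profile are all assumptions made for illustration, not part of the spec.

```python
import json
import os

# Extensions named in the spec text; the real lists would be longer.
MUSIC_EXTENSIONS = {".mp3", ".ogg"}
VIDEO_EXTENSIONS = {".avi", ".wmv", ".wmf"}
# Hypothetical directory names standing in for browser caches and thumbnails.
TEMP_DIRS = {".thumbnails", "Cache"}
# Hypothetical name of the trash folder inside the home directory.
TRASH_DIRS = {".Trash"}


def should_skip(relative_path, skip_media_and_temp):
    """Decide whether a path (relative to home, '/'-separated) is left out."""
    parts = relative_path.split("/")
    if any(part in TRASH_DIRS for part in parts):
        return True  # the trash folder is never backed up
    if not skip_media_and_temp:
        return False  # "back up all files in home" mode
    if any(part in TEMP_DIRS for part in parts):
        return True
    extension = os.path.splitext(relative_path)[1].lower()
    return extension in MUSIC_EXTENSIONS or extension in VIDEO_EXTENSIONS


def profile_to_json(root, skip_media_and_temp):
    """Serialize the backup "profile" so it can be stored with the archive."""
    profile = {"root": root, "skip_media_and_temp": skip_media_and_temp}
    return json.dumps(profile, sort_keys=True)
```

The "back up only a specified folder" option would then simply use a different `root` in the profile.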

After the backup is complete, users need to be clearly and explicitly notified that when they wish to restore the data in the future, or to update the backup with new and changed files, they need to insert the CD, plug in a USB drive, or just pick a folder of their choice containing backup files, and double click the first backup archive file.

If the user double clicks on a backup file, a dialog will be shown offering the various facilities that are available: cancel; update with new and changed files; verify the backup; restore the contents of the backup.
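
To illustrate how the double-click handler could decide what to offer, here is a small sketch. It assumes the *.bam and *.bai extensions listed under Implementation; the function names and the action identifiers are hypothetical, and the real handler would be registered through the desktop's mime-type system rather than by extension alone.

```python
import os

# Extensions from the Implementation section: master and incremental archives.
ARCHIVE_KINDS = {".bam": "master", ".bai": "incremental"}

# The four facilities the dialog offers, in the order given in the text.
ACTIONS = ("cancel", "update", "verify", "restore")


def classify_backup_file(path):
    """Return "master", "incremental", or None for a non-backup file."""
    extension = os.path.splitext(path)[1].lower()
    return ARCHIVE_KINDS.get(extension)


def actions_for(path):
    """Return the actions to offer when the given file is double clicked."""
    if classify_backup_file(path) is None:
        return ()
    return ACTIONS
```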

The restore window shows the content of the archive in a tree view. Furthermore it allows the user to select and unselect files and to restore to a defined root folder.
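
As a sketch of the data behind such a restore window, the following turns a flat list of archived paths into a nested tree and maps the selected entries onto a chosen restore root. The function names are hypothetical, and the listing of the archive itself (which would come from dar) is assumed to be available already as '/'-separated relative paths.

```python
import posixpath


def build_tree(paths):
    """Build a nested dict from '/'-separated paths; leaves are empty dicts."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.strip("/").split("/"):
            node = node.setdefault(part, {})
    return tree


def restore_targets(selected_paths, root):
    """Map selected archive paths to their destination under the restore root."""
    return [posixpath.join(root, path.lstrip("/")) for path in selected_paths]
```

A tree view widget could be filled by walking the nested dict, and the checkboxes would determine which paths end up in `selected_paths`.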

Implementation

  • dar is used for the backup creation. (note: need to promote dar to main as this program is also meant to be included there)
  • libburn for burning the cdroms (python bindings - pyburn)
    • package pyburn
  • add a mime type handler for your backup file:
    1. For master (full) backup archive: *.bam

    2. For incremental (only changes) archive: *.bai

    3. This backup file will actually hold metadata about the dar files it accompanies. This way, other programs that have registered the mime type will still be able to operate on backups created by hubackup, and it will be trivial for dar to process those archives seamlessly (e.g. no archive-inside-wrapper format issue that would break slicing).
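
To make the role of dar more concrete, here is a hedged sketch of how the tool might assemble a dar command line. It only builds the argument list and does not run dar. The options used (`-c` create, `-R` filesystem root, `-s` slice size, `-A` reference archive for a differential backup, `-X` filename exclusion mask, `-P` subtree to prune) are standard dar options, but the function name, the example slice size and the example masks are assumptions, and the exact invocation should be checked against the dar manual.

```python
def dar_create_command(archive_basename, root, slice_size=None,
                       reference=None, exclude_masks=(), prune_dirs=()):
    """Build the argument list for creating a dar archive.

    archive_basename: archive name without slice number or extension.
    root:             directory to back up (e.g. the user's home).
    slice_size:       e.g. "650M" to split the archive across CDs.
    reference:        basename of an earlier archive, for differential backups.
    exclude_masks:    filename masks to leave out, e.g. "*.mp3".
    prune_dirs:       subdirectories (relative to root) to leave out entirely.
    """
    command = ["dar", "-c", archive_basename, "-R", root]
    if slice_size:
        command += ["-s", slice_size]
    if reference:
        command += ["-A", reference]
    for mask in exclude_masks:
        command += ["-X", mask]
    for directory in prune_dirs:
        command += ["-P", directory]
    return command
```

The resulting list could be handed to `subprocess.run`, with the *.bam or *.bai metadata file written next to the slices that dar produces.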

[Images: backup.png, file.png, progress.png, complete.png, handler.png, restore.png, conflict.png, backup-system.png]

Code

hub-glatzor-UDSParis.glade

Data preservation and migration

  • Since the dar file format remains mostly constant and is always backward compatible, nothing needs to be done here.

Help Needed

Comments

  • RobertCarr 2006-11-15: Any thoughts on a transparent version-based backup? Through the backup interface, allow users to specify directories, and use inotify to watch files in those directories for modification and back them up to a hidden file with a timestamp appended on modify. Also create an interface where users can easily "scroll back a day" and see what the folder looked like then. So far I have completed (as a proof of concept more than anything) a Python-based implementation of the backend, but no GUI components. Additionally, allow the backup files to be stored on an external medium such as a USB flash drive. This might merit a separate specification, but I thought I would check for views here before creating a new one. Similar in idea to Apple's Time Machine in OS X Leopard.

  • NeilGreenwood 2006-11-20: The order of the Use Cases should be looked at. E.g. the second Use Case (Rita) glosses over functionality that is better described in later Use Cases. I think the order should be John, Marilize, Dan, Rita, Norman, Maria.

  • StijnHoop 2006-11-20: The 'conflict' dialog misses out on an option: restore the backup, but keep the old file (ie. I want to see what changed in my document relative to the backup, or else I'm not 100% sure I really want to overwrite the "current" version). Of course this can be done by opening the file manager and moving the current version out of the way before clicking 'restore backup' but I think this is a pretty common case.

  • jimcooncat 2007-01-12: On the second dialog box titled "Choose a backup medium", please correct the grammar "Otherwise you could loose all your data..." to "Otherwise you could lose all your data". If you wear loose pants, you could lose them when you jump.
  • BryceHarrington 2007-01-17: While out of scope presently, a use case for future work would involve taking an older spare desktop machine, install ubuntu Edgy on it, and have it operate as a dedicated backup system. When the user's laptop connects to their home network, it resyncs to the backup server. The user can verify things are backed up at any time by simply logging into the desktop system and looking at recently changed files. Later the user's laptop is lost or stolen. The user orders a new laptop, and while waiting for it to arrive can use the spare desktop backup system temporarily. When the laptop arrives, the user is able to boot it with a Ubuntu Feisty live CD and quickly launch a restore task, that uses a software manifest from the backup server to assist in determining how to provision the new laptop, and then re-copies the user's data onto the laptop, restoring its configuration to a state nearly indistinguishable from the original laptop.

  • ErikVanLinstee 2008-01-15: It would be beneficial if, instead of showing a conflict dialog for each single file, a summary dialog were presented at the end of a restore. This way a restore would always continue even when unattended. The user can then resolve all the conflicts at once, or might even be allowed to save them and resolve them later.

  • Alexjohnc3 2008-04-25: Although ErikVanLinstee's suggestion is preferable, it would be nice to have an extra option that renames the file instead of only giving you the option to delete the older version. For example, Sally is doing a backup on 2008-04-10, and when the program is about to back up abc.html, which was updated on 2008-04-05, it will result in the same file from 2007-02-20 being overwritten, but Sally wants to keep that file too. Instead, the program could give her the option to have her newer files automatically renamed. So abc.html from 2008-04-10 could be renamed abc-2.html or maybe abc-2008-04-10.html.

  • Petrb 2008-05-31: I have several suggestions to share. They are bit verbose - sorry for that.
    • "Choose a backup file" window: no backup file exist yet so user can not choose it, "Select where to save the backup" sounds better to me. There should be default for "Name", perhaps "george-ubuntubox2.2008-01-15.bam" (user, machine, date). "Always backup to a separate medium" - may be understood as "Do not use the same USB flash as the one you use for usual transfers/some other backup" - I suggest "Always backup to external medium".
    • "Creating new backup" - user may want to run more than one backup at time - display file name of backup. Unintentionally clicking the Cancel button destroys time invested in backing up so far (and perhaps leaves partially created archive?) - I suggest to replace "Cancel" button with "Pause". When user presses "Pause" then display "Backup creation is paused -Stop backuping now?" and buttons "Stop" and "Cancel".
    • "To restore or to add ..." - user's goal is to create backup. Therefore when backup creation finishes, the app should say "The backup somefile.bam is now completed" (user does not know that it is completed until the application says it) and the "To restore or to add ..." would serve as a hint on window's second line.
    • "Backup file love-me.dar" - show the date when the backup file was created
    • "Restore" dialog - we also could compare filesystem and backup contents and highlight the files that were changed since they were archived. The main thing we can ask user for decision before or immediately after restoration starts.
    • "There is a later version of...": I propose "File >blal.deb< has been changed since it was archived" - the later does not depend on understanding what "version" is and also explains how the conflict happened. And three radio buttons: a) "Do not restore the old file and keep the new (2007-08-01)", b) "Restore the old (2007-08-01) and delete the new", c) "Restore the old (2007-08-01) and rename the new to blal.deb.something" and the checkbox and OK button.

    • "System Backup" dialog: Replace "All home folders" -> "Home folders of all users" - for the later understanding of either "user" or "home folder" is enough. Also reminds the administrator that he is about to access data owned by users and therefore must be careful with the backup media.

  • Jamesishereto 2008-06-29: Another suggestion: When creating a new backup, the process finishes and the user is met with a complete progress bar and the text "Finished" underneath. The user then has two options, close with the window decoration X or click the "Cancel" button. This leaves confusion, will this cancel the completed backup or just close the window? Might be better to change the button text to Close|Exit or whatever when the backup is complete. Also, a red icon remains next to the "Verifying the backup" text which also leaves the user unsure whether the verification succeeded. This should probably also get a green tick on completion. Thanks.
  • watsoncj 2009-04-13: Just a note that the hurestore GUI in the repositories will not launch as of April 2009. Make sure you are aware of this before you start making backups with hubackup.
  • -- dylanmccall 2010-03-20 00:21:34: Déjà Dup has been doing awesome work lately, so I for one would love to see a backup implementation based off that. Firstly, and most importantly, it uses Duplicity for the actual process of backing up, instead of reinventing the wheel. This means it's easier to maintain and already battle-hardened. With that out of the way, there's still lots of room for cool features. The tool does incremental backups, it's easy and it supports encryption. The maintainer is also pretty serious about it becoming a Gnome module. Oh, it's written in Vala, too, which deserves many points! Where Déjà Dup is currently not perfect is that its user interface design prefers online backup, while casual home users probably don't... Technically, it does an awesome job backing up to disks (I've been using it for a good while now), but the user experience could be better. For example, right now I do backups to an external hard drive. When it is not attached, Déjà Dup's scheduled backup fails with a very angry error message. When I later attach the drive, it doesn't initiate the backup automatically, although it feels natural for that style of backup. (Granted, this is a tad more complicated than it looks; right now it uses Cron, and that does seem sensible). Finally, it gets confused when the mount point of my drive changes. It does record a lot of information about the location already (including UUID), so I suspect that's either a feature waiting to be implemented or my own fault.

    Sorry, this is an insanely long comment. My point is that there is now an AWESOME tool for this job, developed with longevity and common sense in mind. I would love to see talented designers and developers contribute directly to that instead of being redundant. So, please take a look :)


This application is merely one of many ways to BackupYourSystem.

HomeUserBackup (last edited 2010-03-20 00:21:34 by dylanmccall)