SimpleBackupSolution

Status

Introduction

A Simple Backup Solution for Ubuntu should provide snapshot functionality: the ability to restore a single file, a directory or a directory tree to its last known good state.

Rationale

Users expect their favorite OS to offer a sane and simple way to back up their system.

Scope and Use Cases

  • James Troup's laptop has been stolen; he wants his data back.
  • HELP! My laptop is on fire! I must restore my music files somewhere else!
  • Hey, somebody on #ubuntu told me to run sudo rm / -rf and now nothing works.
  • Matt accidentally deleted his quarterly report and wants to get back the most recent copy possible.
  • Pete deleted a very important piece of text from the product description document on Tuesday. He wants a copy of the document from Monday's backup so that he can paste that text into the current version.

Implementation Plan

  • admin can configure a backup solution for the system
    • - Recommended config or Custom config
    • - Size requirements estimation
    • - Includes: directory selection (default: /etc, /opt, /usr/local, /var)
    • - Excludes: checkboxes for common dirs and file types + textbox for regexes + max file size (default: exclude all media files (avi, mp3, wav, ogg, mpg), all files >100 MB, and /var/backup)
    • - Format: format selection and compression level selector (default: individual gzip level 9 or a tgz)
    • - Destination: local dir or remote ssh session (default: /var/backup/)
    • - Time: frequency (hourly, daily, ...), timepoint, max number of days between two full backups (default: daily backups at 03:10 and 7 days between full backups)

  • admin can see how much disk space the backups will take
  • package list is backed up
  • if the admin allows it, users' backup preferences override the admin's instructions (postponed to later versions)
  • admin can write a snapshot of a backup to CDs/DVDs with a GUI frontend that uses dar as the backend
    • - there is an option to write a full backup for any date or an increment from a given date
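The default exclusion rules from the plan above could be expressed roughly as follows. This is only a sketch: the `Excludes` class and the `is_excluded` function are hypothetical names, and reading the size limit as 100 MiB is an assumption.

```python
import posixpath
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Excludes:
    """Exclusion rules; the defaults mirror the recommended config in the plan."""
    extensions: tuple = ("avi", "mp3", "wav", "ogg", "mpg")
    max_size: int = 100 * 1024 * 1024  # assumption: the size limit read as 100 MiB
    paths: tuple = ("/var/backup",)
    regexes: tuple = ()                # admin-supplied patterns from the textbox


def is_excluded(path: str, size: int, ex: Excludes) -> bool:
    """Return True if a file with this absolute path and size must be skipped."""
    if size > ex.max_size:
        return True
    for prefix in ex.paths:
        prefix = prefix.rstrip("/")
        # Match the directory itself or anything below it, but not siblings
        # that merely share the prefix (e.g. /var/backups).
        if path == prefix or path.startswith(prefix + "/"):
            return True
    ext = posixpath.splitext(path)[1][1:].lower()
    if ext in ex.extensions:
        return True
    return any(re.search(rx, path) for rx in ex.regexes)
```

For example, `is_excluded("/home/matt/song.MP3", 10, Excludes())` is true because of the file type, while a small `/home/matt/report.odt` passes all checks.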

Note: incremental here means 'back up all files that have changed (by mtime) since the last backup'. Diffs are not involved, as they would complicate remote backups considerably and use a lot of disk space for local ones.

Timeline

  • June 22 -- final specifications, data structures, Glade GUIs submitted for usability review
  • July 29 -- Initial version of backup worker ( http://koyanet.lv/~aigarius/sbackup-0.1.1.tar.gz )

  • August 5 -- Initial version of config and restore GUIs with UI modified according to the results of usability review
  • August 6-10 -- testing and bugfixing

Functional modules

  • backup backend worker
  • command line restore utility
  • administrator backup configuration capplet
  • user backup configuration override capplet (postponed to later versions)
  • user/admin GUI restore utility
  • admin tool to write a backup snapshot to CDs/DVDs and read it from such media

The backend worker was quite complicated and required a block scheme and the definition of an internal data structure: the backup definition tree.

The command line restore utility was quite trivial and needs no further detail. Note: it should be written so that the GUI restore utility can use it as a module.

The administrator and user backup configuration capplets give me a bit of trouble with creating all the necessary UI elements in Glade. The "Backup now" button lives here (with a progress bar). There is much functional similarity between these capplets, which leads me to believe that a merge is possible.

The GUI restore utility is mostly trivial, except that it must allow selecting multiple files/directories that might or might not exist in the current directory tree. It seems that a generic tree structure will be required here.

Format of the backup target directory

/var/backup - base directory

/var/backup/.tree.cache - cache of the file structure of all backups (updated by cron jobs after all regular backups)

/var/backup/20050723.172354.aigarhome.ful/ - a full backup snapshot from 23rd July 2005

/var/backup/20050723.172354.aigarhome.ful/ver - backup directory version information (default - 1). mtime of this file is the definitive start time of the backup.

/var/backup/20050723.172354.aigarhome.ful/packages - dump of 'dpkg --get-selections'

/var/backup/20050723.172354.aigarhome.ful/tree - file structure of this backup (includes name, size, permission and mtime information)

/var/backup/20050723.172354.aigarhome.ful/excludes - structure describing all excluded files (paths, regexes, maxsize)

/var/backup/20050723.172354.aigarhome.ful/files/home/aigarius/soc/simple_backup_spec.txt.gz - recreation of the filesystem structure for the backed-up files. All files are gzipped individually. Permissions, ctime, mtime and atime are maintained. Symlinks are copied as such.

/var/backup/20050723.172354.aigarhome.ful/files.tar - if the target filesystem cannot maintain UNIX file information, a .tar of the files/ subdirectory is used instead. This is a security risk, as all users can get any file from it, so it must be used *only* when the filesystem does not enforce proper permission control anyway.

/var/backup/20050723.172354.aigarhome.ful/stats - statistics

/var/backup/20050724.070502.aigarhome.inc/ - an incremental snapshot from 24th July 2005

/var/backup/20050724.070502.aigarhome.inc/ver - same as above

/var/backup/20050724.070502.aigarhome.inc/base - symlink to the previous (base) directory. Can be *.ful/ or *.inc/

/var/backup/20050724.070502.aigarhome.inc/packages.patch - 'diff -u' of the packages list

/var/backup/20050724.070502.aigarhome.inc/tree - describes added and changed files.

/var/backup/20050724.070502.aigarhome.inc/removed - list of removed files

/var/backup/20050724.070502.aigarhome.inc/excludes - same as above

/var/backup/20050724.070502.aigarhome.inc/files/* - same as above

/var/backup/20050724.070502.aigarhome.inc/files.tar - same as above

/var/backup/20050724.070502.aigarhome.inc/stats - same as above
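As an illustration of the naming scheme above, snapshot directory names could be parsed like this. It is a sketch under assumptions: the `Snapshot` type and `parse_snapshot_name` are hypothetical, and the third field (`aigarhome` in the examples) is treated as a free-form label because the layout does not define it. Note that the mtime of the `ver` file, not the name, is the definitive start time.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# <YYYYMMDD>.<HHMMSS>.<label>.<ful|inc>, as in the directory listing above.
SNAPSHOT_RE = re.compile(r"^(\d{8}\.\d{6})\.(.+)\.(ful|inc)$")


class Snapshot(NamedTuple):
    started: datetime  # timestamp taken from the name (naive, no timezone)
    label: str         # assumption: free-form label such as a host name
    is_full: bool      # True for *.ful, False for *.inc


def parse_snapshot_name(name: str) -> Optional[Snapshot]:
    """Return a Snapshot for a snapshot directory name, or None for other entries."""
    match = SNAPSHOT_RE.match(name.rstrip("/"))
    if match is None:
        return None
    try:
        started = datetime.strptime(match.group(1), "%Y%m%d.%H%M%S")
    except ValueError:
        return None  # digits present, but not a valid date/time
    return Snapshot(started, match.group(2), match.group(3) == "ful")
```

Entries such as `.tree.cache` do not match the pattern and yield `None`, so the function can be applied directly to a listing of the base directory.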

Sequence of a backup run

  • Check that no other backup process is running - quit if one is found
  • Load basic configuration
  • Check writability of the target directory - quit if failed
  • Test if target directory can enforce permissions (make $must_tar=0) or not (make $must_tar=1)
  • Load the configuration of this backup run
  • Decide whether to do a full or an incremental backup
  • Create target directory for this snapshot, create '.../ver' file
  • If incremental, find a base directory and create '.../base' symlink
  • Initiate backup tree structure - root element is ["/", 0]. The first element is the path; the second shows whether to back up this branch (1) or not (0), or whether it is an internal node (-1).
  • Create the backup tree structure from the config - whenever a subdirectory is included or excluded, the parent of this directory is recursively expanded so that the tree contains all significant nodes. All newly created nodes inherit properties from the parent. The list of paths must be sorted alphabetically before processing so that including a parent does not override a previously defined exclusion of a child.
  • Mark backup target directory as excluded from the tree (if it is a local directory)
  • Write ".../excludes"
  • Backup package selections
  • if $must_tar then open the .tar file
  • For each leaf of the backup tree structure (leaves have the second field set to 0 or 1) that needs to be backed up
    • - Call do_backup( $path ).
    • - do_backup will return a list of all backed-up files or a patchlist. Append that to ".../tree" or ".../tree.patch"
  • if $must_tar then close .tar file
  • Write ".../stats"
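The backup tree steps in the sequence above can be sketched as follows. This is one possible reading of the description, not the worker's actual code: `Node`, `set_flag`, `leaves` and the `list_subdirs` parameter are hypothetical names, and giving a configured path that does not exist on disk a "skip" node is an assumption.

```python
import os
import posixpath

BACKUP, SKIP, INTERNAL = 1, 0, -1


class Node:
    def __init__(self, path, flag):
        self.path = path
        self.flag = flag    # 1 = back up, 0 = skip, -1 = internal node
        self.children = {}  # directory name -> Node


def real_subdirs(path):
    """List the subdirectories of path on the real filesystem."""
    try:
        return sorted(e.name for e in os.scandir(path)
                      if e.is_dir(follow_symlinks=False))
    except OSError:
        return []


def set_flag(root, path, flag, list_subdirs=real_subdirs):
    """Include (1) or exclude (0) path, expanding its parents as needed."""
    node = root
    for part in [p for p in path.split("/") if p]:
        if node.flag != INTERNAL:
            # Expand the parent: every subdirectory becomes a child that
            # inherits the parent's flag, then the parent turns internal.
            inherited = node.flag
            for name in list_subdirs(node.path):
                node.children[name] = Node(posixpath.join(node.path, name), inherited)
            node.flag = INTERNAL
        # Assumption: a configured path missing on disk starts out as SKIP.
        node = node.children.setdefault(
            part, Node(posixpath.join(node.path, part), SKIP))
    node.flag = flag
    node.children = {}  # this setting replaces anything below it


def leaves(node):
    """Yield all leaf nodes (flag 0 or 1) in tree order."""
    if node.flag != INTERNAL:
        yield node
    else:
        for child in node.children.values():
            yield from leaves(child)
```

Because setting a flag on a directory discards everything below it, processing `/home` after `/home/bob` would lose the exclusion of `/home/bob`. Sorting the configured paths first puts parents before their children, which is why the sequence requires it.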

do_backup()

  • Uses global variables: $target, $increment_timestamp, @excludes and $must_tar
  • get list of all files and directories in the $path (that have mtime greater than $increment_timestamp)
  • if $increment_timestamp then get list of all removed files and directories in the $path since $increment_timestamp
  • apply @excludes to remove unneeded files and directories from both lists
  • For each directory in the list
    • - create the directory in the $target (or .tar)
  • For each file in the list
    • - compress the file
    • - write it to the $target directory (or .tar)
  • return list of backed-up files and directories
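A reduced version of these steps might look like the sketch below. It makes several simplifying assumptions: the globals become parameters, `excludes` is a list of regular expressions, only the individual-gzip layout under `files/` is handled (no .tar mode), removed files are not tracked, ownership is not restored, and symlinks and special files are skipped.

```python
import gzip
import os
import re
import shutil


def do_backup(path, target, increment_timestamp=0, excludes=()):
    """Gzip files under path that changed after increment_timestamp into target/files/."""
    patterns = [re.compile(p) for p in excludes]

    def excluded(p):
        return any(rx.search(p) for rx in patterns)

    saved = []
    for dirpath, dirnames, filenames in os.walk(os.path.abspath(path)):
        # Prune excluded directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if not excluded(os.path.join(dirpath, d))]
        # Recreate the source directory structure below target/files/.
        # Simplification: every visited directory is created, changed or not.
        dest_dir = os.path.join(target, "files", dirpath.lstrip(os.sep))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            if excluded(src) or os.path.islink(src) or not os.path.isfile(src):
                continue
            st = os.stat(src)
            if st.st_mtime <= increment_timestamp:
                continue  # unchanged since the base backup
            dest = os.path.join(dest_dir, name + ".gz")
            with open(src, "rb") as fin, gzip.open(dest, "wb", compresslevel=9) as fout:
                shutil.copyfileobj(fin, fout)
            # Carry over permissions and timestamps from the original file.
            os.chmod(dest, st.st_mode & 0o7777)
            os.utime(dest, (st.st_atime, st.st_mtime))
            saved.append(src)
    return saved
```

With `increment_timestamp=0` every file counts as changed, which corresponds to a full backup. The sketch assumes that `target` does not lie inside `path`; in the sequence of a backup run this is guaranteed by excluding the target directory from the tree.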

Notes:

  • In later versions, when user-initiated backups become possible, permission checking will be needed in do_backup(). The backup will still have to run as root so that the target directory can be created properly.

Automatic full/incremental backups

The backup process will decide whether to do a full or an incremental backup based only on the contents of the target backup directory and the maxincrement configuration option. It works like this:

  • if no previous backups are found -> must do a full backup
  • if the last backup was a full backup
    • - if the backup is more than maxincrement days old -> must do a full backup
    • - otherwise we can do an incremental backup from that
  • if the last backup was an incremental backup
    • - search for the last full backup:
      • if the last full backup is more than maxincrement days old -> must do a full backup
      • if the last full backup is less than maxincrement days old -> can do an incremental backup based on the last incremental backup
      • if no full backup is found -> must do a full backup
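The decision rules above reduce to a small function. This is a sketch: `choose_backup_type` and its tuple-based input are hypothetical, and because the rules do not say what happens when the last full backup is exactly maxincrement days old, the sketch treats that case as incremental.

```python
from datetime import datetime, timedelta


def choose_backup_type(snapshots, now, maxincrement_days):
    """Decide between a full and an incremental backup.

    snapshots is a list of (started, is_full, name) tuples describing the
    existing backups. Returns ("full", None) or ("inc", base_name).
    """
    if not snapshots:
        return ("full", None)          # no previous backups
    ordered = sorted(snapshots, key=lambda s: s[0])
    fulls = [s for s in ordered if s[1]]
    if not fulls:
        return ("full", None)          # only increments, no full backup found
    last_full = fulls[-1]
    if now - last_full[0] > timedelta(days=maxincrement_days):
        return ("full", None)          # last full backup is too old
    # The base is the most recent snapshot, full or incremental.
    return ("inc", ordered[-1][2])
```

Both branches of the rules collapse into the same test because only the age of the last full backup matters; whether the most recent snapshot is full or incremental only determines which directory the new `base` symlink points to.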

Data Preservation and Migration

None

Packages Affected

None

User Interface Requirements

Outstanding Issues

  • Config applet is lacking a GUI to construct regularity rules for the cron jobs
  • dar cmd-line is complex.
  • During the BOF session we discussed system-level snapshotting as the only approach that allows rollback while keeping the system consistent, meaning that the whole filesystem stays in sync with the contents of /etc. We looked at the dm-snapshot target. If this can be made to work without wasting excessive amounts of space (e.g. too many extra partitions), it would be the ultimate foo: somebody could upgrade to Hoary, click 'rollback' and have their system come back exactly as it was 24 hours earlier. This would require a lot of investigation but may be worth following up as a unique selling point.

UDU BOF Agenda

UDU Pre-Work

UbuntuDownUnder/BOFs/SimpleBackupSolution (last edited 2008-08-06 16:32:56 by localhost)