VersionControlledEtc

Summary

This specification describes an enhancement that keeps an automated, versioned history of the files under /etc.

Rationale

Over time, configuration changes occur and it is easy to forget why changes were made, or what was in a config file prior to the change. It becomes even more difficult when there are multiple administrators working on a machine over a period of time.

Use cases

  1. Andrew is a systems administrator and his mail server has stopped working. Richard, another administrator, made some changes a week ago but has now gone on holiday to his residence in France. Andrew is unsure what changes Richard made to the mail configuration, but he can use the automated version history to look at the changes made before Richard went away.
  2. Karl has a box which was working and set up the way he likes. After a failed dist-upgrade his system is left in a half-configured state. He would find it beneficial to be able to roll back to just before the upgrade.

  3. Igor has a few servers upon which he has made identical bespoke modifications to /etc. He would like to be able to store these modifications centrally and then distribute them to his servers when he changes them.

Scope

Design

Implementation

So far I have tried out the following.

In /etc/:

bzr init
bzr add *
bzr commit -m "Initial Import"

Then I added the following to root's crontab:

0 * * * * cd /etc && bzr commit -v -m "Automated update at `date`" > /dev/null 2>&1
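The hourly cron job above only commits files that bzr already tracks, so files created after the initial import would be missed. Below is a minimal sketch of a wrapper that adds new files first and commits only when something has changed. The function name etc_autocommit is hypothetical, and the sketch assumes that "bzr status --short" prints nothing for a clean tree.

```shell
# Hypothetical wrapper for the hourly cron job; not part of the original
# proof of concept. The body runs in a subshell so the caller's working
# directory is left untouched.
etc_autocommit() (
    cd "${1:-/etc}" || exit 1
    # Start tracking files created since the last commit.
    bzr add --quiet || exit 1
    # Assumption: an empty short status means there is nothing to commit.
    [ -n "$(bzr status --short)" ] || exit 0
    bzr commit --quiet -m "Automated update at $(date)"
)
```

If this were saved as a script (say, a hypothetical /usr/local/sbin/etc-autocommit that calls the function), the crontab entry would call that script instead of running bzr commit directly.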

Code

Some proof-of-concept code is available via bzr from http://www.cs1ajb.staff.shef.ac.uk/bzr/auto-bzr/. You can download it with:

bzr branch http://www.cs1ajb.staff.shef.ac.uk/bzr/auto-bzr/

Data preservation and migration

Outstanding issues

  • Depending on implementation, files modified by a user could fail to be updated in VCS.
  • Automating this process at the dpkg level would be highly intrusive.
  • Is bzr the right choice? I think it is, but what do I know?!
  • Is it useful to get dpkg to update the bzr tree when it does things in /etc?
  • Can we use DPkg::Pre-Invoke and DPkg::Post-Invoke in apt to integrate with package management better?
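As a rough illustration of the last question, apt reads DPkg::Pre-Invoke and DPkg::Post-Invoke from its configuration and runs the listed commands before and after it calls dpkg. The sketch below writes such a drop-in file. The file name 90etc-vcs and the helper path /usr/local/sbin/etc-vcs-hook are assumptions for illustration only.

```shell
# Hypothetical installer for an apt drop-in; file name and helper path are
# made up for this example.
write_apt_hooks() {
    conf_dir=${1:-/etc/apt/apt.conf.d}
    helper=${2:-/usr/local/sbin/etc-vcs-hook}
    # The -x guard keeps apt working if the helper is later removed.
    cat > "$conf_dir/90etc-vcs" <<EOF
DPkg::Pre-Invoke  { "if [ -x $helper ]; then $helper pre; fi"; };
DPkg::Post-Invoke { "if [ -x $helper ]; then $helper post; fi"; };
EOF
}
```

These hooks only fire when dpkg is driven through apt; a manual dpkg -i would bypass them, which is the trade-off against patching dpkg itself.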

BoF agenda and discussion

  • I'd say it's very useful to integrate this with dpkg, so that items changed due to package upgrades are shown as being done by a different user in the logs.
  • It would be useful to make this tool as generic as possible - such that other directories can be configured to be tracked and even results of applications, such as dpkg --get-selections
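One way to track the result of a command alongside /etc is to write its output to a file inside the tracked tree before each commit. The sketch below does this for the package list. It uses dpkg-query rather than dpkg --get-selections, because the latter prints package names and selection states but no versions. The function name and the output path /etc/.package-versions are hypothetical.

```shell
# Hypothetical helper: snapshot installed packages and their versions into
# a file that the version control system can track.
snapshot_packages() {
    out=${1:-/etc/.package-versions}
    # ${Package} and ${Version} are dpkg-query format fields; the single
    # quotes stop the shell from expanding them. Sorting keeps diffs stable.
    dpkg-query -W -f='${Package}\t${Version}\n' | LC_ALL=C sort > "$out"
}
```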

Comments

AchimBohnet:

  • Polluting /etc/ with version info files/dirs is a no-no IMHO. FWIW, I use svk because it does not add .<whatever> to every dir like svn (I have not tried bzr yet)

    • JelmerVernooij: bzr only requires a control directory in the root of the working tree, so it would mean only adding /etc/.bzr

  • Pkg tracking _including version_ is important. dpkg --get-selections does contain this info

  • Support for at least one commit per apt-get install/(dist-)upgrade is IMO very important (à la apt-listchanges). One commit per installed or upgraded package would be nice to have. At the very least, there has to be a check before packages change anything that /etc is in a clean state, so that user and package changes don't mix.
    • AndrewBeresford: a) I rewrote the above item. Better? b) I don't even like one .bzr per managed tree. find and locate don't like the duplicates either (at least in the svn case ;) That's of course IMHO.

  • Unfortunately there are still automatically modified files in /etc, e.g., /etc/adjtime and /etc/mtab. IMHO such files need to be ignored and not put under version control, so an ignore-file mechanism is needed right from the start.
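With bzr, an ignore list of this kind could live in a .bzrignore file at the top of the tree, with one pattern per line. The sketch below seeds that file with the two files named above before the initial import. The function name is hypothetical. Note that ignore patterns only affect files that are not yet versioned.

```shell
# Hypothetical helper: seed the ignore list before running "bzr add".
seed_ignore_list() {
    dir=${1:-/etc}
    # One pattern per line; a name without a slash matches in any directory.
    for pattern in adjtime mtab; do
        # Append each pattern only once so the helper can be re-run safely.
        grep -qxF "$pattern" "$dir/.bzrignore" 2>/dev/null ||
            echo "$pattern" >> "$dir/.bzrignore"
    done
}
```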

AndrewBeresford:

  • AchimBohnet, I agree with your comment about polluting /etc. bzr does place files in a .bzr directory, but only in the top level of the repository. That seems a reasonable trade off to me.

  • dpkg --get-selections doesn't seem to contain any version numbers when I run it. I'm not sure that using version management to log the output of dpkg --get-selections is the best way to do things anyway. Would it not be better to get dpkg to log its actions to a file (/var/log/dpkg.log)?

  • I'm not sure I'm clear on your idea about 1 commit per multiple installed/removed package. What I'd thought up was running the bzr commit at the very end of a dpkg action and logging with a commit message like "<name of package> <version> installed". You could also run the bzr commit at the beginning of dpkg, but my concern would be the performance decrease. Does dpkg have a mechanism for running arbitrary commands at the beginning/end of a session? Something like a global preinst/postinst script that gets run for every package installed/removed? Update: This can be done through apt hooks - not quite as good as going through dpkg, but much more straightforward/less invasive than modifying dpkg to do it.
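A helper called from such apt hooks might look roughly like the sketch below. In the "pre" phase it commits any pending local edits so that /etc is clean before packages touch it. In the "post" phase it commits whatever the package run changed. This gives one commit per apt run rather than one per package, and it does not put package names or versions in the message. The function name and the message texts are hypothetical.

```shell
# Hypothetical hook helper: "etc_vcs_hook pre" before dpkg runs and
# "etc_vcs_hook post" afterwards. Assumes the directory is a bzr tree.
etc_vcs_hook() (
    case "$1" in
        pre)  msg="Uncommitted local changes found before package run" ;;
        post) msg="Changes made by package run at $(date)" ;;
        *)    echo "usage: etc_vcs_hook pre|post [dir]" >&2; exit 2 ;;
    esac
    cd "${2:-/etc}" || exit 1
    bzr add --quiet || exit 1
    # A clean tree needs no commit, which keeps the common case cheap.
    [ -n "$(bzr status --short)" ] || exit 0
    bzr commit --quiet -m "$msg"
)
```

Because the status check comes before the commit, a run that changes nothing under /etc adds no revision.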

PaulSchulz:

  • Git can also be used as the repository management tool, using the packages 'git-core', 'git-doc' and 'gitk'.

E.g., in /etc/, using /var/git/etc.git to hold the repository:

mkdir /var/git
GIT_DIR=/var/git/etc.git git init-db
GIT_DIR=/var/git/etc.git git add .
GIT_DIR=/var/git/etc.git git commit -m "Initial Import"
  • Other features:
    • Only 'selected' files are tracked for changes.
    • gitk can be used to quickly search for changes.

    • Commits and Tags can be signed.
    • Entire directory state can be stored and restored (Igor), and branching also supported, eg. Igor has to support multiple machines which generally are the same but have a few minor differences.
    • Could be used to test configurations in 'chroot' cages or virtual machines.

Huerlisi (Simon Huerlimann):

  • We're working on a set of scripts to do exactly this: managing the configuration files in /etc using an SCM. We settled on Git/Cogito to do the main work. The tool is called IsiSetup and is just now getting a home: http://www.isisetup.ch/

    [update] I've implemented the dpkg integration some of you mention. IsiSetup now creates a package branch using the package configuration when apt-get installing a package. This configuration is then fetched into /etc.

    I'm blogging about some of the concepts and implementation of IsiSetup at http://huerlisi.wordpress.com. The website for IsiSetup currently gets some love, but is still incomplete and links are broken as we move documents from our internal wiki to this site. Please mail me at simon.huerlimann@logintas.ch if you'd like to know more or have some suggestions:-)

StefanMichalowski:

  • could inotify be used to monitor changes on files in /etc for those cases where the user or some application modifies a file?
  • could dbus be used to send a message when a file is modified in /etc, which could then be captured and used to create a commit?
  • instead of using svn,cvs,bzr or any other version control system, would it be possible to mount /etc on a versioned-fs, such as one of the following fuse implementations:

TobiasHunger: Are you aware of etckeeper? That is a set of scripts that does exactly what you are describing. The debs are severely outdated in Ubuntu though.

What does etckeeper do? It integrates into apt's dpkg pre and post hooks. That is fine for me as I hardly ever use dpkg manually anyway. It versions file permissions, users and groups, etc. by using metastore (a tool which exports the additional data into a file which is then put under version control). Currently it supports mercurial and git as backends, but it should be trivial to add your favourite VCS as a backend.

I've just hacked up a bzr backend for etckeeper - see http://gitweb.samba.org/?p=jelmer/etckeeper.git;a=summary -- JelmerVernooij

etckeeper, with bzr support, is available in Ubuntu Hardy -- DanielHahler

Comments moved from ServerCandy specification

Merge updated package configuration

Does this advance also include other improvements over Debian's conffile handling? Right now, it only guarantees two versions (the old installed version and the new package version) are available to examine, and leaves many openings for screw-ups in configuration. A better way would seem to be a three-way merge from the packaged version that the running version in /etc descended from. Basically, I see some nice ideas about keeping better track of /etc, that a lot of sysadmins do on their own, but nothing about integration with the package management system, which is where the real opportunity for advance lies.
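To make the three-way merge idea concrete, here is a minimal sketch using diff3 from GNU diffutils. It assumes the old packaged version of the file can be recovered from somewhere, for example from the version history of /etc. The function name and the argument order are choices made for this example.

```shell
# Hypothetical helper: merge upstream conffile changes into a locally
# edited file and print the result on standard output.
#   $1 = file currently in /etc (with local edits)
#   $2 = packaged version the local file was derived from
#   $3 = new packaged version
merge_conffile() {
    # diff3 -m exits 0 on a clean merge and 1 when it had to leave
    # conflict markers in the output for the admin to resolve.
    diff3 -m "$1" "$2" "$3"
}
```

The caller would write the output to a temporary file and replace the live file only when the exit status is 0.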

Decision to choose bzr

What was the reason to choose bzr above other source control systems?

Was the main reason that bzr has the nice feature that there's only a single .bzr in the top level directory?

I suggest that the Svk project be considered, as it has the same feature. However, svk sits above Subversion, which has proved itself as a reliable source control system and many server admins are familiar with it.

It seems a little odd to trust a relatively brand new source control system on an OS that's going to be supported for 5 years. I would rather see something that people are familiar with, even if it doesn't have all of bzr's sexy features.

I see several reasons to choose bzr over svk:

  • Bazaar is used inside Ubuntu already
  • It has a UI similar to that of Subversion, at least as similar as svk's.
  • It is in Ubuntu's main repository so it has a higher degree of support than svk already (svk is in universe).
  • svk has a long list of dependencies, most of which aren't installed on Ubuntu by default (at least 26 packages that aren't part of a default install). bzr depends on just python and python-celementtree.
  • The fact that svk is built on top of Subversion also doesn't necessarily make it as stable as Subversion. bzr has an extensive testsuite and is more popular than svk, at least according to popcon (9953 installs of bzr, 708 of svk)

bzr requirements

(We should probably create a separate spec for the bzr implications)

I'd like to clarify a bit more just what the use cases for this are. Is it just to keep a record of previous configurations and to allow admins to manually diff and roll back?
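For the manual case, the standard bzr commands already cover most of it. The wrappers below are a sketch of what an admin might type. The function names and the ETC_DIR variable are hypothetical, while "bzr log", "bzr diff -r" and "bzr revert -r" are ordinary bzr commands.

```shell
# Hypothetical convenience wrappers; each runs in a subshell so the
# caller's working directory is not changed.
etc_history()  ( cd "${ETC_DIR:-/etc}" && bzr log "$1" )        # history of one file
etc_changes()  ( cd "${ETC_DIR:-/etc}" && bzr diff -r "$1" )    # changes since revision $1
etc_rollback() ( cd "${ETC_DIR:-/etc}" && bzr revert -r "$1" )  # restore tree to revision $1
```

For example, "etc_history postfix/main.cf" would list the commits that touched the mail configuration, and "etc_rollback 41" would restore the tree to revision 41 (the revision number here is made up).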

I think the bzr side of this is practical for dapper -- indeed I might start tracking my /etc in bzr just to see how it goes.

bzr 0.6.2 handles versioning of files, directories and symlinks. On my breezy machine there are no other file types in /etc; if we wanted to handle sockets or similar things in there that would need to be fixed. (But it is rather unclear to me how we could usefully version them; presumably the program that owns them would recreate them and anyhow they should probably be in /var or /dev.)

The main thing that would need to be fixed is to track permissions and ownership. This should not be too hard to do but will require some new code.
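Until bzr tracks this itself, one workaround is to record the metadata in a plain text file that is versioned together with everything else. The sketch below is one way to do that, assuming GNU find. The function name and the manifest name .etc-metadata are hypothetical, and restoring permissions from the manifest is left out.

```shell
# Hypothetical helper: write mode, owner, group and path of every entry
# under the tree into a manifest that can be committed with the files.
record_metadata() {
    dir=${1:-/etc}
    # Skip the bzr control directory and the manifest itself; sort by path
    # so the manifest produces stable diffs.
    ( cd "$dir" &&
      find . -mindepth 1 \( -name .bzr -o -name .etc-metadata \) -prune \
           -o -printf '%m %u %g %p\n' ) | LC_ALL=C sort -k 4 > "$dir/.etc-metadata"
}
```

Running this just before each commit would make a permission change on a file such as /etc/shadow show up as a one-line diff in the manifest.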

At the moment bzr assumes that working copy files are writable - if you try to update a file that's read-only you will get an error. This is probably a feature in typical use but one *might* want to override it for versioning of /etc. This is probably OK if the admin is going to make any reversions manually.

Hardlinked working files would probably also cause confusion but I don't see any of them in /etc either.


Also see http://joey.kitenet.net/code/etckeeper/ - a collection of tools to let /etc be stored in a git, mercurial, darcs, or bzr repository. It hooks into apt (and other package managers including yum and pacman-g2) to automatically commit changes made to /etc during package upgrades. It tracks file metadata that revision control systems do not normally support, but that is important for /etc, such as the permissions of /etc/shadow. It's quite modular and configurable, while also being simple to use if you understand the basics of working with revision control. etckeeper is available in git at git://git.kitenet.net/etckeeper, or in gitweb. It's packaged in Debian, Ubuntu, Fedora, etc.


CategorySpec

VersionControlledEtc (last edited 2010-09-23 04:17:25 by dhcp198-158)