NetworkWideUpdates
Launchpad Entry: https://launchpad.net/distros/ubuntu/+spec/network-wide-updates
Created: 2005-10-25 by MichaelVogt
Contributors: MichaelVogt,YvesJunqueira
Packages affected: python-apt
Introduction
Network Wide Updates provides a framework that lets systems fetch their software updates and new packages from a central repository.
Rationale
Network Wide Updates enables many systems on a network to get updated software packages from a central repository, and allows the clients to be managed from a central server.
Scope and Use Cases
- install security updates for the clients
- roll custom packages
- provide a way to easily see the status of the machines
- conserve bandwidth in large environments (proxy/cache)
- remote conffile management
- check/react to network availability
- remote debconf management
- remote dist-upgrades
- feedback to the administrator
- adding new entries to the sources.list (through sources.list.d)
- arbitrary scripting (with feedback)
Needs discussion
- bandwidth saving with some sort of apt-proxy
Design
The system should use a pull-based approach by default. There is a central server that all the clients report to. The server runs a network service based on xml-rpc that the clients contact on a regular schedule. Each client has an application installed that is triggered by a cron job; it reports to the server (its status, outdated packages etc.) and asks for commands. The commands are then run on the client. Commands can be things like "install, update, upgrade, sources.list.d manipulation", and possibly more in the future.
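The check-in described above can be sketched with Python's standard xmlrpc modules. This is only an illustration under stated assumptions: the method name `check_in`, the shape of the report and command dictionaries, and the in-memory `PENDING`/`REPORTS` stores are all hypothetical and not part of this spec.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical in-memory state; a real server would persist this somewhere.
PENDING = {"client-01": [{"action": "install", "package": "htop"}]}
REPORTS = {}


def check_in(client_id, report):
    """Store the client's report and hand back its queued commands."""
    REPORTS[client_id] = report
    # pop() empties the queue, so each command is delivered only once.
    return PENDING.pop(client_id, [])


def start_server(host="127.0.0.1", port=0):
    """Serve check_in over XML-RPC in a background thread (port 0 = any free port)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(check_in)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def pull_once(url, client_id, report):
    """What the cron-triggered client would do on each run: report, then fetch commands."""
    with xmlrpc.client.ServerProxy(url) as proxy:
        return proxy.check_in(client_id, report)
```

The client opens the connection and the server only answers, so nothing has to listen on the client. Calling `pull_once` a second time for the same client returns an empty list, because the queue was drained by the first call.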
This "pull" approach has some advantages over a "push" based approach. The clients need no open ports, and firewalls are not an issue because the client initiates the connection.
At a minimum, the server needs the following information:
- client identification
- version of the distro on the client
- installed packages and their versions
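A minimal sketch of how a client might assemble that information. The field names (`client_id`, `distro_version`, `packages`) are assumptions made for illustration; the package list is read with `dpkg-query`, and the parsing is kept in a separate function so it can be exercised without a Debian-based system.

```python
import subprocess

# One line per package: name, version and dpkg status, separated by tabs.
DPKG_FORMAT = "${Package}\t${Version}\t${Status}\n"


def parse_dpkg_output(text):
    """Turn dpkg-query output into {package: version}, keeping installed packages only."""
    packages = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # skip anything that is not name/version/status
        name, version, status = fields
        # Status looks like "install ok installed"; the last word is the state.
        if status.split()[-1:] == ["installed"]:
            packages[name] = version
    return packages


def installed_packages():
    """Ask dpkg for the package list (only works where dpkg-query exists)."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f", DPKG_FORMAT],
        check=True, capture_output=True, text=True,
    )
    return parse_dpkg_output(result.stdout)


def build_report(client_id, distro_version, packages):
    """Bundle the three minimum items into one structure (hypothetical field names)."""
    return {
        "client_id": client_id,
        "distro_version": distro_version,
        "packages": packages,
    }
```

Packages that were removed but still have configuration files show up in `dpkg-query -W` too, which is why the status field is checked before a package is counted as installed.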
Various actions can be scheduled on the server for the clients like upgrading or installing. Classes of machines should be definable.
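One possible way to model machine classes and scheduled actions on the server is sketched below. The `Scheduler` class and its method names are hypothetical, and the in-memory dictionaries stand in for whatever storage a real server would use.

```python
from collections import defaultdict


class Scheduler:
    """Keeps machine classes and a per-client queue of pending actions."""

    def __init__(self):
        self.classes = defaultdict(set)   # class name -> set of client ids
        self.queues = defaultdict(list)   # client id -> list of pending actions

    def add_to_class(self, class_name, client_id):
        self.classes[class_name].add(client_id)

    def schedule(self, class_name, action):
        """Queue a copy of the action for every machine currently in the class."""
        members = sorted(self.classes.get(class_name, ()))
        for client_id in members:
            self.queues[client_id].append(dict(action))
        return members

    def pending_for(self, client_id):
        """Return and clear the actions waiting for one client."""
        return self.queues.pop(client_id, [])
```

Scheduling against a class expands to per-client queues at scheduling time, so a machine added to the class later does not receive actions that were scheduled earlier. Whether that is the desired behaviour is a design question this spec leaves open.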
In addition to the pull-based approach, a "push" mechanism should be considered (e.g. for emergency upgrades). A push should just trigger a "pull" action.
Security
- Ideally, the admin transfers the server certificate to the clients manually, and the clients always use that certificate to authenticate the server, in order to avoid trojan attacks.
- Most of the work on the client will be done as an unprivileged user.
The cron job would call a maintenance script, run as root, which checks for pending actions and executes them (e.g. "install <package>").
- The server xml-rpc interface will not require root.
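Two of the points above can be sketched in Python: trusting only the manually installed server certificate, and having the root maintenance script translate server commands into a fixed set of apt-get invocations instead of running arbitrary text. The command dictionary layout, the `ALLOWED` table and the function names are assumptions for illustration only.

```python
import re
import ssl

# Debian package names: lowercase letters, digits, "+", "-" and ".",
# at least two characters, starting with a letter or digit.
PACKAGE_RE = re.compile(r"[a-z0-9][a-z0-9+.-]+")

# Hypothetical whitelist: the only things the root script is willing to run.
ALLOWED = {
    "update": ["apt-get", "update"],
    "upgrade": ["apt-get", "-y", "upgrade"],
    "install": ["apt-get", "-y", "install"],
}


def make_pinned_context(server_cert_path):
    """TLS context that trusts only the certificate copied over by the admin."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname checks and loads
    # no system CA certificates, so only the file given here is trusted.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(cafile=server_cert_path)
    return context


def build_argv(command):
    """Map a server command to an argument list, rejecting anything unexpected."""
    action = command.get("action")
    if action not in ALLOWED:
        raise ValueError("unsupported action: %r" % (action,))
    argv = list(ALLOWED[action])
    if action == "install":
        package = command.get("package", "")
        if not isinstance(package, str) or not PACKAGE_RE.fullmatch(package):
            raise ValueError("invalid package name: %r" % (package,))
        argv.append(package)
    return argv
```

The context can be passed to `xmlrpc.client.ServerProxy(url, context=...)` for an https URL. The argument list is meant for `subprocess.run(argv)` without a shell, so a package name such as `htop; rm -rf /` is rejected by the pattern and would never reach a shell in any case.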
Implementation Plan
A server that provides an xml-rpc interface must be written. A client that can connect to the server and understand the commands it receives needs to be developed.
Because of the scope of the spec, the work should be split into several milestones.
Milestone 1
- client can talk to the server, client figures out what to do (safe-upgrade, upgrade)
- safe dist-upgrades for apt (do an upgrade, see what's left, then do an install and see if there are conflicts)
- keep the installation as simple as possible, because if the client fails the system fails horribly
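The "do an upgrade, see what's left" step could look roughly like this with python-apt. It is a sketch only: it covers the first half of the safe dist-upgrade idea (it does not try installs or check for conflicts), it assumes python-apt is installed on the client, and the function names are made up for this example.

```python
def split_upgrades(upgradable, safe_changes):
    """Return (safe, held_back): what a plain upgrade handles and what is left over."""
    safe = sorted(set(upgradable) & set(safe_changes))
    held_back = sorted(set(upgradable) - set(safe_changes))
    return safe, held_back


def plan_with_apt():
    """Compute the split on a live system without installing anything."""
    import apt  # imported lazily so split_upgrades() works without python-apt

    cache = apt.Cache()
    upgradable = [pkg.name for pkg in cache if pkg.is_upgradable]
    # With dist_upgrade=False only upgrades that need no new installs
    # or removals are marked.
    cache.upgrade(dist_upgrade=False)
    safe_changes = [pkg.name for pkg in cache.get_changes()]
    cache.clear()  # drop the marks again; this function only plans
    return split_upgrades(upgradable, safe_changes)
```

The `held_back` list is what a later step would have to examine package by package, for example by marking each one for installation and looking for conflicts before committing.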
Data Preservation and Migration
Package updates should be tested on a machine before the actual network-wide deployment. In the general case, an update cannot be rolled back, because the {pre,post}inst scripts of a package generally work only in the upgrade direction, not in the downgrade direction (e.g. when a database file is converted to a new format during an upgrade, a downgrade leaves the old version with an unreadable database file).
Packages Affected
One of the apt-cacher/apt-proxy packages is likely to be used for storing the packages on the server. ssh-server is needed on the clients to make it possible for the server to connect to the clients. The push should happen with a command line application. We may think about writing a front-end in python-gtk for it.
Even more flexible, and scaling down well to small or ad-hoc networks or even offline (media-carrying) networks (https://wiki.ubuntu.com/OfflineUpdateSpec): the apt file cache (/var/cache/apt) on one machine can be used to distribute the files to other machines on the network. Maybe apt's cache could even be made a valid repository by default. In the meantime the "apt-move" package can be used to do that. With this, you test-install the updates with your preferred tool or commands on that machine (letting it generate a repository) and then use it as a repository in the /etc/apt/sources.list files on all other machines in your network. --ceg
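Pointing the other machines at such a repository comes down to one sources.list line per machine. A small sketch of generating it as a sources.list.d fragment follows; the host name, suite and fragment name in the usage note are placeholders, and the helper names are hypothetical.

```python
import os
import re

# apt only reads fragments ending in ".list" whose names consist of
# letters, digits, underscore, hyphen and period.
FRAGMENT_NAME_RE = re.compile(r"[A-Za-z0-9_.-]+")


def render_source_line(uri, suite, components):
    """Build a one-line 'deb' entry; fields must not contain whitespace."""
    fields = [uri, suite] + list(components)
    if any(not field or re.search(r"\s", field) for field in fields):
        raise ValueError("empty field or whitespace in sources.list entry")
    return " ".join(["deb"] + fields) + "\n"


def fragment_path(name, directory="/etc/apt/sources.list.d"):
    """Return the path of the fragment file, refusing names apt would ignore."""
    if not FRAGMENT_NAME_RE.fullmatch(name):
        raise ValueError("invalid fragment name: %r" % (name,))
    return os.path.join(directory, name + ".list")


def write_fragment(name, line, directory="/etc/apt/sources.list.d"):
    """Write the entry to its own file so sources.list itself stays untouched."""
    path = fragment_path(name, directory)
    with open(path, "w") as handle:
        handle.write(line)
    return path
```

For example, `render_source_line("http://updates.example.lan/ubuntu", "dapper", ["main"])` yields `deb http://updates.example.lan/ubuntu dapper main`, which `write_fragment("nwu-local", ...)` would store as `nwu-local.list`.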
User Interface Requirements
Since we target network administrators, a command line UI should be sufficient. An optional pygtk interface may be useful. Additionally, a webmin-style (html) UI may be useful, but that raises some security issues and should be targeted later.
Outstanding Issues
A prototype implementation for the version discussed in UDU was done in the michael.vogt@ubuntu.com--2005/auto-pkg-update--main--0 repository at http://people.ubuntu.com/~mvo/arch/ubuntu
This needs to be totally reworked for the new design.
UPDATE: YvesJunqueira (nictuku) is working on that. Milestone 1 is not far off. See http://cetico.org/nwu
Comments
cfengine is a piece of software that accomplishes basically all the goals outlined above, and is already in Universe http://packages.ubuntu.com/dapper/admin/cfengine2. Right now it is lacking a nice UI and good apt integration, but I am already using it to keep a lab of ~50 computers up-to-date. It seems that time would be better spent improving cfengine rather than writing something from scratch, and it works very nicely.
TimoAaltonen: On the other hand, puppet is said to be a "cfengine-killer" and I think it should be investigated. A comparison of the two. Combine this with nagios, and you have a serious tool for the enterprises.
Similar projects are Bcfg2, ISconf and radmind. They are probably worth investigating.
The Andalusian Education Department is actually doing something like this: http://www.juntadeandalucia.es/averroes/guadalinex/ (sorry, the page is only in Spanish). They are pushing updates to 110,000 desktops in some 1100 schools/servers. If interested, please email me. -JuanConde
See also the related spec: UpdateServer, which incorporates some enterprise features via LDAP. -NealMcBurnett
NetworkWideUpdates (last edited 2008-08-06 16:38:49 by localhost)