ReplacementInit


Summary

Replace the init daemon from the sysvinit package with a modern event-based system that is better able to guarantee a robust boot process and deal with the events from the modern kernel and removable hardware.

Rationale

The move to the 2.6 kernel and all the "hotplug" goodness that it provides has left us with several problems in dapper. Because the kernel can support hardware coming and going, and due to the increase in removable hardware, it's no longer possible to guarantee that particular devices are available at a particular point in the boot process.

My usual example is that dapper cannot mount USB disks in /etc/fstab because it is not guaranteed that the block device exists at the point in the mount process where that happens.

Another example is that of a network-mounted /usr; the network device needs to be detected, firmware loaded if necessary, any security layer on the connection negotiated and an IP address arranged before the NFS mount can occur. There are work-arounds for this, such as the one in dapper, which sleeps in the boot process until /usr is mounted, but they are hacky and an elegant solution is desired.

There are many other reasons to replace the init system, described in the use cases below. The specified design is intended to be able to fulfil the most important ones for edgy and be extended to support the rest during future release cycles.

Use cases

  • Fabian is a power user who wishes to use a USB disk for part of his filesystem. At present this frequently fails because the USB disk sometimes takes longer to initialise than the boot process takes to reach the point where it mounts the filesystem. He would rather the boot process was robust, and the disk was mounted once initialised.
  • Corey is the administrator of a number of servers, and has problems with certain daemons that frequently crash. He would prefer the daemons to be automatically restarted if this happens, to avoid loss of service.
  • Orli owns an iPod and uses a popular piece of software to download podcasts onto it. He currently has to start the software when he plugs his iPod in, and remember to stop it afterwards. He would rather the system started and stopped the software automatically based on the presence of his iPod.
  • Ethan is a software developer. He has a script that he wishes to run hourly, provided that the script is not still running from before. He would rather the task scheduler could take care of that for him, than have to reinvent a lock around the task.
  • Ryan is a database administrator. He wishes the database to be automatically backed up whenever the server is shut down, whether for upgrade or system reboot. There is currently no way for him to set a task to be run when a service is stopped.
  • Justin is an ordinary user with a low-end system. He would rather services and hardware handlers were started only when needed, rather than on all systems.
  • David is a system administrator. He needs to be able to tell which services failed to start on boot, examine why, and see which services are currently running.
  • Thomas is a system administrator. He frequently gets frustrated that there is no consistency to how tasks are added to the system. A script to perform a task at shutdown must be written and activated completely differently to one performed when the system is shut down.
  • Englebert is a security consultant. He has discovered several problems with processes that run task scripts not providing a consistent environment, including potential problems such as leaving file descriptors open.
  • Hugo is an ordinary user and has to frequently reboot his computer. He would prefer that shutting down and booting up took as little time as possible.
  • Sayid is a UNIX user with many years of experience. He does not wish to have to relearn that which he has learned already, and would rather continue using the tools that he is used to and only learn the newer ones when necessary.
  • Matthieu is a distribution developer who maintains several packages that provide services or perform tasks. He does not want to have to update his packages until he is ready to take advantage of new features or abilities; his existing scripts should continue to work unmodified in their original locations.

Scope

While this specification proposes a new init system, it is not expected that any other services need to be modified immediately; backwards compatibility should be ensured. This limits the affected parts of the distribution to just a replacement for sysvinit and, if there is time, initscripts.

Also, while the eventual design includes the potential for replacing cron, at, inetd, etc. with the single daemon, this is not a goal for the edgy release.

This limitation of scope should make the goal attainable in the necessary time frame.

Design

As the primary focus of this specification is dealing with modern hardware and its "coming and going" nature, neither of the two traditional designs of init systems is appropriate. The linear execution model fails because it becomes necessary to sleep and wait during the process for hardware to be available, and the dependency-based model fails because services cause their dependencies to be started, rather than get started because their dependencies have been.

This design is best described as an event-based init system; services and tasks are started and stopped because an event they were listening for occurs. Services waiting for /usr to be mounted are started once that event has occurred and are stopped when there's a need to unmount /usr again. The event that causes /usr to be mounted would be the necessary block device appearing, or generated when the root-filesystem is mounted read-write (another event) if there is no separate partition.

In order to allow for the maximum flexibility, the init daemon does not restrict the set of events that can be triggered; external processes are permitted to trigger events that the daemon was not previously aware of. Simple events are therefore just a string that describes them, e.g. "startup"; events with a value are also permitted, so that "default-route" can be "up" or "down" depending on whether a default route is set or not. A service or task can then indicate it should be run while default-route is up, causing it to be automatically stopped before the network goes away.
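The event model described above can be illustrated with a short Python sketch. This is only an illustration of the idea, not the daemon's implementation (which is to be written in C); the names `Event`, `EventRegistry`, `trigger` and `holds` are assumptions made for this example and do not come from the specification.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Event:
    """An event is a free-form name, optionally carrying a value."""
    name: str
    value: Optional[str] = None


class EventRegistry:
    """Hypothetical helper: remembers the latest value seen for each event."""

    def __init__(self) -> None:
        self._latest: Dict[str, Optional[str]] = {}

    def trigger(self, event: Event) -> None:
        # No fixed list of known events: a name never seen before is accepted.
        self._latest[event.name] = event.value

    def holds(self, name: str, value: Optional[str] = None) -> bool:
        """True if `name` has been triggered and its latest value is `value`."""
        return name in self._latest and self._latest[name] == value


registry = EventRegistry()
registry.trigger(Event("startup"))                 # simple event, no value
registry.trigger(Event("default-route", "up"))     # event with a value
route_up_first = registry.holds("default-route", "up")

registry.trigger(Event("default-route", "down"))   # the route goes away
route_up_after = registry.holds("default-route", "up")
```

A service declared to run "while default-route is up" would be running while `holds("default-route", "up")` is true and stopped once it becomes false.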

The set of services and tasks is also not restricted by the init daemon, and may also be registered by external processes, including by non-root users (they'll be started as the registering user). This allows for future compatibility with other init systems by having a small utility to parse their configuration files and register the events with the daemon with the same semantics.

All services waiting on an event are normally started at the same time. This may not always be desirable, so services are also permitted to depend on others having previously started; if that has not yet happened, they are held until the dependencies are running or any event causing the service to stop again occurs. Services can indicate that they wish to be started if anything depending on them is waiting for them (just another event), thus providing a form of dependency-init functionality as well.
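A minimal sketch of the "held until dependencies are running" behaviour follows, in Python for brevity. The state names (`waiting`, `held`, `running`), the `Service` fields, the service names and the event names are all hypothetical, chosen for this example only. It does not model starting a dependency on demand, nor stopping dependents when a dependency stops.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Service:
    name: str
    start_on: str                      # event that asks for the service to start
    stop_on: Optional[str] = None      # event that stops it again, if any
    depends: List[str] = field(default_factory=list)
    state: str = "waiting"             # waiting -> held -> running


def release_held(services: Dict[str, Service]) -> None:
    """Start every held service whose dependencies are all running."""
    progressed = True
    while progressed:                  # repeat: one start may unblock another
        progressed = False
        for svc in services.values():
            deps_running = all(services[d].state == "running" for d in svc.depends)
            if svc.state == "held" and deps_running:
                svc.state = "running"
                progressed = True


def handle_event(services: Dict[str, Service], event: str) -> None:
    """Apply one event to every registered service."""
    for svc in services.values():
        if svc.state != "waiting" and svc.stop_on == event:
            svc.state = "waiting"      # a stop event also cancels a held service
        elif svc.state == "waiting" and svc.start_on == event:
            svc.state = "held"         # wants to start, pending dependencies
    release_held(services)


def snapshot(services: Dict[str, Service]) -> Dict[str, str]:
    return {name: svc.state for name, svc in services.items()}


services = {s.name: s for s in [
    Service("logger", start_on="usr-mounted"),
    Service("database", start_on="usr-mounted", stop_on="usr-unmounting",
            depends=["logger"]),
    Service("webapp", start_on="usr-mounted", depends=["network"]),
    Service("network", start_on="netdev-added"),
]}

handle_event(services, "usr-mounted")
after_mount = snapshot(services)       # webapp is held: network is not running
handle_event(services, "netdev-added")
after_net = snapshot(services)         # network starts, which releases webapp
handle_event(services, "usr-unmounting")
after_unmount = snapshot(services)     # database goes back to waiting
```

All three services listening for `usr-mounted` are asked to start together, but only those whose dependencies are already running actually start; the rest are held until a later event brings the dependency up.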

The init daemon's job therefore is simply to hold a list of waiting and running services and adjust their state depending on the events that are received. Full-duplex communication with the rest of userspace is maintained so that both events and services can be queried for their state, registered and triggered manually.

Implementation

Plan

Obviously this is a potentially invasive change to the system that needs to be undertaken carefully so that no regressions occur; therefore the following implementation plan will be followed:

  1. Development of the new init binary's core functionality, and testing locally and for other interested parties.

  2. Development of core companion tools such as shutdown.

  3. Replacement of the sysvinit binary package with the new package, configured to run /etc/init.d/rc at appropriate times so that no existing init scripts need be modified.

This point must be reached before FeatureFreeze with no regressions, or the change will be reverted and deferred to edgy+1.

  4. Replacement of the initscripts binary package and the scripts therein with new scripts that take advantage of the new system. The existing init scripts from other packages will still be run by keeping /etc/init.d/rc.

Further plans will wait until edgy+1, and any spare time will be spent on testing and bug fixes rather than attempting to implement additional things that may not be as mature.

Code

The core init daemon and companion tools are to be written in C and be as safe as is humanly possible. It is suggested that the code be reviewed by multiple people such as MartinPitt, both for security and for the general benefit of fresh eyes on the code.

Data preservation and migration

No other packages need to be modified because the existing /etc/init.d/rc script will be retained; the new daemon will be configured to call this with the appropriate arguments at startup, shutdown and reboot. Run levels will be maintained through compatibility configuration such that init 3 would issue an event causing /etc/init.d/rc 3 to be run.
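The compatibility mapping could be sketched as follows. This is an assumption-laden illustration rather than the planned configuration format: the event names `runlevel-N`, `shutdown` and `reboot` and the function `compat_command` are invented for this example. The only details taken from existing practice are the /etc/init.d/rc script itself and the conventional sysvinit meaning of runlevels 0 (halt) and 6 (reboot). The function only builds the command line; it does not execute anything.

```python
from typing import List, Optional

RC_SCRIPT = "/etc/init.d/rc"
VALID_LEVELS = {"0", "1", "2", "3", "4", "5", "6", "S"}


def compat_command(event: str) -> Optional[List[str]]:
    """Return the rc command line for a compatibility event, or None.

    Event names here are hypothetical; the specification does not fix them.
    """
    if event == "shutdown":
        return [RC_SCRIPT, "0"]        # runlevel 0 is halt under sysvinit
    if event == "reboot":
        return [RC_SCRIPT, "6"]        # runlevel 6 is reboot under sysvinit
    prefix = "runlevel-"
    if event.startswith(prefix):
        level = event[len(prefix):]
        if level in VALID_LEVELS:
            return [RC_SCRIPT, level]  # e.g. "init 3" -> /etc/init.d/rc 3
    return None                        # not a compatibility event


runlevel_cmd = compat_command("runlevel-3")
reboot_cmd = compat_command("reboot")
other_cmd = compat_command("default-route")
```

Under this scheme, running `init 3` would amount to triggering the `runlevel-3` event, and the daemon would then run the returned command exactly as sysvinit does today.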

Packages for which there is an advantage to using the features of the new system may be modified, though that is not part of this specification.


CategorySpec

ReplacementInit (last edited 2011-01-19 05:30:15 by 109-170-137-116)