NetbootManagement
Launchpad Entry: https://launchpad.net/distros/ubuntu/+spec/cluster-installation
Created: Date(2005-10-27T21:14:37Z) by JaneWeideman
Contributors: JaneWeideman, ReinhardTartler, IvanKrstic
Packages affected:
Depends: ConfigurationInfrastructure, AuthenticationInfrastructure | NetworkAuthentication
Should be renamed to MassInstallationInfrastructure
Summary
We want to enable users to easily perform mass installations of Ubuntu on a pool of machines by developing a console tool to intelligently manage dhcpd and syslinux configuration. We provide a GUI frontend for Edubuntu and similar environments. A tiny daemon doing housekeeping is also required, but contains very simple functionality and doesn't require root privileges.
N.B. We use the term 'cluster' to mean 'a pool of machines'. While this could be a pool of cluster nodes, it doesn't have to be. A more accurate name for the spec would be UbuntuMassInstallation.
Rationale
We already support fully automated installation with d-i using preseeding (and/or kickstart). Let's put all parts together so they are easy to use even for less experienced administrators, and provide a reasonable interface to them.
Use cases
- Andreas is running an Internet cafe on several thick clients. He doesn't want to use the LTSP functionality in Breezy, but wants to install Ubuntu on each machine individually, and be able to trivially reinstall any machine.
- Kathrin runs a high-performance computing (HPC) cluster at her university. She needs to be able to install Ubuntu on 300 compute nodes, which are contributed by five departments. Each department's nodes form a named group. Kathrin wants to have each node register itself with the resource manager upon finishing installation. She also wants nodes to stage certain files from the central server locally. The whole pool, or any named group, should be trivial to reinstall by performing the appropriate action in our management tool (in the console or the GUI frontend), and then simply rebooting the nodes.
- Reinhard is a teacher at Kathrin's university. He knows that his department's named group in Kathrin's cluster (composed of sixty machines) is currently idle, as there are no jobs submitted. So he wants to use the machines as LTSP thin clients with an LTSP server running on Ubuntu on the same network. Kathrin should be able to select Reinhard's named group, set it to boot into LTSP on the next boot, and then reboot all the machines in the group.
Design
The management tool we discuss is called 'nmt' (netboot management tool). It is able to set the next boot policy for each controlled node, or a named group of nodes. Initially, all nodes belong to the 'unknown' group.
A boot policy is a simple specification of the file that gets sent to a client that is requesting a PXE boot. Example policies include:
- boot as a regular thick client from the local disk
- boot into the (preseeded/kickstarted) Ubuntu unattended installer
- restore a system image (re-image the machine)
- save a system image (snapshot the machine)
- boot as an LTSP thin client
nmt has the first four policies built-in. The LTSP policy would be shipped by the Ubuntu LTSP package. The tool should also support initiating the reboot remotely, but this functionality will likely be handled by the ConfigurationInfrastructure spec.
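A boot policy, as described above, can be pictured as a named record pointing at the file served over PXE. The following is a minimal sketch only: the policy identifiers, the file paths and the `register_policy` helper are hypothetical and are not defined by this spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootPolicy:
    name: str      # hypothetical identifier used by nmt
    pxe_file: str  # hypothetical path, relative to the TFTP root
    builtin: bool  # True for policies shipped with nmt itself


# The four built-in policies named in the spec. All names and paths are
# assumptions made for this illustration.
POLICIES = {
    p.name: p
    for p in [
        BootPolicy("local-boot", "policies/local-boot.cfg", True),
        BootPolicy("install", "policies/install.cfg", True),
        BootPolicy("restore-image", "policies/restore-image.cfg", True),
        BootPolicy("save-image", "policies/save-image.cfg", True),
    ]
}


def register_policy(policy: BootPolicy) -> None:
    """Let another package (for example LTSP) add a policy of its own."""
    if policy.name in POLICIES:
        raise ValueError(f"policy already registered: {policy.name}")
    POLICIES[policy.name] = policy


def builtin_policies() -> list:
    """Return the names of the policies shipped with nmt, sorted."""
    return sorted(name for name, p in POLICIES.items() if p.builtin)


# The LTSP policy would be registered by the Ubuntu LTSP package.
register_policy(BootPolicy("ltsp", "policies/ltsp.cfg", False))
```

Keeping the LTSP policy out of the built-in table mirrors the split described above, where nmt ships four policies and other packages contribute their own.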
Built-in groups are:
unknown: A machine that has never been seen before is automatically placed in the unknown group. As soon as nmtd detects the machine in the dhcpd log files, it assigns the machine to this group.
local boot: Machines in this group boot from the local disk. The group's policy is immutable, and the group is only available to machines that have previously been installed by the unattended Ubuntu installer, itself controlled by nmt.
All computers supposed to be controlled by nmt would be set to PXE boot. We operate under the assumption that once turned on, it would be impractical to turn PXE booting off on a per-machine basis (as is certainly the case with, for example, computing clusters). This is why we have to provide a method to boot a client from the local disk, even though it's attempting a PXE boot from the server.
The only time a machine that is attempting a PXE boot should boot from its local disk is when it is a thick client machine (possibly an HPC compute node) that has previously had Ubuntu installed on it through our automated preseed/kickstart installer. After consulting with ColinWatson, we decided that the automated installer, after finishing the stage1 install, should send a notification to the installation server that specifies its root device. The installation server keeps a mapping of MAC addresses to root devices for all automatically installed machines. Upon first receiving such a notification for a machine that was previously in the unknown nmt group, the installation server automatically removes the machine from the unknown group and places it into the built-in local boot group.
The 'boot to local disk' policy hence depends on the root device mapping, and allows us to serve via PXE a syslinux image which simply chains to the bootloader on the root device specified in the mapping.
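As a rough illustration of the 'boot to local disk' policy, the sketch below renders a per-machine PXELINUX configuration that chains to the recorded root device using the syslinux `chain.c32` module. The function names are hypothetical, and the translation from a Linux device name to a BIOS disk number assumes the disk letter matches the BIOS disk order, which is not guaranteed on real hardware.

```python
import re


def pxelinux_config_name(mac: str) -> str:
    """Per-client file name that PXELINUX looks up under pxelinux.cfg/.

    PXELINUX uses the ARP hardware type (01 for Ethernet) followed by the
    MAC address in lower case, with dashes as separators.
    """
    return "01-" + mac.lower().replace(":", "-")


def chain_args(root_device: str) -> str:
    """Translate e.g. /dev/hda1 into chain.c32 arguments ('hd0 1').

    Assumption: disk letter 'a' is BIOS disk 0, 'b' is disk 1, and so on.
    """
    m = re.fullmatch(r"/dev/[hs]d([a-z])(\d+)?", root_device)
    if m is None:
        raise ValueError(f"unsupported root device: {root_device}")
    disk = ord(m.group(1)) - ord("a")
    partition = m.group(2)
    return f"hd{disk} {partition}" if partition else f"hd{disk}"


def local_boot_config(root_device: str) -> str:
    """Render a PXELINUX config that chains to the local bootloader."""
    lines = [
        "DEFAULT local",
        "PROMPT 0",
        "LABEL local",
        "  KERNEL chain.c32",
        f"  APPEND {chain_args(root_device)}",
    ]
    return "\n".join(lines) + "\n"
```

nmt would write the rendered text to the file named by `pxelinux_config_name` for each machine in the local boot group; a machine without a mapping entry cannot be given this policy.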
The notification at the end of the first stage of the installer is received on the installation server by a tiny daemon called nmtd. nmtd runs as non-root, and its sole purposes are to receive stage1 installation notifications and parse the dhcpd logs in real-time to provide nmt with up-to-date information.
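The log-parsing half of nmtd could look roughly like the sketch below. It assumes ISC dhcpd syslog lines of the form `DHCPDISCOVER from <mac> via <interface>` and `DHCPACK on <ip> to <mac> via <interface>`; the exact format can vary between dhcpd versions and syslog configurations, and `scan_dhcpd_log` is a hypothetical helper name.

```python
import re

MAC = r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})"
DISCOVER_RE = re.compile(r"DHCPDISCOVER from " + MAC, re.IGNORECASE)
ACK_RE = re.compile(
    r"DHCPACK on (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) to " + MAC, re.IGNORECASE
)


def scan_dhcpd_log(lines):
    """Return {mac: last acknowledged IP, or None if only seen discovering}."""
    seen = {}
    for line in lines:
        ack = ACK_RE.search(line)
        if ack:
            seen[ack.group("mac").lower()] = ack.group("ip")
            continue
        discover = DISCOVER_RE.search(line)
        if discover:
            # Do not overwrite an IP learned from an earlier DHCPACK.
            seen.setdefault(discover.group("mac").lower(), None)
    return seen
```

Any MAC address returned here that nmt does not yet know about would be placed in the 'unknown' group. Reading the log does not need root, provided the nmtd user is allowed to read the log file.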
nmt is a CLI tool; a GUI frontend is available for less experienced system administrators, who want to avoid dropping into the shell to configure things.
The tool supports the following actions:
- Assign a name to a machine
- Assign a machine to a group
- Set the netboot policy on next boot for a machine or a group
- List all known machines, or all machines belonging to a group
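The four actions above map naturally onto subcommands. The sketch below uses Python's `argparse`; the subcommand names (`name`, `assign`, `policy`, `list`) and option names are assumptions made for illustration, since the spec does not fix a command-line syntax.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical command-line interface for nmt."""
    parser = argparse.ArgumentParser(prog="nmt")
    sub = parser.add_subparsers(dest="action", required=True)

    p = sub.add_parser("name", help="assign a name to a machine")
    p.add_argument("mac")
    p.add_argument("name")

    p = sub.add_parser("assign", help="assign a machine to a group")
    p.add_argument("mac")
    p.add_argument("group")

    p = sub.add_parser("policy", help="set the netboot policy for next boot")
    target = p.add_mutually_exclusive_group(required=True)
    target.add_argument("--machine", metavar="MAC")
    target.add_argument("--group")
    p.add_argument("policy")

    p = sub.add_parser("list", help="list known machines")
    p.add_argument("--group", help="only list machines in this group")

    return parser
```

With this layout, `nmt policy --group chinstrap install` would set the next-boot policy of a whole group, and `nmt list --group chinstrap` would list its members.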
Addressing the use cases
Here we explain how the tools we will build, nmt and nmtd, address each of our use cases.
- Andreas, the internet cafe owner from our first use case, will install his main Ubuntu server, and load the nmt GUI tool. He will set the default policy for the 'unknown' group to 'boot the unattended Ubuntu installer'. When he applies the nmt configuration, he can boot the rest of the computers in his internet cafe. Because of the 'unknown' group policy, the machines will get Ubuntu automatically installed. After stage1 of the install, the installer on each machine will send its root device to nmtd on the server, which will automatically place the machine into the 'local boot' named group. On subsequent reboots, the auto-installed machines will be provided a PXE boot image that instructs them to boot from local disk.
- Kathrin installs her main Ubuntu server, customizes the kickstart file for her cluster to perform file staging and registration with the resource manager, and turns on her 300 compute nodes. They are installed fully automatically. Then she creates named groups for the five departments, and reassigns the appropriate machines from the 'local boot' group (where they were automatically placed at the end of the stage1 installer) to the groups she just created. She sets the default policy for all department groups to 'boot from local disk'. She has a fully operational HPC cluster.
- Kathrin places Reinhard in sudoers on her installation server, which allows him to fire up nmt and set the default boot policy for his department's group to 'boot LTSP', and back to 'boot from local disk' when he's done. Reinhard is able to easily use his department machines as thin clients when they're not being used as compute nodes.
nmt interface design
The nmt interface is a simple table:
| IP | MAC | Name | Group | Boot policy | Actions |
| <dynamic> | 00:01:02... | unknown | unassigned | <boot locally> | (reboot now!) |
| 10.2.3.4 | 00:02:03:... | unknown | chinstrap | <reinstall next boot> | (reboot now!) |
Implementation
- Define interface to dhcpd config to define netboot actions
- Define interface to create/update/edit preseed configs
- Implement a GUI tool using these interfaces, so that a local admin can register machines and define the netboot method. It takes kickstart or preseed files from the admin and enables fully (or semi-) automated installs.
Code
Both nmt and nmtd will be written in Python. They will use a SQLite database to share state. The automatic stage1 installer is modified to send a completion notification to the installation server.
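A possible shape for the shared SQLite state is sketched below using Python's standard `sqlite3` module. The table and column names are assumptions for illustration; the spec only says that the two programs share state through SQLite.

```python
import sqlite3

# Hypothetical schema: one row per group, one row per machine.
SCHEMA = """
CREATE TABLE IF NOT EXISTS machine_groups (
    name   TEXT PRIMARY KEY,
    policy TEXT              -- NULL until the admin picks a policy
);
CREATE TABLE IF NOT EXISTS machines (
    mac         TEXT PRIMARY KEY,
    ip          TEXT,
    name        TEXT,
    group_name  TEXT NOT NULL DEFAULT 'unknown'
                REFERENCES machine_groups(name),
    root_device TEXT         -- filled in by the stage1 notification
);
"""


def open_state(path=":memory:"):
    """Open (and if needed create) the state database shared by nmt and nmtd."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    # The two built-in groups from the design section.
    conn.executemany(
        "INSERT OR IGNORE INTO machine_groups (name, policy) VALUES (?, ?)",
        [("unknown", None), ("local boot", "local-boot")],
    )
    conn.commit()
    return conn


def record_seen(conn, mac, ip=None):
    """Register a machine seen in the dhcpd log; new ones land in 'unknown'."""
    mac = mac.lower()
    conn.execute("INSERT OR IGNORE INTO machines (mac) VALUES (?)", (mac,))
    if ip is not None:
        conn.execute("UPDATE machines SET ip = ? WHERE mac = ?", (ip, mac))
    conn.commit()


def machines_in_group(conn, group):
    """Return the MAC addresses of all machines in a group, sorted."""
    rows = conn.execute(
        "SELECT mac FROM machines WHERE group_name = ? ORDER BY mac", (group,)
    )
    return [row[0] for row in rows]
```

Because nmtd only needs write access to this one file, it can run as an unprivileged user, as the design requires.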
Client side post install script to update server:
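No client script is given in the spec, so the following is only a sketch of what it might do. The wire format (one line of JSON over TCP), the field names and the function names are all assumptions. The installer environment may not ship Python at all, in which case the same logic would need to be expressed in shell.

```python
import json
import socket


def build_notification(mac: str, root_device: str) -> bytes:
    """Encode the stage1 completion message as one JSON line (assumed format)."""
    message = {"mac": mac.lower(), "root_device": root_device}
    return (json.dumps(message, sort_keys=True) + "\n").encode("utf-8")


def send_notification(server: str, port: int, payload: bytes,
                      timeout: float = 10.0) -> None:
    """Send the notification to nmtd on the installation server."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(payload)
```

The payload carries exactly the two facts the server needs for its mapping: the machine's MAC address and its root device.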
Server-side daemon to listen for client update messages and dynamically update pxe boot status to allow local boot after install
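No daemon code is given in the spec either. The sketch below shows the notification half of nmtd under stated assumptions: each client sends one line of JSON with `mac` and `root_device` fields over TCP, state is held in plain dictionaries rather than the shared database, and port 9099 is an arbitrary unprivileged port chosen for illustration.

```python
import json
import socketserver

ROOT_DEVICES = {}  # MAC address -> root device reported by the installer
GROUPS = {}        # MAC address -> group name


def parse_notification(raw: bytes):
    """Decode one JSON line into (mac, root_device)."""
    message = json.loads(raw.decode("utf-8"))
    return message["mac"].lower(), message["root_device"]


def apply_notification(mac, root_device, root_devices, groups):
    """Record the root device and move the machine out of 'unknown'.

    Machines the admin has already placed in another group stay there.
    """
    root_devices[mac] = root_device
    if groups.get(mac, "unknown") == "unknown":
        groups[mac] = "local boot"
    return groups[mac]


class NotificationHandler(socketserver.StreamRequestHandler):
    def handle(self):
        try:
            mac, root_device = parse_notification(self.rfile.readline())
        except (ValueError, KeyError, TypeError, AttributeError):
            return  # ignore malformed messages
        apply_notification(mac, root_device, ROOT_DEVICES, GROUPS)


def serve(host="0.0.0.0", port=9099):
    """Run the listener; an unprivileged port keeps nmtd non-root."""
    with socketserver.TCPServer((host, port), NotificationHandler) as server:
        server.serve_forever()
```

Once a machine is in the local boot group and has a root device recorded, nmt can serve it the chaining syslinux image on its next PXE boot.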
Data preservation and migration
Outstanding issues
Needs integration with:
- LTSP (see [:LTSPHowto])
Notes
- nmt could provide an easy graphical interface to configure fixed ip addresses in the dhcpd.conf
- Eventually, we will need to provide integration with NetworkwideUpdates and NetworkAuthentication, which will both be hooked into the post-installation stage of the automatic installer.
- The admin can design a custom kickstart/preseed file. We'll generate one at the end of the installation server install, which duplicates a standard Ubuntu install using the defaults given at install time.
- The default password in the default preseed file will be random and presented in nmt.
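For the first note, configuring a fixed IP address in dhcpd.conf comes down to emitting a `host` declaration with `hardware ethernet` and `fixed-address` statements. The helper below is a hypothetical sketch of how nmt might render one; the function name and the restriction on host names are assumptions.

```python
import ipaddress
import re


def dhcpd_host_stanza(name: str, mac: str, ip: str) -> str:
    """Render a dhcpd.conf host declaration pinning a MAC to a fixed IP."""
    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9-]*", name):
        raise ValueError(f"invalid host name: {name}")
    if not re.fullmatch(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}", mac):
        raise ValueError(f"invalid MAC address: {mac}")
    ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac.lower()};\n"
        f"  fixed-address {ip};\n"
        f"}}\n"
    )
```

The GUI would only need to collect the three values and append the resulting stanza to the generated dhcpd configuration.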
ajmitch offers help with implementation. Nicolas Kassis offers a testbed for nmt.
New policy: image restoring, i.e. netbooting in order to restore a preinstalled image. Another policy could create such an image.
Need to send to main server: just where to locate the next bootloader to chain to (keep in mind having to potentially translate grub syntax for locating the bootloader)
- modify grub/lilo installers to put the final install device somewhere we can read it from
- remind cjwatson
BoF agenda and discussion
Comments
- At the moment, Edubuntu ships a default dhcpd.conf that allows LTSP clients to boot without any special configuration. Ideally, this dhcpd.conf would be replaced with a simple invocation of 'nmt' specifying that the default policy for unknown machines is 'boot LTSP'.
NetbootManagement (last edited 2008-08-06 16:24:21 by localhost)