UbuntuOnCluster


NOT FINISHED YET

This page documents the process of automatically installing Ubuntu on a cluster of machines. One machine is set up manually as the install server, and all other machines are installed automatically when they boot. The version used here is Hoary (5.04). The only installer used is the debian-installer (the default Ubuntu installer), so no tricks like FAI or kickstart are needed.

Prerequisites on the server:

Stage 1: Preparing DHCP & PXE booting

First you will have to know the MAC addresses of all machines, so that they can be given unique and constant IP addresses and hostnames. For me this was easy, since a list of them was already available.

Now install dhcp3-server and tftpd-hpa

aptitude install dhcp3-server tftpd-hpa
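Depending on the package defaults, you may also need to tell tftpd-hpa to run as a daemon and to serve files from /var/lib/tftpboot. A sketch of /etc/default/tftpd-hpa under that assumption (check the file your package actually ships):

{{{
# /etc/default/tftpd-hpa -- sketch; variable names as shipped by the package
RUN_DAEMON="yes"
OPTIONS="-l -s /var/lib/tftpboot"
}}}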

Once you have this list, you can edit /etc/dhcp3/dhcpd.conf; an example can be found [http://kaarsemaker.net/files/ubuntu-cluster here] and is self-explanatory.
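The linked file is the authoritative example; purely as an illustration of the idea, here is a sketch of a dhcpd.conf with one fixed host entry (the MAC address and the node's IP address below are made up):

{{{
subnet 192.168.0.0 netmask 255.255.255.0 {
  option routers 192.168.0.1;
  # PXE: which server and file to boot from
  next-server 192.168.0.1;
  filename "pxelinux.0";

  # one entry like this per cluster node, keyed on its MAC address
  host node01 {
    hardware ethernet 00:11:22:33:44:55;
    fixed-address 192.168.0.11;
    option host-name "node01";
  }
}
}}}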

PXE booting requires that the .iso file is mounted locally; I mounted it under /var/lib/tftpboot/ubuntu/

{{{
mkdir /var/lib/tftpboot/ubuntu
echo '/data/ubuntu-5.04-install-i386.iso /var/lib/tftpboot/ubuntu/ auto loop' >> /etc/fstab
mount -a
}}}

The next step is setting up the PXE config. I created two files: one for installing and one for booting from the local disk (i.e. booting the installed system). Create /var/lib/tftpboot/pxelinux.cfg and put these files there: [http://kaarsemaker.net/files/ubuntu-cluster/default default] [http://kaarsemaker.net/files/ubuntu-cluster/bootlocal bootlocal]
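The linked files are the real ones; as an illustration of the pxelinux config format, here is a sketch of what the two files contain (the kernel/initrd names and the preseed URL are placeholders, not the actual symlink names used on this cluster):

{{{
# pxelinux.cfg/default -- sketch: boot the installer by default
DEFAULT install
LABEL install
  KERNEL vmlinuz-installer
  APPEND initrd=initrd-installer preseed/url=http://192.168.0.1/preseed.cfg --

# pxelinux.cfg/bootlocal -- sketch: boot the already-installed system from disk
DEFAULT bootlocal
LABEL bootlocal
  LOCALBOOT 0
}}}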

As you can see, the default action is to run the installer. To save some space on the kernel command line (space is limited), create symlinks to relevant files:

{{{
cd /var/lib/tftpboot
ln -s /ubuntu/ubuntu/net-install.....somewhere etc...
}}}

You can see that in the example config files these symlinks are used.

Stage 2: Setting up NIS and NFS

On cluster machines, NIS and NFS are usually used to share login information and parts of the filesystem, so you need to install both on the server.

aptitude install nis nfs-kernel-server

Note: the nis package (not NIS itself) is quite buggy: it will try to start ypbind even though you did not tell it to, and it completely ignores preseeding, so in a later step we will create a new version of this package. For now it will do; you just have to wait a bit for ypbind to time out.

When the nis setup asks for a domain, pick one you like. As soon as ypbind times out, stop nis again with

invoke-rc.d nis stop

Now you need to edit /etc/default/nis and enable the NIS server [http://kaarsemaker.net/files/ubuntu-cluster/nis (example)]. You also need to initialize the NIS database with

/usr/lib/yp/ypinit -m
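The /etc/default/nis edit mentioned above boils down to a single setting; a sketch of the relevant line (the linked example shows the full file):

{{{
# /etc/default/nis -- act as a NIS master server instead of client-only
NISSERVER=master
}}}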

You can now start the nis services again

invoke-rc.d nis start

For NFS, you need to edit /etc/exports in order to export the required parts of the filesystem. On my cluster I chose to export /home and /data, so the exports file looks like this:

TODO
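A minimal sketch of what such an exports file could look like, assuming the 192.168.0.0/24 internal network used elsewhere on this page (the mount options are an assumption):

{{{
# /etc/exports -- sketch
/home  192.168.0.0/255.255.255.0(rw,sync)
/data  192.168.0.0/255.255.255.0(rw,sync)
}}}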

Now restart the nfs server.

invoke-rc.d nfs-kernel-server restart

Stage 3: Setting up local mirror and proxy

Installing from a local mirror and using an HTTP proxy for the rest greatly improves the speed of subsequent installs. Since apt-proxy is quite broken, I chose to use squid as a generic HTTP proxy. For the surviving-the-reboot part (Stage 5) I also needed PHP, so I installed that too. (A CGI script would have worked just fine here, but I'm more familiar with PHP.)

aptitude install apache2 libapache2-mod-php4 squid

My cluster has only one external IP address, so only the master server is connected to the internet. The other machines are connected only to the master (and can reach the net via NAT). Because I don't want to run a public proxy, I told the squid installer to only listen on eth1 (the internal interface). You can tell apache to do so too by editing /etc/apache2/ports.conf.

{{{
invoke-rc.d apache2 stop
echo 'Listen 192.168.0.1:80' > /etc/apache2/ports.conf
echo 'Listen 127.0.0.1:80' >> /etc/apache2/ports.conf
invoke-rc.d apache2 start
}}}

You should also edit the squid config. An [http://kaarsemaker.net/files/ubuntu-cluster/squid.conf example] (and [http://kaarsemaker.net/files/ubuntu-cluster/squid.conf.diff diff]) can be found on my homepage.
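The relevant changes amount to binding squid to the internal address and allowing only the cluster network to use the proxy. A sketch of those lines (the linked example and diff are authoritative; 3128 is squid's default port):

{{{
# squid.conf -- sketch of the relevant directives
http_port 192.168.0.1:3128
acl cluster src 192.168.0.0/255.255.255.0
# must appear before the final 'http_access deny all'
http_access allow cluster
}}}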

Stage 4: Preseed

Having everything in place on the server, we can now take care of the client configuration. Booting via PXE will launch the Ubuntu installer. The installer normally asks questions, but the answers can be supplied in advance in a so-called preseed file.

TODO: add preseed file
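Until the real file is added here, a minimal sketch of the preseed format, showing only the two settings referred to elsewhere on this page (the proxy and the late_command); the key names follow the debian-installer preseeding documentation, and a complete file for hoary will need many more answers:

{{{
# preseed.cfg -- sketch, not a complete preseed file
# format: <owner> <question name> <type> <value>
d-i mirror/http/proxy string http://192.168.0.1:3128/
d-i preseed/late_command string wget http://192.168.0.1/register.php
}}}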

Stage 5: Surviving the reboot

The installer reboots after the basic install, which means that the installer would be launched again. Of course you don't want this, which is why I created a small registration system. As you can see in the preseed file, preseed/late_command has been set to wget http://192.168.0.1/register.php. This does nothing on the client side, but the PHP script creates a PXE boot file for this machine which instructs it to boot from the local drive. If you want to reinstall a certain machine, all you have to do is remove the associated PXE boot file and it will use the default again.

In order for this to work, the www-data user must have write access to /var/lib/tftpboot/pxelinux.cfg

chown :www-data /var/lib/tftpboot/pxelinux.cfg
chmod g+w $_

This is the register.php script:

<?php
  /* pxelinux looks for a config file named after the client's IP address
     in uppercase hexadecimal, e.g. 192.168.0.2 -> C0A80002 */
  function _dechex($val) { return sprintf('%02X', $val); }
  $host = implode('', array_map('_dechex', explode('.', $_SERVER['REMOTE_ADDR'])));
  /* Copy is needed here, symlinks somehow won't work */
  copy('/var/lib/tftpboot/pxelinux.cfg/bootlocal', "/var/lib/tftpboot/pxelinux.cfg/$host");
?>

Stage 6: Postinstall

To be written...

CategoryDocumentation