Saturday, April 5, 2008

Ubuntu Cluster - Slave nodes

In this part we'll install and configure the slave nodes. The master node already runs a DHCP server and a PXE boot service with the netboot installer, so we just need to connect the network cable, a keyboard and a monitor to each node, turn it on and wait for the base system to install.

The installer asks how to partition the hda disk; we accept the defaults, putting the whole system in a single partition plus a swap partition of the default size.

We add the general user named beagle. After rebooting, log in, become root, set the root password and install the SSH server:


sudo su -
passwd
apt-get install openssh-server

Repeat this for all the nodes. The next steps are run only from the master node.


SSH access


Each user needs an SSH key pair to access any node without a password. First we create one for the general user beagle:

ssh-keygen
cp .ssh/id_rsa.pub .ssh/authorized_keys

Root also needs a key pair, but root's home is not exported over NFS, so after creating the keys you have to copy them to every node. This step asks for each node's root password; don't worry, it is the first and last time:

su -
ssh-keygen
cp .ssh/id_rsa.pub .ssh/authorized_keys
for NODE in `cat /etc/machines`
do
ssh $NODE mkdir -p .ssh
scp .ssh/authorized_keys $NODE:.ssh/authorized_keys
done
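
A quick check that the keys work, for example:

ssh node00 hostname

This should print the node's hostname without asking for a password.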

All the following steps must be done as root on each node.


Exporting HOME


We connect to each node, install the NFS client package, add the master's /home to /etc/fstab, remove the old local home files and mount the share:

ssh nodeXX
apt-get install nfs-common
echo "197.1.1.1:/home /home nfs defaults,auto 0 0" >> /etc/fstab
rm -rf /home/*
mount -a
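
To confirm the mount, for example:

df -h /home

This should show 197.1.1.1:/home as the mounted filesystem.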


Hosts adjustments


Edit /etc/hosts on each node to include all the machines in the cluster:

197.1.1.1 beagle.local beagle
197.1.1.100 node00.local node00
197.1.1.101 node01.local node01
197.1.1.102 node02.local node02
197.1.1.103 node03.local node03
197.1.1.104 node04.local node04
197.1.1.105 node05.local node05
197.1.1.106 node06.local node06
197.1.1.107 node07.local node07
197.1.1.108 node08.local node08
197.1.1.109 node09.local node09
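
Rather than editing the file by hand on every node, you can copy the master's /etc/hosts (which already lists every machine) to all the nodes, using the /etc/machines list and the root keys created above:

for NODE in $(cat /etc/machines)
do
    scp /etc/hosts $NODE:/etc/hosts
done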


Install SGE


The SGE files are exported in /home/sgeadmin, so on each node we install the dependencies, add the user and run the execution daemon installer:

apt-get install binutils
adduser sgeadmin
cd /home/sgeadmin
./install_execd

Note: check that the UID and GID of sgeadmin in /etc/passwd and /etc/group are the same as on the master node.
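
A quick way to compare those IDs across the cluster, for example, is to run this from the master using the root keys:

id sgeadmin
for NODE in $(cat /etc/machines)
do
    echo "$NODE: $(ssh $NODE id sgeadmin)"
done

All lines should report the same uid and gid.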


Managing the nodes


Many administrative tasks are the same for every node, so we create a bash script (/sbin/cluster-fork) to run a command on all of them:

#!/bin/bash
# cluster-fork COMMANDS
# Run COMMANDS on every node listed in /etc/machines
# Juan Caballero @ Cinvestav 2008
for NODE in $(cat /etc/machines)
do
    echo "$NODE:"
    ssh "$NODE" "$@"
done

Now we can run the same command on all the nodes. For commands that would normally prompt, run them in non-interactive mode, for example to upgrade all the nodes:

cluster-fork apt-get -y update
cluster-fork apt-get -y upgrade


Adding users to the cluster


Any user added to the master node has its home directory exported to the other nodes, so we can simply run adduser there. Remember that the UID and GID must be the same on all nodes: if you added the users in the same order everywhere you won't have problems; otherwise you must edit /etc/passwd and /etc/group on each node. Don't forget to create SSH keys for passwordless login, as we did for beagle; a sketch follows.
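
For example, to keep the IDs in sync you can create the user with an explicit UID and GID everywhere; the name jdoe and the number 2001 below are just an illustration:

# on the master (asks for the password and user details interactively)
addgroup --gid 2001 jdoe
adduser --uid 2001 --gid 2001 jdoe
# on the nodes, reuse the same IDs; no local password is needed because login uses SSH keys
cluster-fork addgroup --gid 2001 jdoe
cluster-fork adduser --uid 2001 --gid 2001 --disabled-password --gecos ClusterUser jdoe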


Finally you have an HPC cluster running Ubuntu Linux, and most of the steps apply to other Linux distros with few changes. I want to run performance tests to compare this cluster with the others we have. Maybe later I'll post some photos.

Wednesday, April 2, 2008

Ubuntu Cluster - Master node

This time I want to show how to build an HPC (High Performance Computing) cluster using Ubuntu Linux.


Before we begin


My institute (Cinvestav - www.ira.cinvestav.mx) asked me to build a small cluster for general work that will also be a learning system for bioinformatics. The cluster's name is Beagle; they provided the hardware, and my friend LuisD and I did the hard work. The design includes a master node to log in, check status and submit jobs, and 10 slave nodes, with /home exported over NFS, a job queue managed by Sun Grid Engine, Ganglia to monitor the systems, and MPI support.


Hardware


  • Master Node: AMD Athlon 64 X2 4200+, 2 GB RAM, one 80 GB IDE hard disk (hda), one 320 GB SATA hard disk (sda), one 1 TB external USB hard disk (sdb), an nForce 10/100/1000 network card (eth1) and a Realtek 10/100 PCI network card (eth2).
  • Slave Nodes (x10): AMD Athlon 64 X2 4200+, 2 GB RAM, one 80 GB IDE hard disk, an nForce 10/100/1000 network card (eth0).
  • Network Switch: 24 ports, 10/100/1000.
  • Many meters of network cable.


    Install the Master Node


    The machines are amd64, so we use the Ubuntu Desktop amd64 release: we download the ISO file, burn it to a CD and install with these options:
  • File System: ext3.
  • Partitions (partition:size:mount_point):

    hda1 : 180 MB : /boot
    hda2 : 2.0 GB : swap
    hda4 : 24 GB : /
    hda5 : 4.6 GB : /tftpboot
    hda6 : 22 GB : /var
    hda7 : 22 GB : /usr
    sda1 : 2.0 GB : swap
    sda2 : 292 GB : /home

  • A general user named beagle.
  • Network configuration: eth1 197.1.1.1 (to talk with the slave nodes), eth2 10.0.0.114 (to reach the outside world through our local intranet); a sketch of the interface configuration follows.
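
    As a reference, the cluster-facing interface can be configured statically in /etc/network/interfaces like this (a minimal sketch; it reuses the addresses above and the 255.0.0.0 netmask from the DHCP configuration below, and the eth2 stanza depends on your intranet):

    auto eth1
    iface eth1 inet static
        address 197.1.1.1
        netmask 255.0.0.0
        broadcast 197.255.255.255
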
    Finish the installation, reboot, log in and open a terminal to become root and set the root password:

    sudo su -
    passwd

    DHCP server


    A DHCP server is set up to talk with the slave nodes:
    apt-get install dhcp3-server

    Now we edit /etc/dhcp3/dhcpd.conf to declare the cluster network (197.0.0.0 with netmask 255.0.0.0), add the MAC address of each machine with its hostname and fixed IP, and point the clients to a network boot loader (PXE). Our file looks like this:

    # dhcp.conf
    # Network for the Beagle Cluster
    # Juan Caballero @ Cinvestav 2008
    ddns-update-style none;
    subnet 197.0.0.0 netmask 255.0.0.0 {
      default-lease-time 1200;
      max-lease-time 1200;
      option routers 197.1.1.1;
      option subnet-mask 255.0.0.0;
      option domain-name "local";
      option domain-name-servers 197.1.1.1;
      option nis-domain "beagle";
      option broadcast-address 197.255.255.255;
      deny unknown-clients;
      allow booting;
      allow bootp;

      if (substring (option vendor-class-identifier, 0, 20)
          = "PXEClient:Arch:00002") {
        # ia64
        filename "elilo.efi";
        next-server 197.1.1.1;
      } elsif ((substring (option vendor-class-identifier, 0, 9)
          = "PXEClient") or
          (substring (option vendor-class-identifier, 0, 9)
          = "Etherboot")) {
        # i386 and x86_64
        filename "pxelinux.0";
        next-server 197.1.1.1;
      } else {
        filename "/install/sbin/kickstart.cgi";
        next-server 197.1.1.1;
      }

      host beagle.local {
        hardware ethernet 00:e0:7d:b4:e1:13;
        option host-name "beagle.local";
        fixed-address 197.1.1.1;
      }
      host node00.local {
        hardware ethernet 00:1b:b9:e2:0d:18;
        option host-name "node00.local";
        fixed-address 197.1.1.100;
      }
      host node01.local {
        hardware ethernet 00:1b:b9:e1:cf:6a;
        option host-name "node01.local";
        fixed-address 197.1.1.101;
      }
      host node02.local {
        hardware ethernet 00:1b:b9:e1:be:6e;
        option host-name "node02.local";
        fixed-address 197.1.1.102;
      }
      host node03.local {
        hardware ethernet 00:1b:b9:cf:f3:55;
        option host-name "node03.local";
        fixed-address 197.1.1.103;
      }
      host node04.local {
        hardware ethernet 00:1b:b9:e2:14:06;
        option host-name "node04.local";
        fixed-address 197.1.1.104;
      }
      host node05.local {
        hardware ethernet 00:1b:b9:ce:85:9a;
        option host-name "node05.local";
        fixed-address 197.1.1.105;
      }
      host node06.local {
        hardware ethernet 00:1b:b9:e2:0c:5f;
        option host-name "node06.local";
        fixed-address 197.1.1.106;
      }
      host node07.local {
        hardware ethernet 00:1b:b9:cf:f7:29;
        option host-name "node07.local";
        fixed-address 197.1.1.107;
      }
      host node08.local {
        hardware ethernet 00:1b:b9:cf:f3:25;
        option host-name "node08.local";
        fixed-address 197.1.1.108;
      }
      host node09.local {
        hardware ethernet 00:1b:b9:e2:14:9f;
        option host-name "node09.local";
        fixed-address 197.1.1.109;
      }
    }

    In the file /etc/default/dhcp3-server we set the interface on which to serve DHCP requests:
    INTERFACES="eth1"

    Now we can restart the service:
    /etc/init.d/dhcp3-server restart
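
    If the daemon refuses to start, you can check the configuration file for syntax errors first (dhcpd3 is the daemon binary shipped by the dhcp3-server package):

    dhcpd3 -t -cf /etc/dhcp3/dhcpd.conf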


    More Network details


    Edit /etc/hosts to include all the nodes; our file looks like this:

    127.0.0.1 localhost
    197.1.1.1 beagle.local beagle
    197.1.1.100 node00.local node00
    197.1.1.101 node01.local node01
    197.1.1.102 node02.local node02
    197.1.1.103 node03.local node03
    197.1.1.104 node04.local node04
    197.1.1.105 node05.local node05
    197.1.1.106 node06.local node06
    197.1.1.107 node07.local node07
    197.1.1.108 node08.local node08
    197.1.1.109 node09.local node09

    We also create a text file, /etc/machines, with the names of all the slave nodes; it will be used later for scripting.

    node00
    node01
    node02
    node03
    node04
    node05
    node06
    node07
    node08
    node09
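
    The file can be typed by hand or generated with a one-liner such as:

    for i in $(seq 0 9); do printf "node%02d\n" $i; done > /etc/machines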


    NFS server


    Install the packages:
    apt-get install nfs-common nfs-kernel-server

    Edit /etc/exports to export /home and /tftpboot:

    /home 197.1.1.0/24(rw,no_root_squash,sync,no_subtree_check)
    /tftpboot 197.1.1.0/24(rw,no_root_squash,sync,no_subtree_check)

    Activate the exports with:
    exportfs -av
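
    To verify that both directories are exported, you can query the server (showmount is part of the nfs-common package):

    showmount -e 197.1.1.1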


    PXE boot server


    Install tftpd-hpa:
    apt-get install tftpd-hpa

    Edit /etc/default/tftpd-hpa to include:

    #Defaults for tftpd-hpa
    RUN_DAEMON="yes"
    OPTIONS="-l -s /tftpboot"

    Now we download the netboot file for Ubuntu amd64:

    cd /tftpboot
    wget http://tezcatl.fciencias.unam.mx/ubuntu/dists/gutsy/main/installer-amd64/current/images/netboot/netboot.tar.gz
    tar zxvf netboot.tar.gz

    Restart the service:
    /etc/init.d/tftpd-hpa restart
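
    A quick sanity check is to fetch the boot loader from the server itself, assuming a tftp client that accepts the -c option (such as the one in the tftp-hpa package):

    cd /tmp
    tftp 197.1.1.1 -c get pxelinux.0
    ls -l pxelinux.0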


    SGE server


    For the Sun Grid Engine we add a user named sgeadmin, download the packages into its home directory (so the slave nodes can later reach the files over NFS) and run the installation script, accepting most of the defaults:

    adduser sgeadmin
    cd /home/sgeadmin
    wget http://gridengine.sunsource.net/download/SGE61/ge-6.1u3-common.tar.gz
    wget http://gridengine.sunsource.net/download/SGE61/ge-6.1u3-bin-lx24-amd64.tar.gz
    tar zxvf ge-6.1u3-common.tar.gz
    tar zxvf ge-6.1u3-bin-lx24-amd64.tar.gz
    ./install_qmaster
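
    When the installer finishes it writes a settings file under the cell directory; sourcing it and listing the hosts is a quick check (the path assumes the default cell name and SGE_ROOT=/home/sgeadmin):

    source /home/sgeadmin/default/common/settings.sh
    qhost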


    Web server


    Install Apache:
    apt-get install apache2


    Ganglia monitor


    First we install dependencies, download the source and compile ganglia:

    apt-get install rrdtool librrds-perl librrd2-dev php5-gd
    wget "http://downloads.sourceforge.net/ganglia/ganglia-3.0.7.tar.gz?modtime=1204128965&big_mirror=0" -O ganglia-3.0.7.tar.gz
    tar zxvf ganglia-3.0.7.tar.gz
    cd ganglia-3.0.7
    ./configure --enable-gexec --with-gmetad
    make
    mkdir /var/www/ganglia
    cp web/* /var/www/ganglia

    Edit the Apache configuration in /etc/apache2/sites-enabled/000-default if you need to adjust how /var/www/ganglia is served.
    Also install the packages:
    apt-get install ganglia-monitor gmetad

    You can modify the default configuration by editing /etc/gmond.conf and /etc/gmetad.conf.
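
    After any configuration change, restart both daemons (the init scripts come with the Ubuntu packages installed above) and point a browser at http://beagle/ganglia:

    /etc/init.d/ganglia-monitor restart
    /etc/init.d/gmetad restart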


    Others programs


    Use apt-get or manually compiled packages; for example, we add an SSH server, the basic compilers and MPI support:
    apt-get install openssh-server gcc g++ g77 mpich-bin openmpi-bin lam-runtime
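
    Once the slave nodes are installed (next part), a simple test that MPI can reach every node is to run something like the following with Open MPI's mpirun, using /etc/machines as the host file; the exact flags depend on which MPI implementation you choose:

    mpirun --hostfile /etc/machines -np 10 hostname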


    In the next part we'll install the slaves nodes.


    Links:


    Ubuntu http://www.ubuntu.com/
    Debian Clusters http://debianclusters.cs.uni.edu/index.php/Main_Page
    SGE http://gridengine.sunsource.net/
    Ganglia http://ganglia.info/
    NFS http://nfs.sourceforge.net/
    TFTP-HPA http://freshmeat.net/projects/tftp-hpa/
    DHCP http://www.dhcp.org/
    MPICH http://www-unix.mcs.anl.gov/mpi/mpich1/
    OpenMPI http://www.open-mpi.org/
    LAM/MPI http://www.lam-mpi.org/