Starting up the cluster
A few words about the cluster
First of all, this is an entirely (or almost entirely) NFS-mounted cluster.
That means we have a main server called "TEKLA" (10.3.20.254 from inside TEKLA's cluster and 10.3.1.244 from the rest of our LAN) that exports over NFS all the files and folders needed to run each node.
The OS is Debian.
NOTE: Don't do a full upgrade of the system without checking what is about to be upgraded, because some applications need older versions of software (e.g. dacapo needs older versions of Python-related software).
All the files are located in /nfsroot/tekla???, as we have one folder per node containing all of that node's files; just replace "???" with the node's number (e.g. tekla001).
Only /scratch and /tmp are on each node's local HDD (remember that these directories must also exist in /nfsroot/tekla??? and must have 777 permissions).
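As a rough illustration of the layout described above, the following Python sketch builds a node's folder name and checks that the scratch and tmp directories exist inside it with 777 permissions. The function names are hypothetical, and the three-digit zero-padded node number is an assumption inferred from "tekla???" and "tekla001".

```python
import stat
from pathlib import Path


def node_root(node_number, nfsroot="/nfsroot"):
    """Return the per-node folder, e.g. /nfsroot/tekla001.

    Assumption: node numbers are zero-padded to three digits.
    """
    if not 0 <= node_number <= 999:
        raise ValueError("node number must fit in three digits")
    return Path(nfsroot) / f"tekla{node_number:03d}"


def bad_mount_points(root, names=("scratch", "tmp")):
    """List problems with the directories that get mounted from the local HDD.

    Each directory must exist under the node folder and grant
    read/write/execute to owner, group and others (mode 777).
    """
    problems = []
    for name in names:
        path = Path(root) / name
        if not path.is_dir():
            problems.append(f"{path}: missing")
            continue
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & 0o777 != 0o777:
            problems.append(f"{path}: mode {mode:o}, expected 777")
    return problems
```

Run on TEKLA, something like bad_mount_points(node_root(1)) would return an empty list if /nfsroot/tekla001/scratch and /nfsroot/tekla001/tmp are in place.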
TEKLA's filesystem is "ext3", while /scratch and /tmp on each node are "xfs" (but it's easy to change this).
How to start the cluster
Since the TEKLA server exports each node's system, we have to be sure that the necessary services are up. So first, TEKLA must be fully operational before starting up the nodes. Then (as a check) the following services must be up:
dhcp3-server
nfs-kernel-server
tftp (from inetd)
maybe more but I don't remember right now...
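One possible way to run this check on TEKLA is to look for the standard ports of these services in /proc/net/tcp and /proc/net/udp. The Python sketch below is only an approximation: it assumes the usual ports (DHCP on UDP 67, TFTP on UDP 69, NFS on TCP 2049), it assumes each daemon shows up as a bound socket in those files, and it ignores IPv6. The function names are hypothetical.

```python
from pathlib import Path

# Assumed standard ports for the services in the list above.
EXPECTED = {
    "dhcp3-server": ("udp", 67),
    "tftp (inetd)": ("udp", 69),
    "nfs-kernel-server": ("tcp", 2049),
}

TCP_LISTEN = "0A"  # state code for a listening socket in /proc/net/tcp


def bound_ports(proc_net_text, proto):
    """Extract local ports from the text of /proc/net/tcp or /proc/net/udp."""
    ports = set()
    for line in proc_net_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if proto == "tcp" and state != TCP_LISTEN:
            continue  # only count TCP sockets that are listening
        # local_address looks like "00000000:0801"; the port is hexadecimal
        ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return ports


def missing_services(tcp_text, udp_text):
    """Return the names of expected services whose port is not bound."""
    bound = {
        "tcp": bound_ports(tcp_text, "tcp"),
        "udp": bound_ports(udp_text, "udp"),
    }
    return sorted(
        name
        for name, (proto, port) in EXPECTED.items()
        if port not in bound[proto]
    )


def check_local_host():
    """Run the check against this machine's /proc/net files (Linux only)."""
    return missing_services(
        Path("/proc/net/tcp").read_text(),
        Path("/proc/net/udp").read_text(),
    )
```

An empty list from check_local_host() suggests the three services are up; it does not cover any other service that the nodes may also need.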
Once TEKLA is up and running you can start booting the nodes. Don't start more than 10 nodes at a time; waiting a few seconds between batches will be enough.
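If the nodes are powered on from a script rather than by hand, the batching rule could be sketched in Python as follows. How a node is actually switched on is not covered here, so power_on is a hypothetical callback supplied by the caller, and the 5-second pause is an assumed value for "a few seconds".

```python
import time


def batches(nodes, size=10):
    """Yield successive lists of at most `size` node names."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    batch = []
    for node in nodes:
        batch.append(node)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


def boot_in_batches(nodes, power_on, batch_size=10,
                    pause_seconds=5.0, sleep=time.sleep):
    """Call power_on(node) for every node, pausing between batches.

    power_on is a hypothetical callback; pause_seconds is an assumption.
    """
    first = True
    for batch in batches(nodes, batch_size):
        if not first:
            sleep(pause_seconds)  # give TEKLA time to serve the previous batch
        first = False
        for node in batch:
            power_on(node)
```

For example, boot_in_batches(["tekla001", "tekla002"], print) simply prints the node names, which is a harmless way to try the ordering before plugging in a real power-on action.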
That's all.