This page describes the computer cluster used for computational biology and other purposes, commonly known as van-halen. All information about the cluster should be recorded here so that knowledge of it does not depend on any single person.
The cluster is located in the server room in Lilybank Gardens. The hardware currently consists of a NetShelter VL Value Line 42U 600x1070mm rack, an HP V1810G 48-port Ethernet switch (model number J9660A) and seven Intel SR1690WBR server systems. One of the servers (van-halen), which has a built-in DVD-ROM drive, is designated as the head node; the rest are compute nodes.
The cluster is set up so that the connections between the head node and the compute nodes are on a private network, with the head node also connected to the DCS network. For this purpose, two VLANs have been set up on the switch: public (1) and private (100). The switch ports are untagged, so traffic cannot span the two VLANs. The head node routes all traffic between the compute nodes and the outside world.
The cluster runs Rocks 6.0, which is based on CentOS 6.2. This distribution is designed specifically for clusters and can automatically install and update the OS images on each of the compute nodes. Details of configuration and administration are in CompBioCluster/ClusterAdministration.
Accessing the cluster
The cluster can only be accessed using ssh, and the only node that accepts connections is the head node (van-halen); the connection must come from inside the university network. Files can be copied to and from the head node using scp, again from a machine on the university network.
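In practice, access looks like the following sketch. The bare hostname van-halen is an assumption (check the actual DNS name from inside the network), and the username and filenames are placeholders:

```shell
# Log in to the head node (you will be prompted for your password):
ssh youruser@van-halen

# Copy a file from your machine to your home directory on the head node:
scp data.tar.gz youruser@van-halen:~/

# Copy a results file back from the head node to the current directory:
scp youruser@van-halen:~/results.csv .
```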
Home directories are at /home/<username>. They reside on the head node and are NFS-mounted at /home on the compute nodes, so your home directory is accessible from every node. However, /home has only roughly 460GB available and there are no quotas in place, so please take care with how much data you keep there: if it fills up, other users will be unable to use the cluster. When you are finished with data, please delete it from your directory.
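Since there are no quotas, the standard disk-usage tools are the way to keep an eye on your share of the space (the /home path follows the layout described above):

```shell
# How full is the shared filesystem holding the home directories?
df -h /home

# How much of that space is my own?
du -sh "$HOME"
```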
If you need more space when running a job, or your jobs access files heavily, there is also a partition on each compute node, mounted at /state/partition1. Files can be staged here before processing, and results can be written here. However, these partitions are not accessible from outside the node, so a job should copy its results off the node when it finishes. Files on these partitions should be considered permanently at risk and may disappear at any time. The partitions are also only roughly 460GB in size, so again be careful what you place on them. Note that NFS can be quite slow: jobs that access files on /home heavily can be slowed right down by network latency, which is another reason to stage data onto the local partition.
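A job that stages its data through the local partition might look roughly like this sketch; the program name (my_analysis) and the file and directory names are placeholders, not software that necessarily exists on the cluster:

```shell
#!/bin/bash
# Per-user, per-process scratch directory on the node-local disk:
SCRATCH=/state/partition1/$USER/job_$$
mkdir -p "$SCRATCH"

# Stage input from the NFS-mounted home directory onto fast local disk:
cp "$HOME/input/genome.fa" "$SCRATCH/"

# Run against the local copy to avoid NFS latency:
cd "$SCRATCH"
my_analysis genome.fa > results.txt

# Copy results back to /home (the local partition is invisible to other
# nodes and may be wiped at any time), then clean up after ourselves:
cp results.txt "$HOME/results/"
cd /
rm -rf "$SCRATCH"
```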
There are currently no backups of the cluster, so your data may disappear at any time. Please copy important information somewhere else.
Running jobs on the cluster
Sun Grid Engine (now known as Oracle Grid Engine) version 6.2u5 is used as the job scheduler. All jobs must be run through it; in particular, do not run jobs as standalone processes, either on the head node or on a compute node. Using the scheduler ensures that cluster resources are shared out fairly and evenly and that high throughput is maintained.
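A minimal SGE submission script might look like the following; qsub, qstat and qdel are the standard SGE commands, while the job name, output files and the program being run (my_program) are illustrative:

```shell
#!/bin/bash
#$ -N example_job          # job name shown in qstat
#$ -cwd                    # run the job from the directory it was submitted in
#$ -o example_job.out      # file for the job's standard output
#$ -e example_job.err      # file for the job's standard error

./my_program input.dat
```

Save this as, say, job.sh, submit it with qsub job.sh, monitor it with qstat, and remove it with qdel plus the job id if necessary.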