Research Cluster

The first major task of my new position (Assistant Project Scientist) was setting up a small cluster of GNU/Linux machines, which will be used for testing the ensemble analysis software framework that we will be developing. In this post, I’ll briefly describe the cluster’s current hardware and software configuration.

The cluster consists of eight worker nodes and one head node. The worker nodes are Supermicro MicroBlade servers (X9SCD-F), each with a 3.3 GHz quad-core Xeon and 32 GB of RAM. The head node is temporarily a 16-core AMD Magny Cours system with 128 GB of RAM; I may write a separate post describing the permanent head node once the parts arrive and the machine is assembled. The nine nodes communicate over a gigabit Ethernet network built around a 24-port unmanaged Cisco Small Business 100 series switch.

The worker nodes and switch are currently on my desk:

(Photos: the worker nodes and the Ethernet switch on my desk.)

Settings and Software

Since the head node will also be used as a workstation, it is running the desktop version of Ubuntu 14.04.1. The worker nodes are running the server version of the same operating system.

The head node is set up as a network gateway so that all of the other computers on the local network can have Internet access. A DHCP server is also installed on the head node so that other computers on the network (i.e., machines other than the cluster nodes, which have static addresses) can be assigned IP addresses automatically. Instructions for setting up a gateway can be found here.
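
The gateway part of that setup boils down to enabling IP forwarding and NAT on the head node. As a minimal sketch (the name of the Internet-facing interface, eth0 here, is just an example):

# enable IPv4 forwarding immediately; set net.ipv4.ip_forward=1
# in /etc/sysctl.conf to make it permanent
sudo sysctl -w net.ipv4.ip_forward=1
 
# masquerade traffic from the cluster network out through the
# Internet-facing interface
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE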

The file /etc/hosts for worker node-0 currently looks like this:
127.0.0.1       localhost
 
192.168.0.1   foam   foam.cosmo-cluster.net
192.168.0.100 node-0 node-0.cosmo-cluster.net
192.168.0.101 node-1 node-1.cosmo-cluster.net
192.168.0.102 node-2 node-2.cosmo-cluster.net
192.168.0.103 node-3 node-3.cosmo-cluster.net
192.168.0.104 node-4 node-4.cosmo-cluster.net
192.168.0.105 node-5 node-5.cosmo-cluster.net
192.168.0.106 node-6 node-6.cosmo-cluster.net
192.168.0.107 node-7 node-7.cosmo-cluster.net
 
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

where “foam” is the name of the head node.

The corresponding /etc/network/interfaces file looks like this:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
 
# The loopback network interface
auto lo
iface lo inet loopback
 
# The primary network interface
auto p1p1 
iface p1p1 inet static
       address 192.168.0.100
       netmask 255.255.255.0
       network 192.168.0.0
       broadcast 192.168.0.255
       gateway 192.168.0.1
       dns-nameservers 192.168.0.1
       dns-search cosmo-cluster.net

Passwordless SSH has been set up so that various programs can automatically log into machines. Currently, I’m just using the standard OpenSSH that comes with Ubuntu, but I’ll eventually install the HPN-SSH patched version.
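
The usual OpenSSH recipe for this, run from the head node, looks something like the following:

# generate a key pair on the head node (accept the defaults, empty passphrase)
ssh-keygen -t rsa
 
# copy the public key to a worker; repeat for node-1 through node-7
ssh-copy-id node-0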

For software administration (configuration management), I am using Ansible. I selected it after reading about several alternative configuration management packages; as far as I can tell, Ansible is the most elegant of them and probably among the easiest to use.

In Ansible’s “hosts” file (/etc/ansible/hosts), I have added the following group definition:

[nodes]
node-[0:7]

This allows me to address all of the worker nodes simultaneously, using the group label “nodes.” For example, the following ad hoc command displays the system time on all nodes by running the “date” command:

(Screenshot: the command and its output for all eight worker nodes.)
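
In plain text, it is essentially Ansible’s standard single-task form, something like:

# run "date" on every host in the [nodes] group
ansible nodes -a date

(Running ansible nodes -m ping is the usual first check that the inventory and SSH setup are working.)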

In addition to running ad hoc commands from the command line, it is possible to write “playbooks” that automate longer, multi-step tasks.
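
As a small sketch (an illustration, not one of my actual playbooks), a playbook that makes sure NTP is installed and running on every worker might look like this:

---
# ntp.yml -- keep the clocks on the worker nodes in sync
- hosts: nodes
  sudo: yes
  tasks:
    - name: install the ntp package
      apt: name=ntp state=present update_cache=yes
 
    - name: make sure the ntp service is running and enabled at boot
      service: name=ntp state=started enabled=yes

It is run with “ansible-playbook ntp.yml”, and like the ad hoc commands it is idempotent: running it a second time changes nothing.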

Hadoop

I have installed Hadoop 2.4.1, and I am using Oracle Java 7. The Hadoop software is installed in /usr/local/hadoop/. My .bashrc file contains the following:

# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_INSTALL=/usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"
 
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin

The Hadoop configuration files in /usr/local/hadoop/etc/hadoop/ look like this:

core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://foam/</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/temp</value>
  </property>
</configuration>

hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>replication</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop_store/hdfs/namenode</value>
    <description>namenode directory</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/hdfs/datanode</value>
    <description>datanode directory</description>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
    <description>file system block size</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir.perm</name>
    <value>777</value>
  </property>
</configuration>

yarn-site.xml:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>foam:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>foam:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>foam:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>foam:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>foam:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>foam</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>26624</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>26624</value>
  </property>
</configuration>

mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

And hadoop-env.sh was modified to contain this:

# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
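
With the configuration in place, bringing the cluster up follows the standard Hadoop 2.x sequence (this assumes the slaves file in /usr/local/hadoop/etc/hadoop/ lists node-0 through node-7):

# one-time only: format the namenode (run on foam)
hdfs namenode -format
 
# start the HDFS and YARN daemons across the cluster
start-dfs.sh
start-yarn.sh
 
# sanity check: all eight datanodes should be listed
hdfs dfsadmin -report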

I am trying to build 64-bit native libraries for Hadoop (the libraries that go in /usr/local/hadoop/lib/native), since the standard compiled version that is available for download ships with 32-bit native libraries. So far I have managed to build a 64-bit libhadoop.so, but not libhdfs.so. Hadoop works without these libraries (it simply falls back to its built-in Java implementations), but it evidently runs faster when it can use the Hadoop-specific “native” versions.
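
For reference, the native build recipe in Hadoop’s BUILDING.txt boils down to roughly the following (it needs Maven, protobuf 2.5.0, cmake, and the zlib development headers installed first), and the checknative subcommand, available in recent 2.x releases, reports which native libraries actually get picked up:

# from the top of the Hadoop 2.4.1 source tree
mvn package -Pdist,native -DskipTests -Dtar
 
# afterwards, see which native libraries Hadoop can load
hadoop checknative -a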

MPI and NFS

NFS and the MPICH2 implementation of MPI were installed by following the guide here.
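
As a quick smoke test of the MPI installation (machinefile here is assumed to be a text file listing node-0 through node-7, one hostname per line):

# launch eight processes, one per worker, and have each print its hostname
mpiexec -f machinefile -n 8 hostname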

Next Steps

Next, I plan to experiment with Mesos and Hama.
