Research Cluster

The first major task of my new position (Assistant Project Scientist) involved setting up a small cluster of GNU/Linux machines. These machines will be used for testing the ensemble analysis software framework that we will be developing. In this post, I’ll briefly describe the current hardware and software configuration of the cluster.

The cluster consists of eight worker nodes and one head node. The worker nodes are Supermicro MicroBlade servers (X9SCD-F); each blade has a 3.3 GHz quad-core Xeon and 32 GB of RAM. The head node is temporarily a 16-core AMD Magny-Cours system with 128 GB of RAM. I may write a separate post describing the permanent head node once the parts arrive and the machine is assembled. The nine nodes communicate over a gigabit Ethernet network built around a 24-port unmanaged Cisco Small Business 100 series switch.

The worker nodes and switch are currently on my desk:

[Photos: the stacked worker nodes and the Ethernet switch on my desk]

Settings and Software

Since the head node will also be used as a workstation, it is running the desktop version of Ubuntu 14.04.1. The worker nodes are running the server version of the same operating system.

The head node is set up as a network gateway so that all of the other computers on the local network have Internet access. A DHCP server is also installed on the head node so that other computers on the network (i.e., computers other than the cluster nodes) can be assigned IP addresses automatically. Instructions for setting up a gateway can be found here.
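At its core, the gateway configuration is just IP forwarding plus NAT on the head node. A minimal sketch, assuming the head node's Internet-facing interface is eth0 (the interface name will vary from system to system):

# Enable packet forwarding (set net.ipv4.ip_forward=1 in /etc/sysctl.conf to persist)
sudo sysctl -w net.ipv4.ip_forward=1

# Masquerade traffic from the cluster subnet out through the external interface
sudo iptables -t nat -A POSTROUTING -s 192.168.0.0/24 -o eth0 -j MASQUERADE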

The file /etc/hosts for worker node-0 currently looks like this:
127.0.0.1       localhost
 
192.168.0.1   foam   foam.cosmo-cluster.net
192.168.0.100 node-0 node-0.cosmo-cluster.net
192.168.0.101 node-1 node-1.cosmo-cluster.net
192.168.0.102 node-2 node-2.cosmo-cluster.net
192.168.0.103 node-3 node-3.cosmo-cluster.net
192.168.0.104 node-4 node-4.cosmo-cluster.net
192.168.0.105 node-5 node-5.cosmo-cluster.net
192.168.0.106 node-6 node-6.cosmo-cluster.net
192.168.0.107 node-7 node-7.cosmo-cluster.net
 
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

where “foam” is the name of the head node.

The corresponding /etc/network/interfaces file looks like this:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
 
# The loopback network interface
auto lo
iface lo inet loopback
 
# The primary network interface
auto p1p1 
iface p1p1 inet static
       address 192.168.0.100
       netmask 255.255.255.0
       network 192.168.0.0
       broadcast 192.168.0.255
       gateway 192.168.0.1
       dns-nameservers 192.168.0.1
       dns-search cosmo-cluster.net

Passwordless SSH has been set up so that various programs can automatically log in to the machines. Currently, I’m just using the standard OpenSSH that comes with Ubuntu, but I’ll eventually install the HPN-SSH patched version.
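The setup is the usual key distribution, sketched here with an RSA key and the default paths:

# On the head node: generate a key pair (empty passphrase for unattended logins)
ssh-keygen -t rsa

# Push the public key to every worker node
for i in $(seq 0 7); do ssh-copy-id node-$i; done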

For software administration (configuration management), I am using Ansible. I selected it after reading about several alternative configuration management packages; as far as I can tell, Ansible is the most elegant solution and probably among the easiest to use.

In Ansible’s “hosts” file (/etc/ansible/hosts), I have included the following group definition:

[nodes]
node-[0:7]

This allows me to address all of the worker nodes simultaneously using the group label “nodes.” For example, an ad hoc command can display the system time on every node by running date on each of them.

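With the group defined above, that is a one-liner (Ansible's default command module runs the given command on every host in the group):

ansible nodes -a "date"

Ansible prints each node's hostname along with the command output, which makes it easy to spot a machine with a skewed clock.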

In addition to running ad hoc commands from the command line, it is possible to write rather intricate “playbooks” that automate many tasks; a minimal sketch follows.
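As an illustration (the playbook and package names are hypothetical and arbitrary), a playbook that makes sure a package is present on every worker node looks like this in the 1.x syntax current at the time of writing:

# site.yml -- a minimal, illustrative playbook
---
- hosts: nodes
  sudo: yes
  tasks:
    - name: ensure htop is installed on every worker node
      apt: name=htop state=present update_cache=yes

Running "ansible-playbook site.yml" applies it to the whole "nodes" group in one shot.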

Hadoop

I have installed Hadoop 2.4.1, and I am using Oracle Java 7. The Hadoop software is installed in /usr/local/hadoop/. My .bashrc file contains the following:

# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_INSTALL=/usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"
 
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin

The Hadoop configuration files in /usr/local/hadoop/etc/hadoop/ look like this:

core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://foam/</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/temp</value>
  </property>
</configuration>
hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>replication</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop_store/hdfs/namenode</value>
    <description>namenode directory</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/hdfs/datanode</value>
    <description>datanode directory</description>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
    <description>file system block size (128 MB)</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir.perm</name>
    <value>777</value>
  </property>
</configuration>
yarn-site.xml:

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>foam:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>foam:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>foam:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>foam:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>foam:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>foam</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>26624</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>26624</value>
  </property>
</configuration>
mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

And hadoop-env.sh was modified to contain this:

# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
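Once these files have been pushed to every node (an easy job for Ansible), the cluster is started with Hadoop's standard scripts. This assumes the etc/hadoop/slaves file lists node-0 through node-7 so the scripts know where to launch the datanodes and node managers:

# One time only: initialize the namenode (this erases any existing HDFS metadata)
hdfs namenode -format

# Start the HDFS and YARN daemons across the cluster (relies on passwordless SSH)
start-dfs.sh
start-yarn.sh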

I am trying to build 64-bit native libraries for Hadoop (the libraries that go in /usr/local/hadoop/lib/native). The precompiled version available for download ships with 32-bit native libraries. I have managed to build a 64-bit libhadoop.so, but not libhdfs.so. Hadoop works without these libraries (it falls back to the pure-Java implementations bundled with the distribution), but it evidently runs faster when it uses the Hadoop-specific “native” versions.
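For reference, the documented route is to rebuild the distribution from the Hadoop source tree with Maven, which requires a native toolchain (gcc, cmake, and the zlib and protobuf development packages, among others):

# From the root of the Hadoop 2.4.1 source tree
mvn package -Pdist,native -DskipTests -Dtar

# Report which native libraries Hadoop actually manages to load
hadoop checknative -a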

MPI and NFS

NFS and the MPICH2 implementation of MPI were installed by following the guide here.
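As a quick sanity check that MPI can launch processes on all of the workers, something like the following works, where machinefile is a hypothetical file listing node-0 through node-7, one hostname per line:

# Launch one process on each worker; each prints the hostname it runs on
mpiexec -f machinefile -n 8 hostname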

Next Steps

Next, I plan to experiment with Mesos and Hama.
