Research Cluster
The first major task of my new position (Assistant Project Scientist) involved setting up a small cluster of GNU/Linux machines. These machines will be used for testing the ensemble analysis software framework that we will be developing. In this post, I’ll briefly describe the current hardware and software configuration of the cluster.
The cluster consists of eight worker nodes and one head node. The worker nodes are Supermicro MicroBlade servers (X9SCD-F), each with a 3.3 GHz quad-core Xeon and 32 GB of RAM. The head node is temporarily a 16-core AMD Magny-Cours system with 128 GB of RAM; I may write a separate post describing the permanent head node once the parts arrive and the machine is assembled. The nine nodes communicate over a gigabit Ethernet network through a 24-port unmanaged Cisco Small Business 100 series switch.
The worker nodes and the switch are currently sitting on my desk.
Settings and Software
Since the head node will also be used as a workstation, it is running the desktop version of Ubuntu 14.04.1. The worker nodes are running the server version of the same operating system.
The head node is set up as a network gateway so that all of the other computers on the local network have Internet access. A DHCP server is also installed on the head node so that other computers on the network (i.e., computers other than the cluster nodes) can be assigned IP addresses automatically. Instructions for setting up a gateway can be found here.
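At its core, the gateway setup amounts to enabling IP forwarding and NAT on the head node. A minimal sketch, assuming the cluster-facing interface is p1p1 and the Internet-facing interface is eth0 (both names are assumptions; adjust for your hardware):

```shell
# Enable IPv4 forwarding (persist by setting net.ipv4.ip_forward=1 in /etc/sysctl.conf)
sudo sysctl -w net.ipv4.ip_forward=1

# NAT all traffic leaving through the Internet-facing interface (assumed eth0)
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

# Forward cluster traffic (assumed p1p1) outward, and allow established replies back in
sudo iptables -A FORWARD -i p1p1 -o eth0 -j ACCEPT
sudo iptables -A FORWARD -i eth0 -o p1p1 -m state --state RELATED,ESTABLISHED -j ACCEPT
```

The iptables rules are not persistent across reboots; a package such as iptables-persistent can save them.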
The file /etc/hosts for worker node-0 currently looks like this:

127.0.0.1      localhost
192.168.0.1    foam foam.cosmo-cluster.net
192.168.0.100  node-0 node-0.cosmo-cluster.net
192.168.0.101  node-1 node-1.cosmo-cluster.net
192.168.0.102  node-2 node-2.cosmo-cluster.net
192.168.0.103  node-3 node-3.cosmo-cluster.net
192.168.0.104  node-4 node-4.cosmo-cluster.net
192.168.0.105  node-5 node-5.cosmo-cluster.net
192.168.0.106  node-6 node-6.cosmo-cluster.net
192.168.0.107  node-7 node-7.cosmo-cluster.net

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
where “foam” is the name of the head node.
The corresponding /etc/network/interfaces file looks like this:
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto p1p1
iface p1p1 inet static
    address 192.168.0.100
    netmask 255.255.255.0
    network 192.168.0.0
    broadcast 192.168.0.255
    gateway 192.168.0.1
    dns-nameservers 192.168.0.1
    dns-search cosmo-cluster.net
Passwordless SSH has been set up so that various programs can automatically log into machines. Currently, I’m just using the standard OpenSSH that comes with Ubuntu, but I’ll eventually install the HPN-SSH patched version.
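The passwordless setup is just a key generation on the head node followed by copying the public key to each worker. A sketch, with hostnames following the /etc/hosts entries above:

```shell
# Generate a keypair on the head node (empty passphrase for unattended logins)
ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa

# Copy the public key to each worker node (prompts for the password once per node)
for i in $(seq 0 7); do
    ssh-copy-id node-$i
done
```

After this, `ssh node-0` should log in without a password prompt.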
For software administration (configuration management), I am using Ansible. I selected Ansible after reading about several alternative configuration management software packages. As far as I can tell, Ansible is the most elegant solution and it’s probably among the easiest to use.
In Ansible’s “hosts” file (/etc/ansible/hosts), I have included the following statement:
[nodes]
node-[0:7]
This allows me to address all of the worker nodes simultaneously using the group label “nodes.” For example, I can display the system time on every node by running the “date” command as an ad hoc command.
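A minimal version of that ad hoc command might look like this (a sketch; Ansible’s default command module runs the given command on each host in the group):

```shell
# Run "date" on every host in the [nodes] group
ansible nodes -a "date"
```

Ansible prints each host’s stdout in turn, so the clocks can be compared at a glance.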
In addition to running ad hoc commands from the command line, it is possible to create rather intricate “playbooks” to automate many tasks easily.
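As a small illustration, a hypothetical playbook that installs and enables the NTP daemon on every worker might look like this (the package and task names here are my assumptions, not from the cluster’s actual playbooks):

```yaml
---
# ntp.yml -- keep worker clocks in sync (hypothetical example)
- hosts: nodes
  become: yes
  tasks:
    - name: Install the NTP daemon
      apt: name=ntp state=present

    - name: Make sure NTP is running and starts on boot
      service: name=ntp state=started enabled=yes
```

It would be run with `ansible-playbook ntp.yml`, and is idempotent: re-running it changes nothing on hosts already in the desired state.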
Hadoop
I have installed Hadoop 2.4.1, and I am using Oracle Java 7. The Hadoop software is installed in /usr/local/hadoop/. In my .bashrc file, I have the following:
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export HADOOP_INSTALL=/usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"

unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
The Hadoop configuration files in /usr/local/hadoop/etc/hadoop/ look like this:
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://foam/</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/temp</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>replication</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop_store/hdfs/namenode</value>
    <description>namenode directory</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/hdfs/datanode</value>
    <description>datanode directory</description>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
    <description>file system block size</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir.perm</name>
    <value>777</value>
  </property>
</configuration>
yarn-site.xml:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>foam:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>foam:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>foam:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>foam:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>foam:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>foam</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>26624</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>26624</value>
  </property>
</configuration>
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
And hadoop-env.sh was modified to contain this:
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
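With the configuration files in place, the filesystem is formatted once and the daemons started from the head node. A sketch of the standard Hadoop 2.x commands, assuming the PATH entries from the .bashrc above:

```shell
# One-time: format the HDFS namespace on the head node
hdfs namenode -format

# Start HDFS (NameNode on foam, DataNodes on the workers listed in etc/hadoop/slaves)
start-dfs.sh

# Start YARN (ResourceManager on foam, NodeManagers on the workers)
start-yarn.sh

# Sanity checks
jps                    # lists the running Java daemons on this machine
hdfs dfsadmin -report  # confirms the DataNodes have joined
```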
I am trying to build 64-bit native libraries for Hadoop (the libraries that go in /usr/local/hadoop/lib/native). The standard compiled version that is available for download comes with 32-bit native libraries. I have managed to create a 64-bit libhadoop.so, but not libhdfs.so. Hadoop works without these libraries (it just uses libraries that are already installed with Oracle Java 7), but it evidently runs faster when it uses the Hadoop-specific “native” versions.
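For reference, the usual way to produce 64-bit native libraries is to rebuild the distribution from source. A sketch, assuming the build prerequisites (Maven, CMake, protobuf 2.5, and the zlib/openssl development packages) are installed:

```shell
# From the top of the Hadoop 2.4.1 source tree, build a binary
# distribution with the native libraries compiled for this machine
mvn package -Pdist,native -DskipTests -Dtar

# The resulting native libraries end up under:
#   hadoop-dist/target/hadoop-2.4.1/lib/native
```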
MPI and NFS
NFS and the MPICH2 implementation of MPI were installed by following the guide here.
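As a quick smoke test of the MPICH install, one process can be launched on each worker from a machine file (a sketch; the file path is arbitrary):

```shell
# List the eight worker hostnames, one per line
printf 'node-%d\n' $(seq 0 7) > ~/machinefile

# Launch one process per worker; each prints its own hostname
mpiexec -f ~/machinefile -n 8 hostname
```

If the NFS-shared home directory and passwordless SSH are working, eight different hostnames come back.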