CSE 490H: Scalable Systems: Design, Implementation and Use of Large Scale Clusters, Autumn 2008
Hadoop is the name of the distributed system we will
be programming against.
Our cluster is running Hadoop 0.18.1.
Documentation for Hadoop is also available in the above links.
You will require a copy of Hadoop on your local
development machine for compilation purposes.
The special Hadoop version has been disabled. If
you switched to it, you must switch back
to Hadoop 0.18.1.
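
One way to confirm that a local copy matches the cluster is to compare its reported version with 0.18.1. The sketch below assumes the version text comes from running `hadoop version` and that its first line looks like `Hadoop 0.18.1`. The helper names are hypothetical, and the sample string in the usage note is illustrative rather than captured output.

```python
import re

# Version the course cluster runs, as stated on this page.
EXPECTED_VERSION = "0.18.1"


def parse_hadoop_version(output):
    """Return the version from a line like 'Hadoop 0.18.1', or None if absent."""
    match = re.search(r"^Hadoop\s+(\d+(?:\.\d+)*)", output, re.MULTILINE)
    return match.group(1) if match else None


def matches_cluster(output, expected=EXPECTED_VERSION):
    """True when the reported version equals the cluster's version."""
    return parse_hadoop_version(output) == expected
```

For example, `matches_cluster("Hadoop 0.18.1")` returns `True`, while text reporting any other version returns `False`.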
If you are using the submission node, you should
execute commands against
We have a 40 node Hadoop cluster for our use during
this course. To get access to this cluster, follow
the instructions at
This page will get you on board. Assignment 1 on
the projects page also contains
more detailed, step-by-step instructions for getting access
and logging in.
If you are using your own machine and would like to directly
connect to the cluster, you can configure Hadoop to do so. Download
the hadoop-site.xml file here.
(Right-click and then select "save as...")
The instructions on how to prepare this file with the rest of your
setup are in assignment 1 on the projects page.
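
After downloading hadoop-site.xml, it can help to list the properties it sets before copying it into your setup. The sketch below assumes the file uses the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children. The sample text is a stand-in, not the course's real file: `fs.default.name` and `mapred.job.tracker` are standard Hadoop 0.18 property names, but the port numbers shown are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample only; the real values come from the downloaded file.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://XenHost-00096B63736D-1:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>XenHost-00096B63736D-1:9001</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a dict mapping each property name to its value."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props
```

To inspect the real file, read its contents from disk and pass the text to `read_properties` in place of `SAMPLE`.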
In addition to the hadoop-site.xml file, you will need to configure
your hosts file (e.g., /etc/hosts on a Linux machine). The
line that must be added to your hosts file for the master node is:
10.1.133.3 XenHost-00096B63736D-1 XenHost-00096B63736D-1.internal
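
As a quick check, the sketch below tests whether the text of a hosts file already maps the master node's address to both of its names. The IP address and host names are taken from the line above. The function names are hypothetical, and the code only inspects text passed to it; it does not modify any file.

```python
MASTER_IP = "10.1.133.3"
MASTER_NAMES = ("XenHost-00096B63736D-1", "XenHost-00096B63736D-1.internal")


def hosts_entry():
    """Build the hosts-file line for the master node."""
    return MASTER_IP + " " + " ".join(MASTER_NAMES)


def has_master_entry(hosts_text):
    """True if an uncommented line maps MASTER_IP to every master name."""
    for line in hosts_text.splitlines():
        # Drop anything after '#', then split the rest on whitespace.
        fields = line.split("#", 1)[0].split()
        if not fields or fields[0] != MASTER_IP:
            continue
        if all(name in fields[1:] for name in MASTER_NAMES):
            return True
    return False
```

Calling `has_master_entry(open("/etc/hosts").read())` on a Linux machine reports whether the entry is already present; if it is not, `hosts_entry()` gives the line to add.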
Hadoop Official Website (downloads, FAQ, wiki, API docs, etc.)
Cluster Real-Time Info
The following links require you to have your
proxy connection set up through the gateway. See
instructions in project 1.
Job tracking server: http://10.1.133.3:50030/
DFS NameNode status: http://10.1.133.3:50070/
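
To test whether the proxy connection is working, you can try fetching the two status pages programmatically. The following is a minimal sketch using only the Python standard library. The ports come from the links above, while the helper names and any proxy address you pass in are assumptions to be replaced with the gateway settings from the course instructions.

```python
import http.client
import urllib.request

MASTER_IP = "10.1.133.3"


def status_urls(master_ip=MASTER_IP):
    """Return the two status-page URLs listed on this page."""
    return {
        "jobtracker": "http://%s:50030/" % master_ip,
        "namenode": "http://%s:50070/" % master_ip,
    }


def is_reachable(url, proxy=None, timeout=5):
    """Try to fetch url, optionally through an HTTP proxy; True on success."""
    handlers = []
    if proxy:
        # proxy is a URL such as "http://host:port" (hypothetical placeholder).
        handlers.append(urllib.request.ProxyHandler({"http": proxy}))
    opener = urllib.request.build_opener(*handlers)
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # Covers connection errors, timeouts, and HTTP error responses.
        return False
```

A typical use is to loop over `status_urls().items()` and print the result of `is_reachable(url, proxy=...)` for each entry. A `False` result for both pages usually means the proxy is not set up.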
Computer Science & Engineering
University of Washington
Seattle, WA 98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX