Hadoop
Installation
Download the tarball (hadoop-2.4.1.tar.gz) from the Hadoop website and unpack it:
tar xzvf hadoop-2.4.1.tar.gz
Make sure JAVA_HOME points to a Java installation. If it is not set, define it by editing etc/hadoop/hadoop-env.sh and specifying the JAVA_HOME variable. For example:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
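Before starting Hadoop it can help to confirm that JAVA_HOME really points at a Java installation. The following is a minimal sketch, not part of Hadoop itself; the function name check_java_home is a hypothetical helper, and it assumes the installation keeps its launcher at $JAVA_HOME/bin/java.

```shell
# check_java_home: hypothetical helper, not shipped with Hadoop.
# Assumes a usable Java installation has an executable at $JAVA_HOME/bin/java.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "No java executable under $JAVA_HOME/bin"
        return 1
    fi
    echo "JAVA_HOME looks usable: $JAVA_HOME"
}
```

Running check_java_home in the shell where you plan to start Hadoop prints one of the three messages and returns a non-zero status when the variable is missing or wrong.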
It is convenient to put the Hadoop binary directories on your command-line path. For example, I append the following two lines to my ~/.bashrc file:
export HADOOP_INSTALL=/home/brb/hadoop-2.4.1
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
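Because ~/.bashrc is read again by every new interactive shell, and by each manual "source ~/.bashrc", a plain append adds the same directories to PATH repeatedly. One possible variation, sketched here under the assumption that the same install directory as above is used, guards against duplicates. The function name add_to_path is a hypothetical helper, not a shell builtin or a Hadoop command.

```shell
# add_to_path: hypothetical helper that appends a directory to PATH
# only when it is not already listed there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$PATH:$1" ;;      # append once
    esac
}

# Install directory taken from the example above; adjust to your own.
HADOOP_INSTALL=/home/brb/hadoop-2.4.1
add_to_path "$HADOOP_INSTALL/bin"
add_to_path "$HADOOP_INSTALL/sbin"
export HADOOP_INSTALL PATH
```

The effect on a fresh shell is the same as the two export lines; the difference only shows when the file is sourced more than once.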
Check that Hadoop runs by typing
hadoop version
Public datasets
- http://stackoverflow.com/questions/12915128/small-data-sets-for-hadoop-mapreduce
- NOAA weather data from 1901 to 2014.
Books
Hadoop: The Definitive Guide