Set up a single-node Hadoop 2.4 on Mac OS X 10.9.3
install
- brew install hadoop
Setup passphraseless ssh
- try ssh localhost
- $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
- $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
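Re-running the `cat` step above appends the same key again; a minimal idempotent sketch (the helper name `add_key_once` is my own, not from the Hadoop docs):

```shell
# Hypothetical helper: append a public key to authorized_keys only if it
# is not already present, and keep the strict permissions sshd expects.
add_key_once() {
  key_file="$1"      # e.g. ~/.ssh/id_dsa.pub
  auth_file="$2"     # e.g. ~/.ssh/authorized_keys
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # -x: match the whole line, -F: fixed string, -q: quiet
  grep -qxF "$(cat "$key_file")" "$auth_file" || cat "$key_file" >> "$auth_file"
  chmod 600 "$auth_file"
}
```

Calling it twice with the same key leaves a single entry in `authorized_keys`.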
Environment
- check /usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop/hadoop-env.sh
export JAVA_HOME="$(/usr/libexec/java_home)"
- cd /usr/local/Cellar/hadoop/2.4.0
- try bin/hadoop
$ bin/hadoop version
Hadoop 2.4.0
Subversion http://svn.apache.org/repos/asf/hadoop/common -r 1583262
Compiled by jenkins on 2014-03-31T08:29Z
Compiled with protoc 2.5.0
From source with checksum 375b2832a6641759c6eaf6e3e998147
This command was run using /usr/local/Cellar/hadoop/2.4.0/libexec/share/hadoop/common/hadoop-common-2.4.0.jar
try Standalone mode
- cd /usr/local/Cellar/hadoop/2.4.0
- mkdir input
- cp libexec/etc/hadoop/*.xml input
- bin/hadoop jar libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep input output 'dfs[a-z]+'
- cat output/*
try Pseudo-Distributed mode
- vi libexec/etc/hadoop/core-site.xml
 <configuration>
   <property>
     <name>fs.defaultFS</name>
     <value>hdfs://localhost:9000</value>
   </property>
 </configuration>
- vi libexec/etc/hadoop/hdfs-site.xml
 <configuration>
   <property>
     <name>dfs.replication</name>
     <value>1</value>
   </property>
 </configuration>
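Instead of editing by hand in vi, both files can be generated from the shell; a sketch (it writes to `./conf` unless `HADOOP_CONF` is pointed at the real directory, which with this brew install is `/usr/local/Cellar/hadoop/2.4.0/libexec/etc/hadoop`):

```shell
# Sketch: write the two pseudo-distributed config files with heredocs.
# HADOOP_CONF is an assumed variable, not something Hadoop reads itself.
HADOOP_CONF=${HADOOP_CONF:-./conf}
mkdir -p "$HADOOP_CONF"

cat > "$HADOOP_CONF/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

cat > "$HADOOP_CONF/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
```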
run MapReduce job locally
hdfs file system
- rm -fr /tmp/hadoop-username; rm -fr /private/tmp/hadoop-username
- Format the filesystem: 
 $ bin/hdfs namenode -format
 "INFO common.Storage: Storage directory /tmp/hadoop-username/dfs/name has been successfully formatted."
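Formatting wipes the NameNode metadata, so a guard against accidental re-runs can help; a sketch, assuming the default `hadoop.tmp.dir` layout where a format writes `dfs/name/current/VERSION` (the helper name is my own):

```shell
# Hypothetical guard: true only if the NameNode storage directory has
# never been formatted (no VERSION file written yet).
needs_format() {
  storage_dir="$1"   # e.g. /tmp/hadoop-$USER
  [ ! -f "$storage_dir/dfs/name/current/VERSION" ]
}

# usage sketch:
# needs_format "/tmp/hadoop-$USER" && bin/hdfs namenode -format
```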
start daemon
- Start NameNode daemon and DataNode daemon: 
 $ sbin/start-dfs.sh
 Check java processes with org.apache.hadoop.hdfs.server.namenode.NameNode & org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.
 Check logs with ls -lstr libexec/logs/
 Check that localhost:9000 responds (this is the fs.defaultFS RPC endpoint, not a web UI).
- Browse the web interface for the NameNode; by default it is available at: 
 NameNode - http://localhost:50070/
hdfs command
- Make the HDFS directories required to execute MapReduce jobs: 
 $ bin/hdfs dfs -mkdir /user
 $ bin/hdfs dfs -mkdir /user/username
 $ bin/hdfs dfs -mkdir /user/username/input
 $ bin/hdfs dfs -ls /user/
 $ jps
 29398 Jps
 25959 DataNode
 25839 NameNode
 26109 SecondaryNameNode
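Rather than eyeballing the jps listing above, a small check can assert that all three HDFS daemons came up (the function name `daemons_up` is my own):

```shell
# Hypothetical check: given the output of `jps` ("PID Name" per line),
# verify that the NameNode, DataNode and SecondaryNameNode all started.
daemons_up() {
  jps_out="$1"
  for proc in NameNode DataNode SecondaryNameNode; do
    # anchor at end of line so "NameNode" does not match "SecondaryNameNode"
    echo "$jps_out" | grep -q "[0-9] $proc\$" || return 1
  done
}

# usage sketch:
# daemons_up "$(jps)" || echo "HDFS daemons missing; check libexec/logs/"
```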
run mapreduce
- Copy the input files into the distributed filesystem: 
 $ bin/hdfs dfs -put etc/hadoop input
- Run some of the examples provided: 
 $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep input output 'dfs[a-z.]+'
- Examine the output files: 
 Copy the output files from the distributed filesystem to the local filesystem and examine them:
 $ bin/hdfs dfs -get output output
 $ cat output/*
 $ bin/hdfs dfs -cat output/*
stop hdfs
- stop hdfs 
 $ sbin/stop-dfs.sh
run MapReduce job on YARN
start hdfs
- sbin/start-dfs.sh
- bin/hdfs dfs -rm -r output
- bin/hdfs dfs -rm -r input
config yarn
- etc/hadoop/mapred-site.xml:
 <configuration>
   <property>
     <name>mapreduce.framework.name</name>
     <value>yarn</value>
   </property>
 </configuration>
- etc/hadoop/yarn-site.xml:
 <configuration>
   <property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
   </property>
 </configuration>
Start ResourceManager daemon and NodeManager daemon:
- sbin/start-yarn.sh
- jps 
 99082 SecondaryNameNode
 98803 NameNode
 99215 Jps
 97753 NodeManager
 97649 ResourceManager
 98929 DataNode
- Browse the web interface for the ResourceManager; by default it is available at: 
 ResourceManager - http://localhost:8088/
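The daemons take a moment to come up, so a script that submits jobs right after start-yarn.sh may want to poll the ResourceManager UI first; a sketch using curl (the helper name and retry counts are my own):

```shell
# Hypothetical helper: poll a URL until it answers, e.g. to wait for
# http://localhost:8088/ before submitting a job. Retries once per second.
wait_for_url() {
  url="$1"; tries="${2:-10}"
  while [ "$tries" -gt 0 ]; do
    curl -sf -o /dev/null "$url" && return 0
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# usage sketch:
# wait_for_url http://localhost:8088/ 30 || echo "ResourceManager not up"
```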
run a mapreduce
- bin/hdfs dfs -put libexec/etc/hadoop input
- bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep input output 'dfs[a-z.]+'
- bin/hdfs dfs -cat /user/yinlei/output/part-r-00000
4 dfs.class
4 dfs.audit.logger
3 dfs.server.namenode.
2 dfs.audit.log.maxbackupindex
2 dfs.period
2 dfs.audit.log.maxfilesize
1 dfsmetrics.log
1 dfsadmin
1 dfs.servers
1 dfs.replication
1 dfs.file
1 dfs.data.dir
1 dfs.name.dir