Hadoop: Running a MapReduce Job - WordCount

From OnnoWiki

Source: http://wiki.apache.org/hadoop/WordCount

Create Simple Data Files

cd ~
touch file01
touch file02
echo "Hello World Bye World" > file01
echo "Hello Hadoop Goodbye Hadoop" > file02
hadoop fs -mkdir -p /user/hduser/input
hadoop fs -put file01 file02 /user/hduser/input/

Check

hadoop fs -ls /user/hduser/input/
Found 2 items
-rw-r--r--   1 hduser supergroup         22 2015-11-09 17:28 /user/hduser/input/file01
-rw-r--r--   1 hduser supergroup         28 2015-11-09 17:28 /user/hduser/input/file02
hadoop fs -cat /user/hduser/input/file01
Hello World Bye World
hadoop fs -cat /user/hduser/input/file02
Hello Hadoop Goodbye Hadoop
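
The sizes in the listing above (22 and 28 bytes) can be confirmed locally before uploading. The following is a minimal sketch that recreates the two files in a temporary directory (the `tmpdir` variable is only for illustration) and counts their bytes; `echo` appends a newline, which accounts for the extra byte in each file.

```shell
# Recreate the sample files in a scratch directory (illustrative location)
tmpdir=$(mktemp -d)
echo "Hello World Bye World" > "$tmpdir/file01"
echo "Hello Hadoop Goodbye Hadoop" > "$tmpdir/file02"

# wc -c counts bytes; tr strips the padding some platforms add
size01=$(wc -c < "$tmpdir/file01" | tr -d ' ')
size02=$(wc -c < "$tmpdir/file02" | tr -d ' ')

echo "file01: $size01 bytes"
echo "file02: $size02 bytes"
```

If the sizes reported by `hadoop fs -ls` differ from these, the upload probably picked up a different file.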

Run WordCount

Example

cd /usr/local/hadoop
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount <in-dir> [<in-dir>...] <out-dir>

Run

cd /usr/local/hadoop
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /user/hduser/input /user/hduser/output
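
A MapReduce job refuses to start if its output directory already exists in HDFS, so a rerun of the command above fails until /user/hduser/output is removed. The sketch below wraps that cleanup in a small helper. `clean_output` is a hypothetical name, not part of Hadoop; it relies only on `hadoop fs -test -d` and `hadoop fs -rm -r`.

```shell
# Hypothetical helper (not part of Hadoop): remove an HDFS directory
# only if it exists, so the next job run can create it again.
clean_output() {
    out_dir="$1"
    # -test -d exits with status 0 when the path is an existing directory
    if hadoop fs -test -d "$out_dir"; then
        hadoop fs -rm -r "$out_dir"
    fi
}
```

Calling `clean_output /user/hduser/output` right before the `hadoop jar` line makes the run repeatable. Note that it deletes earlier results without asking.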

Copy the Results

cd ~
hadoop fs -copyToLocal /user/hduser/output .
more output/part-r-*

Result

Bye	1
Goodbye	1
Hadoop	2
Hello	2
World	2
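
For an input this small, the result can be cross-checked without Hadoop. The pipeline below is a rough local equivalent, not the Hadoop implementation: it splits on whitespace, sorts in byte order, and counts duplicates. The temporary directory and the `counts` variable are only for illustration.

```shell
# Recreate the two sample files in a scratch directory
tmpdir=$(mktemp -d)
echo "Hello World Bye World" > "$tmpdir/file01"
echo "Hello Hadoop Goodbye Hadoop" > "$tmpdir/file02"

# One word per line, sort in byte order, count duplicates,
# then print "word<TAB>count" like the part-r-* files
counts=$(cat "$tmpdir/file01" "$tmpdir/file02" \
    | tr -s '[:space:]' '\n' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ printf "%s\t%s\n", $2, $1 }')

printf '%s\n' "$counts"
```

The output should match the five lines listed above. This approach only makes sense for data that fits on one machine; it is a sanity check, not a replacement for the job.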


Test with a 10 GB Dataset

Create the dataset

cd /usr/local/hadoop
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar randomtextwriter /random-text-data
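
The 10 GB figure appears to follow from the randomtextwriter defaults. Assuming the documented defaults of 10 map tasks per host and 1 GiB written per map (worth checking against your Hadoop version), the volume per node works out as in this small calculation. The variable names are illustrative, not Hadoop settings.

```shell
# Assumed defaults of randomtextwriter (verify for your version)
maps_per_host=10
bytes_per_map=$((1024 * 1024 * 1024))   # 1 GiB

# Data written per node = maps per host * bytes per map
total_bytes=$((maps_per_host * bytes_per_map))
total_gib=$((total_bytes / 1024 / 1024 / 1024))

echo "Per node: $total_bytes bytes ($total_gib GiB)"
```

On a cluster with several nodes the total grows accordingly, so check free HDFS space before running the generator.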

Analyze

hadoop fs -rm /user/hduser/output/*
hadoop fs -rmdir /user/hduser/output
cd /usr/local/hadoop
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /random-text-data /user/hduser/output

References