Difference between revisions of "Hadoop: Sampel Dataset untuk test Hadoop"
Onnowpurbo (talk | contribs) |
Revision as of 18:51, 9 November 2015
Source: http://www.hadooplessons.info/2013/06/data-sets-for-practicing-hadoop.html
To practise Hadoop, you can use the sources below to obtain or generate large data sets (in the GB range), so that you get a real feel for the power of Hadoop.
1. clearbits.net
From clearbits.net you can get the full quarterly data dump of Stack Exchange to use while practising Hadoop. It contains around 10 GB of data.
2. grouplens.org
The dataset can be downloaded from
http://grouplens.org/datasets/movielens/
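As a rough sketch of how such a dataset might be loaded into HDFS: the script below downloads one MovieLens archive and uploads its ratings file. The archive name (ml-100k), the download URL, the file u.data and the HDFS directory /movielens are assumptions chosen for illustration and are not taken from this page; check the grouplens.org page for the current archives.

```shell
#!/bin/sh
# Assumed values for illustration only; adjust to the archive you actually want.
DATASET="ml-100k"
URL="http://files.grouplens.org/datasets/movielens/${DATASET}.zip"
HDFS_DIR="/movielens"   # hypothetical target directory in HDFS

# Download the archive, unpack it, and copy the ratings file into HDFS.
load_movielens() {
  wget -q "$URL" -O "${DATASET}.zip" &&
  unzip -o -q "${DATASET}.zip" &&
  hdfs dfs -mkdir -p "$HDFS_DIR" &&
  hdfs dfs -put -f "${DATASET}/u.data" "$HDFS_DIR/" &&
  hdfs dfs -ls "$HDFS_DIR"
}

# Only touch the network and HDFS when an HDFS client is installed.
if command -v hdfs >/dev/null 2>&1; then
  load_movielens
else
  echo "hdfs not found; would download $URL and upload u.data to $HDFS_DIR"
fi
```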
3. hadoop-examples.jar randomwriter /random-data
cd /usr/local/hadoop
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar randomwriter /random-data
Generates 10 GB of data per node in the /random-data folder in HDFS. It takes less than 1 minute on a WD Red 6TB hard disk.
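If 10 GB per node is too much for a small test machine, the amount written by each map task can probably be reduced with a -D option placed before the output directory. The sketch below assumes the property name mapreduce.randomwriter.bytespermap used by the Hadoop 2.x examples; the 100 MB value and the output directory /random-data-small are hypothetical.

```shell
#!/bin/sh
# Paths follow the installation used on this page; override HADOOP_HOME if yours differs.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
EXAMPLES_JAR="$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar"

# Print a randomwriter command for a given bytes-per-map value and output directory.
# Generic -D options must come before the output directory.
randomwriter_cmd() {
  echo "hadoop jar $EXAMPLES_JAR randomwriter -D mapreduce.randomwriter.bytespermap=$1 $2"
}

# 100 MB per map task (assumed value), written to a hypothetical directory.
cmd="$(randomwriter_cmd $((100 * 1024 * 1024)) /random-data-small)"
echo "$cmd"

# Run the job only when Hadoop is installed.
if command -v hadoop >/dev/null 2>&1; then
  $cmd
fi
```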
4. hadoop-examples.jar randomtextwriter /random-text-data
cd /usr/local/hadoop
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar randomtextwriter /random-text-data
Generates 10 GB of text data per node under the /random-text-data folder in HDFS. It takes less than 1 minute on a WD Red 6TB hard disk.
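To confirm how much data the two generators actually wrote, the directory sizes can be checked with hdfs dfs -du. This is a minimal sketch that assumes both jobs were run with the output directories shown above.

```shell
#!/bin/sh
# Output directories used by the randomwriter and randomtextwriter runs above.
DIRS="/random-data /random-text-data"

# Print the command that summarises the size of one HDFS directory.
du_cmd() {
  echo "hdfs dfs -du -s -h $1"
}

for d in $DIRS; do
  if command -v hdfs >/dev/null 2>&1; then
    # -s gives one summary line per directory, -h prints human-readable sizes.
    $(du_cmd "$d")
  else
    echo "would run: $(du_cmd "$d")"
  fi
done
```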
5. Amazon
Amazon provides many public data sets that you can use.
6. Stack Overflow
Check the answers to the same question on Stack Overflow.
7. University of Waikato
Many data sets are available for practising machine learning.
8. Quora
See the answers to a similar question on Quora.
If you know of any free data sets, please share them in the comments.