diff --git a/Hadoop-Benchmark.md b/Hadoop-Benchmark.md
new file mode 100644
index 0000000..7a82555
--- /dev/null
+++ b/Hadoop-Benchmark.md
@@ -0,0 +1,48 @@
+## Setup Hadoop Benchmark
+
+Here are my steps. First, download the Hadoop 2.10.0 binary, untar it, and cd into the hadoop directory.
+```
+wget http://apache.mirrors.hoobly.com/hadoop/common/hadoop-2.10.0/hadoop-2.10.0.tar.gz
+tar xvf hadoop-2.10.0.tar.gz
+cd hadoop-2.10.0
+```
+
+Modify the file `./etc/hadoop/core-site.xml`:
+
+```
+<configuration>
+    <property>
+        <name>fs.seaweedfs.impl</name>
+        <value>seaweed.hdfs.SeaweedFileSystem</value>
+    </property>
+    <property>
+        <name>fs.defaultFS</name>
+        <value>seaweedfs://localhost:8888</value>
+    </property>
+</configuration>
+```
+
+Then download the SeaweedFS Hadoop client jar into `share/hadoop/common/lib/`.
+
+```
+cd share/hadoop/common/lib/
+wget https://oss.sonatype.org/service/local/repositories/releases/content/com/github/chrislusf/seaweedfs-hadoop2-client/1.3.2/seaweedfs-hadoop2-client-1.3.2.jar
+```
+
+Back in the `hadoop-2.10.0` directory, start the TestDFSIO write test:
+
+```
+bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.10.0-tests.jar TestDFSIO -write -nrFiles 64 -fileSize 16GB -resFile /tmp/TestDFSIOwrite.txt
+
+...
+
+20/07/14 10:41:02 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
+20/07/14 10:41:02 INFO fs.TestDFSIO:             Date & time: Tue Jul 14 10:41:02 PDT 2020
+20/07/14 10:41:02 INFO fs.TestDFSIO:         Number of files: 64
+20/07/14 10:41:02 INFO fs.TestDFSIO:  Total MBytes processed: 1048576
+20/07/14 10:41:02 INFO fs.TestDFSIO:       Throughput mb/sec: 381.28
+20/07/14 10:41:02 INFO fs.TestDFSIO:  Average IO rate mb/sec: 383.42
+20/07/14 10:41:02 INFO fs.TestDFSIO:   IO rate std deviation: 28.81
+20/07/14 10:41:02 INFO fs.TestDFSIO:      Test exec time sec: 2756.42
+20/07/14 10:41:02 INFO fs.TestDFSIO:
+```
\ No newline at end of file
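
A quick sanity check on the summary in the added page: the `Total MBytes processed` figure should equal the number of files times the per-file size. The sketch below assumes `-fileSize` is given in whole GB and that 1 GB is counted as 1024 MB. `total_mbytes` is a hypothetical helper name used only for illustration; it is not part of Hadoop or SeaweedFS.

```shell
#!/bin/sh
# Hypothetical helper: expected "Total MBytes processed" for a TestDFSIO run.
# Assumes the per-file size is a whole number of GB and 1 GB = 1024 MB.
total_mbytes() {
    nr_files=$1
    file_size_gb=$2
    echo $(( nr_files * file_size_gb * 1024 ))
}

# Values from the command line in the page: -nrFiles 64 -fileSize 16GB
total_mbytes 64 16
```

With the values from the page this prints `1048576`, which matches the `Total MBytes processed` line in the log.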