From 3474f6cd067dbaf0977d5bde70b69932a82db3f7 Mon Sep 17 00:00:00 2001
From: Chris Lu
Date: Wed, 15 Jul 2020 11:42:20 -0700
Subject: [PATCH] Updated Hadoop Benchmark (markdown)

---
 Hadoop-Benchmark.md | 140 ++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 128 insertions(+), 12 deletions(-)

diff --git a/Hadoop-Benchmark.md b/Hadoop-Benchmark.md
index 7a82555..f4afdea 100644
--- a/Hadoop-Benchmark.md
+++ b/Hadoop-Benchmark.md
@@ -1,4 +1,4 @@
-## Setup Hadoop Benchmark
+# Setup Hadoop Benchmark
 
 Here are my steps. First, checkout hadoop 2.10.0 binary, untar, and cd in to the hadoop directory.
 ```
@@ -29,20 +29,136 @@ cd share/hadoop/common/lib/
 wget https://oss.sonatype.org/service/local/repositories/releases/content/com/github/chrislusf/seaweedfs-hadoop2-client/1.3.2/seaweedfs-hadoop2-client-1.3.2.jar
 ```
 
-Start the TestDFSIO tests:
+# TestDFSIO Benchmark
+
+The TestDFSIO benchmark is used for measuring I/O (read/write) performance.
+
+Start the TestDFSIO write tests:
 
 ```
-bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.10.0-tests.jar TestDFSIO -write -nrFiles 64 -fileSize 16GB -resFile /tmp/TestDFSIOwrite.txt
+bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.10.0-tests.jar TestDFSIO -write -nrFiles 16 -fileSize 16GB -resFile /tmp/TestDFSIOwrite.txt
+
+...
+20/07/14 18:27:34 INFO mapreduce.Job:  map 100% reduce 100%
+20/07/14 18:27:34 INFO mapreduce.Job: Job job_local485352420_0001 completed successfully
+20/07/14 18:27:34 INFO mapreduce.Job: Counters: 35
+	File System Counters
+		FILE: Number of bytes read=27928633
+		FILE: Number of bytes written=35966385
+		FILE: Number of read operations=0
+		FILE: Number of large read operations=0
+		FILE: Number of write operations=0
+		SEAWEEDFS: Number of bytes read=17111
+		SEAWEEDFS: Number of bytes written=2611340146618
+		SEAWEEDFS: Number of read operations=0
+		SEAWEEDFS: Number of large read operations=0
+		SEAWEEDFS: Number of write operations=0
+	Map-Reduce Framework
+		Map input records=16
+		Map output records=80
+		Map output bytes=1276
+		Map output materialized bytes=1532
+		Input split bytes=2054
+		Combine input records=0
+		Combine output records=0
+		Reduce input groups=5
+		Reduce shuffle bytes=1532
+		Reduce input records=80
+		Reduce output records=5
+		Spilled Records=160
+		Shuffled Maps =16
+		Failed Shuffles=0
+		Merged Map outputs=16
+		GC time elapsed (ms)=151632
+		Total committed heap usage (bytes)=39777730560
+	Shuffle Errors
+		BAD_ID=0
+		CONNECTION=0
+		IO_ERROR=0
+		WRONG_LENGTH=0
+		WRONG_MAP=0
+		WRONG_REDUCE=0
+	File Input Format Counters
+		Bytes Read=1798
+	File Output Format Counters
+		Bytes Written=84
+20/07/14 18:27:34 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
+20/07/14 18:27:34 INFO fs.TestDFSIO: Date & time: Tue Jul 14 18:27:34 PDT 2020
+20/07/14 18:27:34 INFO fs.TestDFSIO: Number of files: 16
+20/07/14 18:27:34 INFO fs.TestDFSIO: Total MBytes processed: 262144
+20/07/14 18:27:34 INFO fs.TestDFSIO: Throughput mb/sec: 310.47
+20/07/14 18:27:34 INFO fs.TestDFSIO: Average IO rate mb/sec: 315.63
+20/07/14 18:27:34 INFO fs.TestDFSIO: IO rate std deviation: 43.43
+20/07/14 18:27:34 INFO fs.TestDFSIO: Test exec time sec: 847.32
+20/07/14 18:27:34 INFO fs.TestDFSIO:
+```
+
+Start the TestDFSIO read tests:
+
+```
+bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.10.0-tests.jar TestDFSIO -read -nrFiles 64 -fileSize 16GB -resFile /tmp/TestDFSIOwrite.txt
 
 ...
-20/07/14 10:41:02 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
-20/07/14 10:41:02 INFO fs.TestDFSIO: Date & time: Tue Jul 14 10:41:02 PDT 2020
-20/07/14 10:41:02 INFO fs.TestDFSIO: Number of files: 64
-20/07/14 10:41:02 INFO fs.TestDFSIO: Total MBytes processed: 1048576
-20/07/14 10:41:02 INFO fs.TestDFSIO: Throughput mb/sec: 381.28
-20/07/14 10:41:02 INFO fs.TestDFSIO: Average IO rate mb/sec: 383.42
-20/07/14 10:41:02 INFO fs.TestDFSIO: IO rate std deviation: 28.81
-20/07/14 10:41:02 INFO fs.TestDFSIO: Test exec time sec: 2756.42
-20/07/14 10:41:02 INFO fs.TestDFSIO:
+20/07/14 21:48:00 INFO mapreduce.Job: Counters: 35
+	File System Counters
+		FILE: Number of bytes read=27928585
+		FILE: Number of bytes written=36004955
+		FILE: Number of read operations=0
+		FILE: Number of large read operations=0
+		FILE: Number of write operations=0
+		SEAWEEDFS: Number of bytes read=2611340133079
+		SEAWEEDFS: Number of bytes written=30649
+		SEAWEEDFS: Number of read operations=0
+		SEAWEEDFS: Number of large read operations=0
+		SEAWEEDFS: Number of write operations=0
+	Map-Reduce Framework
+		Map input records=16
+		Map output records=80
+		Map output bytes=1252
+		Map output materialized bytes=1508
+		Input split bytes=2054
+		Combine input records=0
+		Combine output records=0
+		Reduce input groups=5
+		Reduce shuffle bytes=1508
+		Reduce input records=80
+		Reduce output records=5
+		Spilled Records=160
+		Shuffled Maps =16
+		Failed Shuffles=0
+		Merged Map outputs=16
+		GC time elapsed (ms)=145687
+		Total committed heap usage (bytes)=38852886528
+	Shuffle Errors
+		BAD_ID=0
+		CONNECTION=0
+		IO_ERROR=0
+		WRONG_LENGTH=0
+		WRONG_MAP=0
+		WRONG_REDUCE=0
+	File Input Format Counters
+		Bytes Read=1798
+	File Output Format Counters
+		Bytes Written=83
+20/07/14 21:48:00 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
+20/07/14 21:48:00 INFO fs.TestDFSIO: Date & time: Tue Jul 14 21:48:00 PDT 2020
+20/07/14 21:48:00 INFO fs.TestDFSIO: Number of files: 16
+20/07/14 21:48:00 INFO fs.TestDFSIO: Total MBytes processed: 262144
+20/07/14 21:48:00 INFO fs.TestDFSIO: Throughput mb/sec: 22.14
+20/07/14 21:48:00 INFO fs.TestDFSIO: Average IO rate mb/sec: 22.91
+20/07/14 21:48:00 INFO fs.TestDFSIO: IO rate std deviation: 3.79
+20/07/14 21:48:00 INFO fs.TestDFSIO: Test exec time sec: 11871.4
+20/07/14 21:48:00 INFO fs.TestDFSIO:
+
+```
+
+# MRbench Benchmark
+
+```
+bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.10.0-tests.jar mrbench -inputLines 10000000 -inputType random -maps 10 -reduces 5
+
+...
+
+
 ```
\ No newline at end of file
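
Reviewer note: the TestDFSIO write summary in the patch can be sanity-checked with a few lines of Python. This is only a minimal sketch, not part of the change itself. The helper name `parse_summary` is hypothetical, the log lines are copied from the write run above, and the check assumes that `-fileSize 16GB` means 16 × 1024 MB per file (which matches the reported total). The wall-clock rate computed at the end is total MB divided by test execution time; it is shown for comparison only and is not necessarily how TestDFSIO derives its own "Throughput mb/sec" figure.

```python
import re

# Summary lines copied from the write run recorded in the patch.
WRITE_SUMMARY = """\
20/07/14 18:27:34 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
20/07/14 18:27:34 INFO fs.TestDFSIO: Date & time: Tue Jul 14 18:27:34 PDT 2020
20/07/14 18:27:34 INFO fs.TestDFSIO: Number of files: 16
20/07/14 18:27:34 INFO fs.TestDFSIO: Total MBytes processed: 262144
20/07/14 18:27:34 INFO fs.TestDFSIO: Throughput mb/sec: 310.47
20/07/14 18:27:34 INFO fs.TestDFSIO: Average IO rate mb/sec: 315.63
20/07/14 18:27:34 INFO fs.TestDFSIO: IO rate std deviation: 43.43
20/07/14 18:27:34 INFO fs.TestDFSIO: Test exec time sec: 847.32
"""

# Matches only "<label>: <number>" fields; the header and date lines are skipped.
NUMERIC_FIELD = re.compile(r"fs\.TestDFSIO:\s*([^:]+):\s*([\d.]+)\s*$")


def parse_summary(text):
    """Hypothetical helper: return the numeric summary fields as floats."""
    fields = {}
    for line in text.splitlines():
        match = NUMERIC_FIELD.search(line)
        if match:
            fields[match.group(1).strip()] = float(match.group(2))
    return fields


fields = parse_summary(WRITE_SUMMARY)

# Assumption: 16 files x 16 GB each, with 1 GB counted as 1024 MB.
expected_total_mb = 16 * 16 * 1024
assert fields["Total MBytes processed"] == expected_total_mb

# Wall-clock aggregate rate, for comparison with the reported throughput.
wall_clock_rate = fields["Total MBytes processed"] / fields["Test exec time sec"]
print(f"reported throughput: {fields['Throughput mb/sec']} MB/s")
print(f"total MB / exec time: {wall_clock_rate:.2f} MB/s")
```

The same helper can be pointed at the read summary by swapping in those log lines; only the numeric fields are compared, so the differing timestamps do not matter.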