Hadoop & Mapreduce

Benchmarking HDFS

윤's군 2013. 10. 21. 17:23

The tests are run with the hadoop-test-*.jar file, which contains the TestDFSIO benchmark.


Write test

$ bin/hadoop jar hadoop-test-1.2.1.jar TestDFSIO -write -nrFiles 5 -fileSize 100


The -write option benchmarks write performance.

The -nrFiles option sets how many files are created.

The -fileSize option sets the size of each file, in megabytes (MB).
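
As a rough illustration of how these options fit together, the sketch below assembles the same command line and works out how much data the run will write in total. The helper name `build_testdfsio_command` is hypothetical and exists only for this example; it uses nothing beyond the flags shown in the command above.

```python
def build_testdfsio_command(mode, nr_files, file_size_mb, jar="hadoop-test-1.2.1.jar"):
    """Return the argument list for a TestDFSIO run (hypothetical helper)."""
    if mode not in ("write", "read"):
        raise ValueError("mode must be 'write' or 'read'")
    return [
        "bin/hadoop", "jar", jar, "TestDFSIO",
        "-" + mode,
        "-nrFiles", str(nr_files),
        "-fileSize", str(file_size_mb),
    ]


nr_files = 5
file_size_mb = 100

cmd = build_testdfsio_command("write", nr_files, file_size_mb)
total_mb = nr_files * file_size_mb  # each file is file_size_mb megabytes

print(" ".join(cmd))
print("total data:", total_mb, "MB")
```

With 5 files of 100 MB each, the run covers 500 MB, which matches the "Total MBytes processed" figure TestDFSIO reports at the end.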



Tested on Ubuntu 13.04, Java 1.7.40, and Hadoop 1.2.1, running in pseudo-distributed mode.


$ bin/hadoop jar hadoop-test-1.2.1.jar TestDFSIO -write -nrFiles 5 -fileSize 100

Warning: $HADOOP_HOME is deprecated.


TestDFSIO.0.0.4

13/10/21 17:09:44 INFO fs.TestDFSIO: nrFiles = 5

13/10/21 17:09:44 INFO fs.TestDFSIO: fileSize (MB) = 100

13/10/21 17:09:44 INFO fs.TestDFSIO: bufferSize = 1000000

13/10/21 17:09:45 INFO fs.TestDFSIO: creating control file: 100 mega bytes, 5 files

13/10/21 17:09:45 INFO fs.TestDFSIO: created control files for: 5 files

13/10/21 17:09:45 INFO mapred.FileInputFormat: Total input paths to process : 5

13/10/21 17:09:46 INFO mapred.JobClient: Running job: job_201310211707_0001

13/10/21 17:09:47 INFO mapred.JobClient:  map 0% reduce 0%

13/10/21 17:09:55 INFO mapred.JobClient:  map 20% reduce 0%

13/10/21 17:09:56 INFO mapred.JobClient:  map 40% reduce 0%

13/10/21 17:10:02 INFO mapred.JobClient:  map 80% reduce 0%

13/10/21 17:10:06 INFO mapred.JobClient:  map 100% reduce 0%

13/10/21 17:10:11 INFO mapred.JobClient:  map 100% reduce 100%

13/10/21 17:10:12 INFO mapred.JobClient: Job complete: job_201310211707_0001

13/10/21 17:10:12 INFO mapred.JobClient: Counters: 30

13/10/21 17:10:12 INFO mapred.JobClient:   Job Counters 

13/10/21 17:10:12 INFO mapred.JobClient:     Launched reduce tasks=1

13/10/21 17:10:12 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=28927

13/10/21 17:10:12 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

13/10/21 17:10:12 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0

13/10/21 17:10:12 INFO mapred.JobClient:     Launched map tasks=5

13/10/21 17:10:12 INFO mapred.JobClient:     Data-local map tasks=5

13/10/21 17:10:12 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=15903

13/10/21 17:10:12 INFO mapred.JobClient:   File Input Format Counters 

13/10/21 17:10:12 INFO mapred.JobClient:     Bytes Read=560

13/10/21 17:10:12 INFO mapred.JobClient:   File Output Format Counters 

13/10/21 17:10:12 INFO mapred.JobClient:     Bytes Written=76

13/10/21 17:10:12 INFO mapred.JobClient:   FileSystemCounters

13/10/21 17:10:12 INFO mapred.JobClient:     FILE_BYTES_READ=428

13/10/21 17:10:12 INFO mapred.JobClient:     HDFS_BYTES_READ=1180

13/10/21 17:10:12 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=343260

13/10/21 17:10:12 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=524288076

13/10/21 17:10:12 INFO mapred.JobClient:   Map-Reduce Framework

13/10/21 17:10:12 INFO mapred.JobClient:     Map output materialized bytes=452

13/10/21 17:10:12 INFO mapred.JobClient:     Map input records=5

13/10/21 17:10:12 INFO mapred.JobClient:     Reduce shuffle bytes=452

13/10/21 17:10:12 INFO mapred.JobClient:     Spilled Records=50

13/10/21 17:10:12 INFO mapred.JobClient:     Map output bytes=372

13/10/21 17:10:12 INFO mapred.JobClient:     Total committed heap usage (bytes)=901251072

13/10/21 17:10:12 INFO mapred.JobClient:     CPU time spent (ms)=7590

13/10/21 17:10:12 INFO mapred.JobClient:     Map input bytes=130

13/10/21 17:10:12 INFO mapred.JobClient:     SPLIT_RAW_BYTES=620

13/10/21 17:10:12 INFO mapred.JobClient:     Combine input records=0

13/10/21 17:10:12 INFO mapred.JobClient:     Reduce input records=25

13/10/21 17:10:12 INFO mapred.JobClient:     Reduce input groups=5

13/10/21 17:10:12 INFO mapred.JobClient:     Combine output records=0

13/10/21 17:10:12 INFO mapred.JobClient:     Physical memory (bytes) snapshot=1018044416

13/10/21 17:10:12 INFO mapred.JobClient:     Reduce output records=5

13/10/21 17:10:12 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2375225344

13/10/21 17:10:12 INFO mapred.JobClient:     Map output records=25

13/10/21 17:10:12 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write

13/10/21 17:10:12 INFO fs.TestDFSIO:            Date & time: Mon Oct 21 17:10:12 KST 2013

13/10/21 17:10:12 INFO fs.TestDFSIO:        Number of files: 5

13/10/21 17:10:12 INFO fs.TestDFSIO: Total MBytes processed: 500

13/10/21 17:10:12 INFO fs.TestDFSIO:      Throughput mb/sec: 43.8288920056101

13/10/21 17:10:12 INFO fs.TestDFSIO: Average IO rate mb/sec: 44.16706466674805

13/10/21 17:10:12 INFO fs.TestDFSIO:  IO rate std deviation: 3.8464819169416016

13/10/21 17:10:12 INFO fs.TestDFSIO:     Test exec time sec: 27.048

13/10/21 17:10:12 INFO fs.TestDFSIO: 
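
The summary lines at the end of the log can be pulled out with a few lines of Python. This is only a sketch: `parse_summary` is a hypothetical helper, and the sample text is copied from the write output above. It also assumes that TestDFSIO reports throughput as total megabytes divided by the summed I/O time of the map tasks, so dividing the total by the throughput recovers that summed time.

```python
import re

# Summary lines copied from the write run above.
SAMPLE = """\
13/10/21 17:10:12 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
13/10/21 17:10:12 INFO fs.TestDFSIO:        Number of files: 5
13/10/21 17:10:12 INFO fs.TestDFSIO: Total MBytes processed: 500
13/10/21 17:10:12 INFO fs.TestDFSIO:      Throughput mb/sec: 43.8288920056101
13/10/21 17:10:12 INFO fs.TestDFSIO: Average IO rate mb/sec: 44.16706466674805
13/10/21 17:10:12 INFO fs.TestDFSIO:  IO rate std deviation: 3.8464819169416016
13/10/21 17:10:12 INFO fs.TestDFSIO:     Test exec time sec: 27.048
"""

# Matches "<label>: <number>" after the TestDFSIO logger prefix.
LINE = re.compile(r"INFO fs\.TestDFSIO:\s+(?P<key>[^:]+):\s+(?P<value>[\d.]+)\s*$")


def parse_summary(text):
    """Collect the numeric summary fields into a dict (hypothetical helper)."""
    result = {}
    for line in text.splitlines():
        match = LINE.search(line)
        if match:
            result[match.group("key").strip()] = float(match.group("value"))
    return result


summary = parse_summary(SAMPLE)

# Assumption: throughput = total MB / summed I/O time of all map tasks.
io_seconds = summary["Total MBytes processed"] / summary["Throughput mb/sec"]

print(summary)
print("summed I/O time (s):", round(io_seconds, 3))
```

Under that assumption the map tasks spent about 11.4 seconds in total on actual I/O, while "Test exec time sec" (27.048) is the elapsed time of the whole MapReduce job, including job startup and the reduce phase.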



Read test

$ bin/hadoop jar hadoop-test-1.2.1.jar TestDFSIO -read -nrFiles 5 -fileSize 100


hadoop@hadoop-VirtualBox:/usr/local/hadoop-1.2.1$ bin/hadoop jar hadoop-test-1.2.1.jar TestDFSIO -read -nrFiles 5 -fileSize 100

Warning: $HADOOP_HOME is deprecated.


TestDFSIO.0.0.4

13/10/21 17:22:13 INFO fs.TestDFSIO: nrFiles = 5

13/10/21 17:22:14 INFO fs.TestDFSIO: fileSize (MB) = 100

13/10/21 17:22:14 INFO fs.TestDFSIO: bufferSize = 1000000

13/10/21 17:22:14 INFO fs.TestDFSIO: creating control file: 100 mega bytes, 5 files

13/10/21 17:22:14 INFO fs.TestDFSIO: created control files for: 5 files

13/10/21 17:22:14 INFO mapred.FileInputFormat: Total input paths to process : 5

13/10/21 17:22:15 INFO mapred.JobClient: Running job: job_201310211707_0002

13/10/21 17:22:16 INFO mapred.JobClient:  map 0% reduce 0%

13/10/21 17:22:24 INFO mapred.JobClient:  map 40% reduce 0%

13/10/21 17:22:30 INFO mapred.JobClient:  map 60% reduce 0%

13/10/21 17:22:31 INFO mapred.JobClient:  map 80% reduce 0%

13/10/21 17:22:34 INFO mapred.JobClient:  map 100% reduce 26%

13/10/21 17:22:41 INFO mapred.JobClient:  map 100% reduce 100%

13/10/21 17:22:41 INFO mapred.JobClient: Job complete: job_201310211707_0002

13/10/21 17:22:41 INFO mapred.JobClient: Counters: 30

13/10/21 17:22:41 INFO mapred.JobClient:   Job Counters 

13/10/21 17:22:41 INFO mapred.JobClient:     Launched reduce tasks=1

13/10/21 17:22:41 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=29659

13/10/21 17:22:41 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

13/10/21 17:22:41 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0

13/10/21 17:22:41 INFO mapred.JobClient:     Launched map tasks=5

13/10/21 17:22:41 INFO mapred.JobClient:     Data-local map tasks=5

13/10/21 17:22:41 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=16969

13/10/21 17:22:41 INFO mapred.JobClient:   File Input Format Counters 

13/10/21 17:22:41 INFO mapred.JobClient:     Bytes Read=560

13/10/21 17:22:41 INFO mapred.JobClient:   File Output Format Counters 

13/10/21 17:22:41 INFO mapred.JobClient:     Bytes Written=77

13/10/21 17:22:41 INFO mapred.JobClient:   FileSystemCounters

13/10/21 17:22:41 INFO mapred.JobClient:     FILE_BYTES_READ=432

13/10/21 17:22:41 INFO mapred.JobClient:     HDFS_BYTES_READ=524289180

13/10/21 17:22:41 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=343256

13/10/21 17:22:41 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=77

13/10/21 17:22:41 INFO mapred.JobClient:   Map-Reduce Framework

13/10/21 17:22:41 INFO mapred.JobClient:     Map output materialized bytes=456

13/10/21 17:22:41 INFO mapred.JobClient:     Map input records=5

13/10/21 17:22:41 INFO mapred.JobClient:     Reduce shuffle bytes=456

13/10/21 17:22:41 INFO mapred.JobClient:     Spilled Records=50

13/10/21 17:22:41 INFO mapred.JobClient:     Map output bytes=376

13/10/21 17:22:41 INFO mapred.JobClient:     Total committed heap usage (bytes)=755499008

13/10/21 17:22:41 INFO mapred.JobClient:     CPU time spent (ms)=5390

13/10/21 17:22:41 INFO mapred.JobClient:     Map input bytes=130

13/10/21 17:22:41 INFO mapred.JobClient:     SPLIT_RAW_BYTES=620

13/10/21 17:22:41 INFO mapred.JobClient:     Combine input records=0

13/10/21 17:22:41 INFO mapred.JobClient:     Reduce input records=25

13/10/21 17:22:41 INFO mapred.JobClient:     Reduce input groups=5

13/10/21 17:22:41 INFO mapred.JobClient:     Combine output records=0

13/10/21 17:22:41 INFO mapred.JobClient:     Physical memory (bytes) snapshot=859947008

13/10/21 17:22:41 INFO mapred.JobClient:     Reduce output records=5

13/10/21 17:22:41 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2361077760

13/10/21 17:22:41 INFO mapred.JobClient:     Map output records=25

13/10/21 17:22:41 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read

13/10/21 17:22:41 INFO fs.TestDFSIO:            Date & time: Mon Oct 21 17:22:41 KST 2013

13/10/21 17:22:41 INFO fs.TestDFSIO:        Number of files: 5

13/10/21 17:22:41 INFO fs.TestDFSIO: Total MBytes processed: 500

13/10/21 17:22:41 INFO fs.TestDFSIO:      Throughput mb/sec: 53.15190815350271

13/10/21 17:22:41 INFO fs.TestDFSIO: Average IO rate mb/sec: 64.6701889038086

13/10/21 17:22:41 INFO fs.TestDFSIO:  IO rate std deviation: 35.92993218085322

13/10/21 17:22:41 INFO fs.TestDFSIO:     Test exec time sec: 26.753

13/10/21 17:22:41 INFO fs.TestDFSIO: 
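
To put the two runs side by side, the following sketch takes the figures reported in the logs above and computes the read/write throughput ratio along with the relative spread of the per-file I/O rates (standard deviation divided by the average). The numbers are copied from the output; the dictionary keys and the `relative_spread` function are names chosen just for this illustration.

```python
# Figures copied from the TestDFSIO summaries of the two runs above.
write = {
    "throughput": 43.8288920056101,
    "avg_io_rate": 44.16706466674805,
    "std_dev": 3.8464819169416016,
}
read = {
    "throughput": 53.15190815350271,
    "avg_io_rate": 64.6701889038086,
    "std_dev": 35.92993218085322,
}


def relative_spread(run):
    """Standard deviation of the per-file I/O rate relative to its average."""
    return run["std_dev"] / run["avg_io_rate"]


ratio = read["throughput"] / write["throughput"]

print("read/write throughput ratio:", round(ratio, 2))
print("write relative spread:", round(relative_spread(write), 2))
print("read relative spread:", round(relative_spread(read), 2))
```

Read throughput comes out at roughly 1.2 times the write throughput, but the per-file read rates vary far more than the write rates (a relative spread of about 0.56 against about 0.09). With only five 100 MB files on a single VirtualBox machine, the read figure is best taken as a rough indication.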


