[cloudera@quickstart ~]$ wget http://tiny.cloudera.com/hadoopTutorialSample
--2018-04-13 16:25:31-- http://tiny.cloudera.com/hadoopTutorialSample
Resolving proxy.fing.edu.uy... 164.73.32.12, 164.73.32.11
Connecting to proxy.fing.edu.uy|164.73.32.12|:3128... connected.
Proxy request sent, awaiting response... 302 Found
Location: http://www.cloudera.com/content/cloudera/en/documentation/shared/samples/hadoopTutorial.tar.gz [following]
--2018-04-13 16:25:32-- http://www.cloudera.com/content/cloudera/en/documentation/shared/samples/hadoopTutorial.tar.gz
Connecting to proxy.fing.edu.uy|164.73.32.12|:3128... connected.
Proxy request sent, awaiting response... 301 Moved Permanently
Location: https://www.cloudera.com/content/cloudera/en/documentation/shared/samples/hadoopTutorial.tar.gz [following]
--2018-04-13 16:25:33-- https://www.cloudera.com/content/cloudera/en/documentation/shared/samples/hadoopTutorial.tar.gz
Connecting to proxy.fing.edu.uy|164.73.32.12|:3128... connected.
Proxy request sent, awaiting response... 301 Moved Permanently
Location: https://www.cloudera.com/content/www/en-us/documentation/other/shared/samples/hadoopTutorial.tar.gz [following]
--2018-04-13 16:25:34-- https://www.cloudera.com/content/www/en-us/documentation/other/shared/samples/hadoopTutorial.tar.gz
Connecting to proxy.fing.edu.uy|164.73.32.12|:3128... connected.
Proxy request sent, awaiting response... 301 Moved Permanently
Location: http://www.cloudera.com/documentation/other/shared/samples/hadoopTutorial.tar.gz [following]
--2018-04-13 16:25:34-- http://www.cloudera.com/documentation/other/shared/samples/hadoopTutorial.tar.gz
Connecting to proxy.fing.edu.uy|164.73.32.12|:3128... connected.
Proxy request sent, awaiting response... 200 OK
Length: 19214 (19K) [application/x-gzip]
Saving to: “hadoopTutorialSample”
100%[========================================================================>] 19,214 87.3K/s in 0.2s
2018-04-13 16:25:35 (87.3 KB/s) - “hadoopTutorialSample” saved [19214/19214]
[cloudera@quickstart ~]$ ls hadoopTutorialSample
hadoopTutorialSample
[cloudera@quickstart ~]$ mv hadoopTutorialSample hadoopTutorialSample.tar.gz
[cloudera@quickstart ~]$ gzip -d hadoopTutorialSample.tar.gz
[cloudera@quickstart ~]$ tar -xvf hadoopTutorialSample.tar
hadoop_tutorial/
hadoop_tutorial/WordCount3/
hadoop_tutorial/WordCount2/
hadoop_tutorial/WordCount1/
hadoop_tutorial/WordCount3/build/
hadoop_tutorial/WordCount2/build/
hadoop_tutorial/WordCount1/build/
hadoop_tutorial/WordCount3/build/org/
hadoop_tutorial/WordCount2/build/org/
hadoop_tutorial/WordCount1/build/org/
hadoop_tutorial/WordCount3/build/org/myorg/
hadoop_tutorial/WordCount2/build/org/myorg/
hadoop_tutorial/WordCount1/build/org/myorg/
hadoop_tutorial/WordCount3/WordCount.java
hadoop_tutorial/WordCount3/wordcount.jar
hadoop_tutorial/WordCount3/Makefile
hadoop_tutorial/WordCount3/stop_words.text
hadoop_tutorial/WordCount2/Makefile
hadoop_tutorial/WordCount2/wordcount.jar
hadoop_tutorial/WordCount2/WordCount.java
hadoop_tutorial/WordCount1/wordcount.jar
hadoop_tutorial/WordCount1/Makefile
hadoop_tutorial/WordCount1/file2
hadoop_tutorial/WordCount1/file1
hadoop_tutorial/WordCount1/file0
hadoop_tutorial/WordCount1/WordCount.java
hadoop_tutorial/WordCount3/build/org/myorg/WordCount.class
hadoop_tutorial/WordCount3/build/org/myorg/WordCount$Reduce.class
hadoop_tutorial/WordCount3/build/org/myorg/WordCount$Map.class
hadoop_tutorial/WordCount2/build/org/myorg/WordCount.class
hadoop_tutorial/WordCount2/build/org/myorg/WordCount$Reduce.class
hadoop_tutorial/WordCount2/build/org/myorg/WordCount$Map.class
hadoop_tutorial/WordCount1/build/org/myorg/WordCount.class
hadoop_tutorial/WordCount1/build/org/myorg/WordCount$Reduce.class
hadoop_tutorial/WordCount1/build/org/myorg/WordCount$Map.class
[cloudera@quickstart ~]$ cp hadoop_tutorial/WordCount1/WordCount.java .
[cloudera@quickstart ~]$ mkdir -p build
[cloudera@quickstart ~]$ javac -cp /usr/lib/hadoop/*:/usr/lib/hadoop-mapreduce/* WordCount.java -d build -Xlint
warning: [path] bad path element "/usr/lib/hadoop-mapreduce/jaxb-api.jar": no such file or directory
warning: [path] bad path element "/usr/lib/hadoop-mapreduce/activation.jar": no such file or directory
warning: [path] bad path element "/usr/lib/hadoop-mapreduce/jsr173_1.0_api.jar": no such file or directory
warning: [path] bad path element "/usr/lib/hadoop-mapreduce/jaxb1-impl.jar": no such file or directory
4 warnings
[cloudera@quickstart ~]$ jar -cvf wordcount.jar -C build/ .
added manifest
adding: org/(in = 0) (out= 0)(stored 0%)
adding: org/myorg/(in = 0) (out= 0)(stored 0%)
adding: org/myorg/WordCount.class(in = 1985) (out= 989)(deflated 50%)
adding: org/myorg/WordCount$Reduce.class(in = 1647) (out= 692)(deflated 57%)
adding: org/myorg/WordCount$Map.class(in = 2209) (out= 986)(deflated 55%)
[cloudera@quickstart ejemplosLibro]$ sudo su hdfs
bash-4.1$ hadoop fs -mkdir /user/ruso
bash-4.1$ hadoop fs -chown cloudera /user/ruso
bash-4.1$ exit
exit
[cloudera@quickstart ejemplosLibro]$ sudo su cloudera
[cloudera@quickstart ejemplosLibro]$ hadoop fs -mkdir /user/ruso/wordcount /user/ruso/wordcount/input
[cloudera@quickstart ejemplosLibro]$ echo "Hadoop is an elephant" > file0
[cloudera@quickstart ejemplosLibro]$ echo "Hadoop is as yellow as can be" > file1
[cloudera@quickstart ejemplosLibro]$ echo "Oh what a yellow fellow is Hadoop" > file2
[cloudera@quickstart ejemplosLibro]$ hadoop fs -put file* /user/ruso/wordcount/input
[cloudera@quickstart ~]$ hadoop jar wordcount.jar org.myorg.WordCount /user/ruso/wordcount/input /user/ruso/wordcount/output
18/04/13 16:29:53 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/10.0.2.15:8032
18/04/13 16:29:53 INFO input.FileInputFormat: Total input paths to process : 3
18/04/13 16:29:54 INFO mapreduce.JobSubmitter: number of splits:3
18/04/13 16:29:54 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1522688691245_0040
18/04/13 16:29:54 INFO impl.YarnClientImpl: Submitted application application_1522688691245_0040
18/04/13 16:29:54 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1522688691245_0040/
18/04/13 16:29:54 INFO mapreduce.Job: Running job: job_1522688691245_0040
18/04/13 16:30:00 INFO mapreduce.Job: Job job_1522688691245_0040 running in uber mode : false
18/04/13 16:30:00 INFO mapreduce.Job: map 0% reduce 0%
18/04/13 16:30:06 INFO mapreduce.Job: map 67% reduce 0%
18/04/13 16:30:11 INFO mapreduce.Job: map 100% reduce 0%
18/04/13 16:30:17 INFO mapreduce.Job: map 100% reduce 100%
18/04/13 16:30:18 INFO mapreduce.Job: Job job_1522688691245_0040 completed successfully
18/04/13 16:30:18 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=140
        FILE: Number of bytes written=513853
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=470
        HDFS: Number of bytes written=80
        HDFS: Number of read operations=12
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=3
        Launched reduce tasks=1
        Data-local map tasks=3
        Total time spent by all maps in occupied slots (ms)=6011904
        Total time spent by all reduces in occupied slots (ms)=1513984
        Total time spent by all map tasks (ms)=11742
        Total time spent by all reduce tasks (ms)=2957
        Total vcore-milliseconds taken by all map tasks=11742
        Total vcore-milliseconds taken by all reduce tasks=2957
        Total megabyte-milliseconds taken by all map tasks=6011904
        Total megabyte-milliseconds taken by all reduce tasks=1513984
    Map-Reduce Framework
        Map input records=3
        Map output records=18
        Map output bytes=158
        Map output materialized bytes=220
        Input split bytes=384
        Combine input records=0
        Combine output records=0
        Reduce input groups=12
        Reduce shuffle bytes=220
        Reduce input records=18
        Reduce output records=12
        Spilled Records=36
        Shuffled Maps =3
        Failed Shuffles=0
        Merged Map outputs=3
        GC time elapsed (ms)=174
        CPU time spent (ms)=1840
        Physical memory (bytes) snapshot=575270912
        Virtual memory (bytes) snapshot=2897014784
        Total committed heap usage (bytes)=194510848
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=86
    File Output Format Counters
        Bytes Written=80
[cloudera@quickstart ~]$ hadoop fs -cat /user/ruso/wordcount/output/*
Hadoop	3
Oh	1
a	1
an	1
as	2
be	1
can	1
elephant	1
fellow	1
is	3
what	1
yellow	2
hadoop fs -put shakespeare.txt /user/ruso/wordcount/input
hadoop fs -rm -r /user/ruso/wordcount/output
hadoop jar wordcount.jar org.myorg.WordCount /user/ruso/wordcount/input /user/ruso/wordcount/output
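
The word counts printed by the job for file0, file1 and file2 can be cross-checked locally without a cluster. The following is a minimal sketch, not part of the tutorial: local_wordcount is a hypothetical helper name, and it assumes words are separated only by whitespace, which holds for the three sample files because they contain no punctuation. It sorts in byte order (LC_ALL=C) so that capitalized words come first, as they do in the job output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not from the tutorial): count words in
# the given files using standard Unix tools, printing "word<TAB>count".
local_wordcount() {
    cat "$@" \
        | tr -s '[:space:]' '\n' \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Recreate the three sample inputs from the session in a scratch directory.
workdir=$(mktemp -d)
echo "Hadoop is an elephant" > "$workdir/file0"
echo "Hadoop is as yellow as can be" > "$workdir/file1"
echo "Oh what a yellow fellow is Hadoop" > "$workdir/file2"

# Print the local counts for comparison with the hadoop fs -cat output.
local_wordcount "$workdir"/file*
```

For larger inputs such as shakespeare.txt the local result may differ from the job output, since punctuation is not handled by this whitespace-only split.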