
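The source code and software for this document are available at the address below; download them if you need them (access password: 7567).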

https://url56.ctfile.com/d/34653256-48746892-4c8f2e?p=7567

I. Setting up the local environment

1. Download and prepare two tools

Hadoop-2.7.3.tar.gz

Hadoop-2.7.3-winutils.exe.rar

2. Extract Hadoop-2.7.3-winutils.exe.rar and copy the following two files from it

Hadoop.dll

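winutils.exe

3. Extract Hadoop-2.7.3.tar.gz, find the bin directory, and copy the two files above (Hadoop.dll and winutils.exe) into it

4. Configure the Hadoop environment variables

5. Find Hadoop's logging file, log4j.properties, and copy it into the new Maven project in Eclipse. Reusing the contents of Hadoop's own file is convenient because it saves writing much configuration; you can also create the file yourself and write its contents.

(1) Location of the logging file in Hadoop

(2) Location to copy it to in the Eclipse project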
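Steps 3 and 4 are easy to get subtly wrong, so a quick check can help. The following is a minimal sketch, not part of Hadoop: the class `HadoopHomeCheck` and its method are hypothetical names, and it assumes that the environment variable configured in step 4 is `HADOOP_HOME` pointing at the extracted Hadoop directory.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): reports which of the two files
// copied in step 3 are missing from <hadoopHome>/bin.
class HadoopHomeCheck {
    static List<String> missingFiles(File hadoopHome) {
        List<String> missing = new ArrayList<>();
        File bin = new File(hadoopHome, "bin");
        for (String name : new String[] {"winutils.exe", "hadoop.dll"}) {
            if (!new File(bin, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: step 4 defined HADOOP_HOME; the source does not name the variable.
        String home = System.getenv("HADOOP_HOME");
        if (home == null) {
            System.out.println("HADOOP_HOME is not set");
            return;
        }
        System.out.println("Missing from bin: " + missingFiles(new File(home)));
    }
}
```

An empty list means both files are where step 3 put them.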

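II. Writing the code

1. Write the Mapper

2. Write the Reducer

3. Write the main class

4. To run a test, first build a JAR package

5. I exported it into the local project

6. Upload the package to our virtual machine

7. Upload the test file. Its text structure is shown below; you can write your own, with the words separated by spaces.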

hello everyone

hello hadoop

hello hive

go home

come on

8. Run the job

9. Check the result of the run in the browser

10. View the text contents on the virtual machine

III. Understanding word count

(A) Concept

1. Word count tallies how many times each word occurs in a file, for example in the data source below

2. The final counts should then be displayed as follows
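To make the goal concrete before MapReduce enters the picture, here is a minimal single-machine sketch in plain Java. It assumes the data source is the five-line test file from step 7 of section II; the class name `WordCountSketch` is made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // Counts space-separated words across all lines; TreeMap keeps the words sorted.
    static Map<String, Integer> countWords(String[] lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] lines = {"hello everyone", "hello hadoop", "hello hive", "go home", "come on"};
        // prints {come=1, everyone=1, go=1, hadoop=1, hello=3, hive=1, home=1, on=1}
        System.out.println(countWords(lines));
    }
}
```

This works for a small file on one machine; the rest of this section is about getting the same counts when the file is too large for that.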

(B) How do we write the code in MapReduce so that it produces this final result?

First we upload the file to HDFS (hdfs dfs -put …)

Data file name: data.txt; its size is 2 GB
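HDFS stores a file of this size as a sequence of fixed-size blocks. As a rough sketch of the arithmetic, assuming the block size is left at the Hadoop 2.x default of 128 MB (the source does not state the cluster's setting), the 2 GB file would occupy 16 blocks. The class name `BlockCountSketch` is illustrative only.

```java
class BlockCountSketch {
    // Number of blocks needed for a file: ceiling of fileSize / blockSize.
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long fileSize = 2L * 1024 * 1024 * 1024; // data.txt: 2 GB
        long blockSize = 128L * 1024 * 1024;     // assumption: default dfs.blocksize in Hadoop 2.x
        System.out.println(blockCount(fileSize, blockSize)); // prints 16
    }
}
```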

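(C) A closer look

1. The red, yellow, and green blocks represent the blocks in which the data is stored

2. The data in data.txt then enters the map phase in the form of <k,v> pairs (key-value pairs): K is the byte offset of the start of each line relative to the beginning of the file, and V is the text of that line.

3. This can be shown in a diagram: each blue ellipse represents one map, and when the red, yellow, and green data blocks enter the map phase, the data takes the form of the red <k,v> pairs (key-value pairs) on the left

4. After map processing, for example one pass of String.split(" "), the data in each of the red, yellow, and green blocks becomes key-value pairs of the following form

5. When configuring the Hadoop job we set the number of reduces; suppose there are two

The data produced by the maps is placed into the corresponding reduce, as shown in the figure

6. A simple principle is at work here:

Job.setNumReduceTasks sets the number of reduces

The HashPartitioner class uses the result of (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks to send different map results to different reduces. As a simplified picture, think of words beginning with a-e going to one place and words beginning with f-z going to another

7. The data then becomes

which is the result we want, and the count is complete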
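The seven steps above can be sketched end to end in plain Java, without a cluster. This is an in-memory illustration under stated assumptions, not Hadoop's implementation: the class `MapReduceFlowSketch` and its methods are made-up names, every line is assumed to end with a single newline byte, and `String.hashCode` stands in for the hash of Hadoop's `Text`, so the reducer a given word lands on may differ from a real job.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapReduceFlowSketch {
    // Input to map: K = byte offset of the line start, V = the line text.
    static Map<Long, String> inputPairs(String[] lines) {
        Map<Long, String> pairs = new LinkedHashMap<>();
        long offset = 0;
        for (String line : lines) {
            pairs.put(offset, line);
            // Assumption: each line ends with a single '\n' byte.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return pairs;
    }

    // Same formula as HashPartitioner, applied to String.hashCode as a stand-in.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Runs map, partition, and reduce in memory; returns one word->count map per reducer.
    static List<Map<String, Integer>> run(String[] lines, int numReducers) {
        // Shuffle buffers: one per reducer, grouping the emitted 1s by word.
        List<Map<String, List<Integer>>> shuffled = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            shuffled.add(new TreeMap<>());
        }
        // Map phase: each <offset, line> becomes one <word, 1> pair per word.
        for (String line : inputPairs(lines).values()) {
            for (String word : line.split(" ")) {
                shuffled.get(partition(word, numReducers))
                        .computeIfAbsent(word, w -> new ArrayList<>())
                        .add(1);
            }
        }
        // Reduce phase: sum the values grouped under each word.
        List<Map<String, Integer>> results = new ArrayList<>();
        for (Map<String, List<Integer>> group : shuffled) {
            Map<String, Integer> counts = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> e : group.entrySet()) {
                int sum = 0;
                for (int one : e.getValue()) {
                    sum += one;
                }
                counts.put(e.getKey(), sum);
            }
            results.add(counts);
        }
        return results;
    }

    public static void main(String[] args) {
        String[] lines = {"hello everyone", "hello hadoop", "hello hive", "go home", "come on"};
        System.out.println(inputPairs(lines));
        List<Map<String, Integer>> out = run(lines, 2);
        for (int i = 0; i < out.size(); i++) {
            System.out.println("reducer " + i + ": " + out.get(i));
        }
    }
}
```

For the five-line sample, the input keys come out as 0, 15, 28, 39, and 47. Because the partition depends only on the key, every occurrence of a word reaches the same reducer, which is what lets each reducer produce a complete count for its words.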

IV. Exercise

1. Prepare the data: data.txt. File contents:

hello everyone

hello hadoop

hello hive

go home

come on

2. The project's pom file

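<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.xlglvc.xx.mapredece</groupId>
  <artifactId>wordcount-client</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>wordcount-client</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
  </properties>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.7.3</version>
    </dependency>
  </dependencies>
  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <artifactId>maven-clean-plugin</artifactId>
          <version>3.0.0</version>
        </plugin>
        <plugin>
          <artifactId>maven-resources-plugin</artifactId>
          <version>3.0.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-compiler-plugin</artifactId>
          <version>3.7.0</version>
        </plugin>
        <plugin>
          <artifactId>maven-surefire-plugin</artifactId>
          <version>2.20.1</version>
        </plugin>
        <plugin>
          <artifactId>maven-jar-plugin</artifactId>
          <version>3.0.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-install-plugin</artifactId>
          <version>2.5.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-deploy-plugin</artifactId>
          <version>2.8.2</version>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
</project>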

3. Mapper file

package com.xlglvc.xx.mapredece.wordcount_client;

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Emits <word, 1> for every space-separated word in each input line.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        Text keyout = new Text();
        IntWritable valueout = new IntWritable(1);
        String line = value.toString();
        // Words in the test data are separated by a single space
        String[] words = line.split(" ");
        for (String word : words) {
            keyout.set(word);
            context.write(keyout, valueout);
        }
    }
}
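The delimiter passed to `String.split` decides what the mapper emits as keys, so it is worth checking in isolation. The sketch below is plain Java with a made-up class name, `SplitDelimiterSketch`; it compares a single-space delimiter, which matches the space-separated test data, with an empty-string delimiter, which breaks the line into individual characters rather than words.

```java
import java.util.Arrays;

class SplitDelimiterSketch {
    // Splits a line the way the word-count mapper needs: on single spaces.
    static String[] words(String line) {
        return line.split(" ");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("go home"))); // prints [go, home]
        // An empty delimiter yields one element per character, not per word.
        System.out.println("go home".split("").length + " pieces with an empty delimiter");
    }
}
```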

4. Reducer file:

package com.xlglvc.xx.mapredece.wordcount_client;

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums the 1s emitted for each word and writes <word, total>.
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        IntWritable valueout = new IntWritable();
        int count = 0;
        for (IntWritable value : values) {
            count += value.get();
        }
        valueout.set(count);
        context.write(key, valueout);
    }
}

5. Main class WordCountDriver

package com.xlglvc.xx.mapredece.wordcount_client;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCountDriver.class);
        // args[0] is the input path, args[1] is the output path
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        // Key and value types of the map output
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        // Key and value types of the final (reduce) output
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Submit the job and wait; exit code 0 on success, 1 on failure
        boolean result = job.waitForCompletion(true);
        System.exit(result ? 0 : 1);
    }
}

6. Package the JAR and run it on the virtual machine, using the same method as before.


Hadoop word count example
