MapReduce Basics (3): Partitioning

Concept

In MapReduce, the partitioner we specify decides which Reduce task each record goes to: all records assigned to the same partition are sent to the same Reduce task for processing.

For example, to compute statistics, a batch of similar records can be sent to one Reduce task and aggregated there, which gives us both grouping and per-group statistics.

In short, records of the same type, records that have something in common, are sent to the same place and processed together.

By default a job runs a single Reduce task, so there is only one partition.
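To make the routing concrete, here is a minimal, Hadoop-free sketch of the arithmetic used by Hadoop's default `HashPartitioner`: the key's hash code, masked to be non-negative, modulo the number of Reduce tasks. The class and method names (`DefaultPartitionSketch`, `partitionFor`) are illustrative and not part of Hadoop, and because `String.hashCode()` differs from `Text.hashCode()`, the partition numbers printed here would not necessarily match those of a real job.

```java
public class DefaultPartitionSketch {
    // Same arithmetic as Hadoop's HashPartitioner: mask off the sign bit,
    // then take the remainder by the number of reduce tasks.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry"};
        for (String key : keys) {
            // With a single reduce task every key lands in partition 0.
            System.out.println(key + " -> 1 reducer: " + partitionFor(key, 1)
                    + ", 2 reducers: " + partitionFor(key, 2));
        }
    }
}
```

With one Reduce task the remainder is always 0, which is why everything ends up in a single partition unless the number of Reduce tasks is raised.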


Example:
Split the records of a table into two partitions: one for records whose winning result is greater than 15, and one for records whose winning result is 15 or less.

PartitionReducer

package Partitioner;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class PartitionReducer extends Reducer<Text, NullWritable, Text,NullWritable> {
    @Override
    protected void reduce(Text key, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException {
        context.write(key,NullWritable.get());
    }
}

PartitionMapper

package Partitioner;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class PartitionMapper extends Mapper<LongWritable,Text, Text, NullWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        context.write(value,NullWritable.get());
    }
}

PartitonerOwn

package Partitioner;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class PartitonerOwn extends Partitioner<Text, NullWritable> {

    @Override
    public int getPartition(Text text, NullWritable nullWritable, int numPartitions) {
        // 1: Split the line of text (K2) and get the value of the winning-result field
        String[] split = text.toString().split("\t");
        String numStr = split[5];

        // 2: Compare the value with 15 and return the corresponding partition number
        if (Integer.parseInt(numStr) > 15) {
            return 1;
        } else {
            return 0;
        }
    }
}
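
The decision rule can be tried out without a cluster. The following is a minimal sketch in plain Java, assuming tab-separated lines whose sixth field (index 5) holds the winning result, as in the class above. The sample lines and the class name `PartitionRuleSketch` are made up for illustration.

```java
public class PartitionRuleSketch {
    // Mirrors the rule above: field 5 greater than 15 -> partition 1, otherwise 0.
    static int partitionFor(String line) {
        String[] fields = line.split("\t");
        int value = Integer.parseInt(fields[5].trim());
        return value > 15 ? 1 : 0;
    }

    public static void main(String[] args) {
        // Hypothetical sample records; only the sixth field matters here.
        String high = "1\t0\t1\t2017-07-31 23:10:12\t837255\t16\t4";
        String low = "2\t0\t1\t2017-07-31 23:10:13\t837256\t15\t4";
        System.out.println(partitionFor(high)); // 1
        System.out.println(partitionFor(low));  // 0
    }
}
```

A line with fewer than six fields, or a non-numeric sixth field, throws an exception in this sketch, and the same is true of the partitioner itself, so malformed input is worth filtering out before it reaches this step.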

MainJob

package Partitioner;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import java.net.URI;

public class MainJob extends Configured implements Tool {

    @Override
    public int run(String[] strings) throws Exception {

        // 1. Create the job object
        Job job = Job.getInstance(super.getConf(), "partition_mapreduce");

        // 2. Configure the job (eight steps)

        // 1) Set the input format class and input path
        job.setInputFormatClass(TextInputFormat.class);
        TextInputFormat.addInputPath(job,new Path("hdfs://node1:8020/input"));
        
        // 2) Set the Mapper class and its output key/value types
        job.setMapperClass(PartitionMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(NullWritable.class);
        
        // 3) Set the partitioner class
        job.setPartitionerClass(PartitonerOwn.class);

        // 7) Set the Reducer class and its output key/value types
        job.setReducerClass(PartitionReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        
        // Set the number of reduce tasks (one per partition returned by the partitioner)
        job.setNumReduceTasks(2);

        // 8) Set the output format class and output path
        Path path = new Path("hdfs://node1:8020/out/partition_out");
        job.setOutputFormatClass(TextOutputFormat.class);
        TextOutputFormat.setOutputPath(job,path);

        // Delete the output directory if it already exists; the job fails if it is present
        FileSystem fileSystem = FileSystem.get(new URI("hdfs://node1:8020/out/partition_out"),new Configuration());
        if (fileSystem.exists(path)){
            fileSystem.delete(path,true);
        }

        // 3. Wait for the job to finish
        boolean bl = job.waitForCompletion(true);
        return bl?0:1;
    }

    public static void main(String[] args) throws Exception{
        Configuration configuration = new Configuration();
        
        // Launch the job
        int run = ToolRunner.run(configuration,new MainJob(),args);

        System.exit(run);
    }
}
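
Each Reduce task writes its own output file, so with two Reduce tasks the output directory is expected to contain `part-r-00000` (winning results of 15 or less) and `part-r-00001` (winning results above 15). The sketch below shows the file naming and the range a partition number has to stay within, namely 0 up to but not including the number of Reduce tasks. The class and method names are hypothetical helpers, not Hadoop APIs.

```java
public class OutputNameSketch {
    // Reducer output files in the new MapReduce API are named part-r-NNNNN.
    static String outputFileFor(int partition, int numReduceTasks) {
        if (partition < 0 || partition >= numReduceTasks) {
            throw new IllegalArgumentException(
                    "partition " + partition + " is outside [0, " + numReduceTasks + ")");
        }
        return String.format("part-r-%05d", partition);
    }

    public static void main(String[] args) {
        System.out.println(outputFileFor(0, 2)); // part-r-00000
        System.out.println(outputFileFor(1, 2)); // part-r-00001
    }
}
```

This is why `setNumReduceTasks(2)` goes hand in hand with a partitioner that returns 0 or 1: a partitioner returning 1 while only one Reduce task is configured would be out of range.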