Hadoop實(shí)驗(yàn)——MapReduce編程(1)

實(shí)驗(yàn)?zāi)康?/h1>

  1. 通過實(shí)驗(yàn)掌握基本的MapReduce編程方法。
  2. 掌握用MapReduce解決一些常見的數(shù)據(jù)處理問題,包括數(shù)據(jù)去重、數(shù)據(jù)排序和數(shù)據(jù)挖掘等。
  3. 通過操作MapReduce的實(shí)驗(yàn),模仿實(shí)驗(yàn)內(nèi)容,深入理解MapReduce的過程,熟悉MapReduce程序的編程方式。

實(shí)驗(yàn)平臺(tái)

  • 操作系統(tǒng):Ubuntu-16.04
  • Hadoop版本:2.6.0
  • JDK版本:1.8
  • IDE:Eclipse

實(shí)驗(yàn)內(nèi)容和要求

一,編程實(shí)現(xiàn)文件合并和去重操作:

  1. 對(duì)于兩個(gè)輸入文件,即文件A和文件B,請(qǐng)編寫MapReduce程序,對(duì)兩個(gè)文件進(jìn)行合并,并剔除其中重復(fù)的內(nèi)容,得到一個(gè)新的輸出文件C。下面是輸入文件和輸出文件的一個(gè)樣例供參考。
  • 輸入文件f1.txt的樣例如下:
20150101     x
20150102     y
20150103     x
20150104     y
20150105     z
20150106     x
  • 輸入文件f2.txt的樣例如下:
20150101     y
20150102     y
20150103     x
20150104     z
20150105     y
  • 根據(jù)輸入文件f1和f2合并得到的輸出文件的樣例如下:
20150101      x
20150101      y
20150102      y
20150103      x
20150104      y
20150104      z
20150105      y
20150105      z
20150106      x

實(shí)驗(yàn)過程:

  1. 創(chuàng)建文件f1.txt和f2.txt


    將上面樣例內(nèi)容復(fù)制進(jìn)去
  2. 在HDFS建立input文件夾(執(zhí)行這步之前要開啟hadoop相關(guān)進(jìn)程)


  3. 上傳樣例到HDFS中的input文件夾


  4. 接著打開eclipse
    Eclipse的使用
    1. 點(diǎn)開項(xiàng)目,找到 src 文件夾,右鍵選擇 New -> Class


    2. 輸入 Package 和 Name,然后Finish


    3. 寫好Java代碼(給的代碼里要修改HDFS和本地路徑),右鍵選擇 Run As -> Run on Hadoop,結(jié)果在HDFS系統(tǒng)中查看


實(shí)驗(yàn)代碼:

package cn.edu.zucc.mapreduce;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Merge {

    public static class Map extends Mapper<Object, Text, Text, Text> {
        private static Text text = new Text();

        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            text = value;
            context.write(text, new Text(""));
        }
    }

    public static class Reduce extends Reducer<Text, Text, Text, Text> {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            context.write(key, new Text(""));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        String[] otherArgs = new String[]{"input", "output"};
        if (otherArgs.length != 2) {
            System.err.println("Usage: Merge and duplicate removal <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "Merge");
        job.setJarByClass(Merge.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

}
模仿上題完成以下內(nèi)容:對(duì)于兩個(gè)輸入文件,即文件A和文件B,請(qǐng)編寫MapReduce程序,對(duì)兩個(gè)文件進(jìn)行統(tǒng)計(jì)單詞數(shù)量,得到一個(gè)新的輸出文件C。下面是輸入文件和輸出文件的一個(gè)樣例供參考。
  • 輸入文件a.txt的樣例如下:
hello world 
wordcount java
android hbase
hive pig
  • 輸入文件b.txt的樣例如下:
hello hadoop 
spring mybatis
hive hbase
pig android
  • 輸出文件的結(jié)果為:
android  2
hadoop    1
hbase      2
hello      2
hive        2
java        1
mybatis  1
pig      2
spring    1
wordcount   1
world      1

實(shí)驗(yàn)代碼:

package cn.edu.zucc.mapreduce;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class Map extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String lineValue = value.toString();
            String[] words = lineValue.split(" ");
            for (String singleWord : words) {
                word.set(singleWord);
                context.write(word, one);
            }

        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        String[] otherArgs = new String[]{"input_1", "output_1"};
        if (otherArgs.length != 2) {
            System.err.println("Usage: Wordcount <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "Wordcount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

二,編寫程序?qū)崿F(xiàn)對(duì)輸入文件的排序:

  1. 現(xiàn)在有多個(gè)輸入文件,每個(gè)文件中的每行內(nèi)容均為一個(gè)整數(shù)。要求讀取所有文件中的整數(shù),進(jìn)行升序排序后,輸出到一個(gè)新的文件中,輸出的數(shù)據(jù)格式為每行兩個(gè)整數(shù),第一個(gè)數(shù)字為第二個(gè)整數(shù)的排序位次,第二個(gè)整數(shù)為原待排列的整數(shù)。下面是輸入文件和輸出文件的一個(gè)樣例供參考。
  • 輸入文件file1.txt的樣例如下:
33
37
12
40
  • 輸入文件file2.txt的樣例如下:
4
16
39
5
  • 輸入文件file3.txt的樣例如下:
1
45
25
  • 根據(jù)輸入文件file1.txt、file2.txt和file3.txt得到的輸出文件如下:
1 1
2 4
3 5
4 12
5 16
6 25
7 33
8 37
9 39
10 40
11 45

實(shí)驗(yàn)過程:

  1. 創(chuàng)建文件file1.txt、file2.txt和file3.txt


    將上面樣例內(nèi)容復(fù)制進(jìn)去
  2. 在HDFS建立input2文件夾


  3. 上傳樣例到HDFS中的input2文件夾


  4. 到eclipse上執(zhí)行代碼

實(shí)驗(yàn)代碼:

package cn.edu.zucc.mapreduce;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ContentSort {

    public static class Map extends Mapper<Object, Text, IntWritable, IntWritable> {
        private static IntWritable data = new IntWritable();

        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            data.set(Integer.parseInt(line));
            context.write(data, new IntWritable(1));
        }
    }

    public static class Reduce extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
        private static IntWritable linenum = new IntWritable(1);

        @Override
        public void reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            for (IntWritable val : values) {
                context.write(linenum, key);
                linenum = new IntWritable(linenum.get() + 1);
            }

        }

    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        String[] otherArgs = new String[]{"input2", "output2"};
        if (otherArgs.length != 2) {
            System.err.println("Usage: ContentSort <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "ContentSort");
        job.setJarByClass(ContentSort.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);

    }

}
模仿上題完成以下內(nèi)容:對(duì)于三個(gè)輸入文件,即文件math、文件china和文件english,請(qǐng)編寫MapReduce程序,對(duì)三個(gè)文件進(jìn)行統(tǒng)計(jì)平均分,得到一個(gè)新的輸出文件。下面是輸入文件和輸出文件的一個(gè)樣例供參考。
  • 輸入文件math.txt的樣例如下:
張三    88
李四    99
王五    66
趙六    77
  • 輸入文件algs.txt的樣例如下:
張三    78
李四    89
王五    96
趙六    67
  • 輸入文件english.txt的樣例如下:
張三    80
李四    82
王五    84
趙六    86
  • 輸出文件結(jié)果為:
張三    82
李四    90
王五    82
趙六    76

實(shí)驗(yàn)代碼:

package cn.edu.zucc.mapreduce;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class AvgScore {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String[] nameAndScore = line.split(" ");
            List<String> list = new ArrayList<>(2);
            for (String nameOrScore : nameAndScore) {
                if (!"".equals(nameOrScore)) {
                    list.add(nameOrScore);
                }
            }
            context.write(new Text(list.get(0)), new IntWritable(Integer.parseInt(list.get(1))));
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            int count = 0;
            for (IntWritable value : values) {
                sum += Integer.parseInt(value.toString());
                count++;
            }
            int average = sum / count;
            context.write(key, new IntWritable(average));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        String[] otherArgs = new String[]{"input_2", "output_2"};
        if (otherArgs.length != 2) {
            System.err.println("Usage: AvgScore <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "AvgScore");
        job.setJarByClass(AvgScore.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

三,對(duì)給定的表格進(jìn)行信息挖掘:

  1. 下面給出一個(gè)child-parent的表格,要求挖掘其中的父子輩關(guān)系,給出祖孫輩關(guān)系的表格。
  • 輸入文件table.txt內(nèi)容如下:
child parent
Steven Lucy
Steven Jack
Jone Lucy
Jone Jack
Lucy Mary
Lucy Frank
Jack Alice
Jack Jesse
David Alice
David Jesse
Philip David
Philip Alma
Mark David
Mark Alma
  • 輸出文件內(nèi)容如下:
grandchild  grandparent
Mark    Jesse
Mark    Alice
Philip  Jesse
Philip  Alice
Jone    Jesse
Jone    Alice
Steven  Jesse
Steven  Alice
Steven  Frank
Steven  Mary
Jone    Frank
Jone    Mary

實(shí)驗(yàn)過程:

  1. 創(chuàng)建文件table


    將上面樣例內(nèi)容復(fù)制進(jìn)去
  2. 在HDFS建立input3文件夾


  3. 上傳樣例到HDFS中的input3文件夾


  4. 到eclipse上執(zhí)行代碼

實(shí)驗(yàn)代碼:

package cn.edu.zucc.mapreduce;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class STJoin {
    public static int time = 0;

    public static class Map extends Mapper<Object, Text, Text, Text> {
        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            String[] childAndParent = line.split(" ");
            List<String> list = new ArrayList<>(2);
            for (String childOrParent : childAndParent) {
                if (!"".equals(childOrParent)) {
                    list.add(childOrParent);
                }
            }
            if (!"child".equals(list.get(0))) {
                String childName = list.get(0);
                String parentName = list.get(1);
                String relationType = "1";
                context.write(new Text(parentName), new Text(relationType + "+"
                        + childName + "+" + parentName));
                relationType = "2";
                context.write(new Text(childName), new Text(relationType + "+"
                        + childName + "+" + parentName));
            }
        }
    }

    public static class Reduce extends Reducer<Text, Text, Text, Text> {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            if (time == 0) {
                context.write(new Text("grand_child"), new Text("grand_parent"));
                time++;
            }
            List<String> grandChild = new ArrayList<>();
            List<String> grandParent = new ArrayList<>();
            for (Text text : values) {
                String s = text.toString();
                String[] relation = s.split("\\+");
                String relationType = relation[0];
                String childName = relation[1];
                String parentName = relation[2];
                if ("1".equals(relationType)) {
                    grandChild.add(childName);
                } else {
                    grandParent.add(parentName);
                }
            }
            int grandParentNum = grandParent.size();
            int grandChildNum = grandChild.size();
            if (grandParentNum != 0 && grandChildNum != 0) {
                for (int m = 0; m < grandChildNum; m++) {
                    for (int n = 0; n < grandParentNum; n++) {
                        context.write(new Text(grandChild.get(m)), new Text(
                                grandParent.get(n)));
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        String[] otherArgs = new String[]{"input3", "output3"};
        if (otherArgs.length != 2) {
            System.err.println("Usage: Single Table Join <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "Single table Join ");
        job.setJarByClass(STJoin.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);

    }

}
模仿上題完成以下內(nèi)容:現(xiàn)有兩個(gè)輸入文件兩個(gè)文件,一個(gè)是工廠名與地址編號(hào)的對(duì)應(yīng)關(guān)系;另一個(gè)是地址編號(hào)和地址名的對(duì)應(yīng)關(guān)系。要求從輸入數(shù)據(jù)中找出工廠名和地址名的對(duì)應(yīng)關(guān)系,輸出"工廠名——地址名"表。
  • 輸入文件factory.txt:
factoryname addressID
Beijing Red Star   1
Shenzhen Thunder   3
Guangzhou Honda   2
Beijing Rising   1
Guangzhou Development Bank    2
Tencent   3
Bank of Beijing   1
  • 輸入文件address.txt:
addressID    addressname
1            Beijing
2            Guangzhou
3            Shenzhen
4            Xian
  • 輸出文件內(nèi)容如下:
factoryname addressname
Back of Beijing       Beijing 
Beijing Rising    Beijing 
Beijing Red Star      Beijing 
Guangzhou Development Bank    Guangzhou 
Guangzhou Honda           Guangzhou 
Tencent           Shenzhen 
Shenzhen Thunder          Shenzhen 

實(shí)驗(yàn)代碼:

package cn.edu.zucc.mapreduce;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class MTJoin {
    public static int time = 0;

    public static class Map extends Mapper<Object, Text, Text, Text> {

        @Override
        protected void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            if (line.contains("factoryname") || line.contains("addressID")) {
                return;
            }
            String[] strings = line.split(" ");
            List<String> list = new ArrayList<>();
            for (String information : strings) {
                if (!"".equals(information)) {
                    list.add(information);
                }
            }
            String addressID;
            StringBuilder stringBuilder = new StringBuilder();
            if (StringUtils.isNumeric(list.get(0))) {
                addressID = list.get(0);
                for (int i = 1; i < list.size(); i++) {
                    if (i != 1) {
                        stringBuilder.append(" ");
                    }
                    stringBuilder.append(list.get(i));
                }
                context.write(new Text(addressID), new Text("1+" + stringBuilder.toString()));
            } else {
                addressID = list.get(list.size() - 1);
                for (int i = 0; i < list.size() - 1; i++) {
                    if (i != 0) {
                        stringBuilder.append(" ");
                    }
                    stringBuilder.append(list.get(i));
                }
                context.write(new Text(addressID), new Text("2+" + stringBuilder.toString()));
            }
        }
    }

    public static class Reduce extends Reducer<Text, Text, Text, Text> {

        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            if (time == 0) {
                context.write(new Text("factoryname"), new Text("addressname"));
                time++;
            }
            List<String> factory = new ArrayList<>();
            List<String> address = new ArrayList<>();
            for (Text text : values) {
                String s = text.toString();
                String[] relation = s.split("\\+");
                if ("1".equals(relation[0])) {
                    address.add(relation[1]);
                } else {
                    factory.add(relation[1]);
                }
            }
            int factoryNum = factory.size();
            int addressNum = address.size();
            if (factoryNum != 0 && addressNum != 0) {
                for (int m = 0; m < factoryNum; m++) {
                    for (int n = 0; n < addressNum; n++) {
                        context.write(new Text(factory.get(m)),
                                new Text(address.get(n)));
                    }
                }
            }
        }

    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        String[] ioArgs = new String[]{"input_3", "output_3"};
        String[] otherArgs = new GenericOptionsParser(conf, ioArgs)
                .getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: Multiple Table Join <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "Mutiple table join ");
        job.setJarByClass(MTJoin.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);

    }

}
最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡(jiǎn)書系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

  • 實(shí)驗(yàn)?zāi)康?通過實(shí)驗(yàn)掌握基本的MapReduce編程方法。 掌握用MapReduce解決一些常見的數(shù)據(jù)處理問題,包括...
    Tiny_16閱讀 6,951評(píng)論 0 5
  • 當(dāng)數(shù)據(jù)量增大到超出了單個(gè)物理計(jì)算機(jī)存儲(chǔ)容量時(shí),有必要把它分開存儲(chǔ)在多個(gè)不同的計(jì)算機(jī)中。那些管理存儲(chǔ)在多個(gè)網(wǎng)絡(luò)互連的...
    單行線的旋律閱讀 2,082評(píng)論 0 7
  • 先思考問題 我們處在一個(gè)大數(shù)據(jù)的時(shí)代已經(jīng)是不爭(zhēng)的事實(shí),這主要表現(xiàn)在數(shù)據(jù)源多且大,如互聯(lián)網(wǎng)數(shù)據(jù),人們也認(rèn)識(shí)到數(shù)據(jù)里往...
    墻角兒的花閱讀 7,685評(píng)論 0 9
  • 摘自:http://staticor.io/post/hadoop/2016-01-23hadoop-defini...
    wangliang938閱讀 693評(píng)論 0 1
  • hadoop是什么?HDFS與MapReduceHive:數(shù)據(jù)倉庫,在HDFS之上,后臺(tái)執(zhí)行,幫你執(zhí)行。faceb...
    Babus閱讀 2,674評(píng)論 0 5

友情鏈接更多精彩內(nèi)容