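Problem: randomly generate Salary {name, baseSalary, bonus} records such as "wxxx,10,1", one record per line, 10 million records in total, and write them to a text file (UTF-8 encoded).
Then read the file and add up the annual salaries of all records whose names share the same first two characters, e.g. wx, 1 million, 3 people. Finally sort the groups and output the 10 groups with the highest total annual salary:
wx, 2 million, 10 people
lt, 1.8 million, 8 people
....
name: 4 random characters from a-z; baseSalary: random in [0,100]; bonus: random in [0,5]; annual salary = baseSalary*13 + bonus
Try to optimize the program so that it finishes within 5 seconds.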
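To make the formula and the grouping key concrete, here is a minimal sketch applied to the sample record "wxxx,10,1". The class `AnnualSalaryExample` and its two helper methods are hypothetical names used only for illustration; they are not part of the program below.

```java
// Hypothetical helper class, for illustration only.
final class AnnualSalaryExample {

    /** Annual salary = 13 months of base salary plus the bonus. */
    static int annualSalary(int baseSalary, int bonus) {
        return baseSalary * 13 + bonus;
    }

    /** Grouping key: the first two characters of the name. */
    static String groupKey(String name) {
        return name.substring(0, 2);
    }

    public static void main(String[] args) {
        // Sample record "wxxx,10,1": name = wxxx, baseSalary = 10, bonus = 1
        System.out.println(groupKey("wxxx") + " -> " + annualSalary(10, 1)); // prints "wx -> 131"
    }
}
```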
Entity object:
import lombok.Getter;
import lombok.Setter;
import lombok.ToString;

@Getter
@Setter
@ToString
public class Salary {
    private String name;        // employee name
    private Integer baseSalary; // base salary
    private Integer bonus;      // bonus
}
Required format of the generated random data: {"baseSalary":26,"bonus":4,"name":"lvox"}
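import com.alibaba.fastjson.JSON; // assumed: JSON.toJSONString matches the fastjson API

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/**
 * Generates 10 million random records.
 */
public class SalaryTest {
    private static final String CHARS = "abcdefghijklmnopqrstuvwxyz";

    public static void main(String[] args) throws IOException {
        int count = 10000000;
        // Write UTF-8 explicitly through a buffered writer. An existing file is overwritten,
        // so rerunning the generator does not append another 10 million lines.
        try (BufferedWriter out = Files.newBufferedWriter(
                Paths.get("D:/upload/test.txt"), StandardCharsets.UTF_8)) {
            for (int n = 0; n < count; n++) {
                StringBuilder name = new StringBuilder(4);
                for (int i = 0; i < 4; i++) {
                    name.append(CHARS.charAt((int) (Math.random() * 26)));
                }
                Salary salary = new Salary();
                salary.setName(name.toString());
                // Yields 1..100; the [0,100] of the problem statement would be (int) (Math.random() * 101).
                salary.setBaseSalary((int) (Math.random() * 100 + 1));
                // Yields 1..5; the [0,5] of the problem statement would be (int) (Math.random() * 6).
                salary.setBonus((int) (Math.random() * 5 + 1));
                out.write(JSON.toJSONString(salary));
                out.write("\n");
            }
        }
    }
}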
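As a variation, the record line can be built without a JSON library and with ranges that include both bounds, as the problem statement asks. This is only a sketch under assumptions: `RecordGenerator` and `randomRecord` are hypothetical names, and the field order simply copies the required format shown earlier.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical generator, for illustration only.
final class RecordGenerator {

    /** Builds one line in the {"baseSalary":..,"bonus":..,"name":".."} layout. */
    static String randomRecord() {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        char[] name = new char[4];
        for (int i = 0; i < name.length; i++) {
            name[i] = (char) ('a' + rnd.nextInt(26)); // random letter a..z
        }
        int baseSalary = rnd.nextInt(101); // 0..100, both bounds included
        int bonus = rnd.nextInt(6);        // 0..5, both bounds included
        return "{\"baseSalary\":" + baseSalary + ",\"bonus\":" + bonus
                + ",\"name\":\"" + new String(name) + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(randomRecord());
    }
}
```

Note that switching to these ranges changes the data, so totals produced from such a file would differ slightly from the results listed in this post.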
Each line of the generated file is one JSON record in the format shown above.
Computing the top N:
import com.alibaba.fastjson.JSON; // assumed: JSON.parseObject matches the fastjson API

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Test {
    public static void main(String[] args) throws IOException {
        long start = System.currentTimeMillis();
        /*
         * One could also use random-access streams, split the file into chunks and
         * run several threads. That requires splitting the file first, i.e. scanning
         * the whole file once to mark the chunk boundaries. 10 million records is
         * not much data, so here that would actually be slower.
         */
        Map<String, Salary> map = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(
                Paths.get("D:/upload/test.txt"), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Salary salary = JSON.parseObject(line, Salary.class);
                String key = salary.getName().substring(0, 2);
                int annual = salary.getBaseSalary() * 13 + salary.getBonus();
                // The accumulator reuses Salary: baseSalary holds the group's total
                // annual salary and bonus holds the number of people in the group.
                Salary result = map.get(key);
                if (result != null) {
                    result.setBaseSalary(result.getBaseSalary() + annual);
                    result.setBonus(result.getBonus() + 1);
                } else {
                    result = new Salary();
                    result.setName(key);
                    result.setBonus(1);
                    result.setBaseSalary(annual);
                    map.put(key, result);
                }
            }
        }
        /*
         * Java 8+ streams keep the sort concise. With at most 26 * 26 = 676 groups
         * its cost is negligible. The pre-Java 8 equivalent is to copy map.values()
         * into an ArrayList and call Collections.sort with the same ordering.
         */
        List<Salary> list = map.values().stream()
                .sorted(Comparator.comparingInt(Salary::getBaseSalary).reversed())
                .limit(10)
                .collect(Collectors.toList());
        System.out.println(System.currentTimeMillis() - start); // elapsed milliseconds
        for (Salary top : list) {
            System.out.println(top);
        }
    }
}
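If the single-threaded version needs to go faster, one option is to avoid JSON parsing and per-group objects altogether. The sketch below is an assumption-laden alternative, not the approach used above: it assumes every line is well formed in the required format and that names contain only a-z, so the two-letter prefix maps to one of 26 * 26 array slots. `PrefixAggregator`, `intAfter`, `accept` and `top` are hypothetical names, and no timing has been measured for it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical array-based aggregator, for illustration only.
final class PrefixAggregator {
    final long[] totals = new long[26 * 26]; // total annual salary per prefix
    final int[] counts = new int[26 * 26];   // number of people per prefix

    /** Reads the non-negative integer that follows the marker (line assumed well formed). */
    static int intAfter(String line, String marker) {
        int i = line.indexOf(marker) + marker.length();
        int value = 0;
        while (i < line.length() && Character.isDigit(line.charAt(i))) {
            value = value * 10 + (line.charAt(i) - '0');
            i++;
        }
        return value;
    }

    /** Adds one record such as {"baseSalary":26,"bonus":4,"name":"lvox"}. */
    void accept(String line) {
        int base = intAfter(line, "\"baseSalary\":");
        int bonus = intAfter(line, "\"bonus\":");
        String nameMarker = "\"name\":\"";
        int n = line.indexOf(nameMarker) + nameMarker.length();
        int slot = (line.charAt(n) - 'a') * 26 + (line.charAt(n + 1) - 'a');
        totals[slot] += base * 13L + bonus;
        counts[slot]++;
    }

    /** Returns up to n lines "prefix,total,count", highest total first. */
    List<String> top(int n) {
        List<Integer> slots = new ArrayList<>();
        for (int s = 0; s < totals.length; s++) {
            if (counts[s] > 0) slots.add(s);
        }
        slots.sort((a, b) -> Long.compare(totals[b], totals[a]));
        List<String> out = new ArrayList<>();
        for (int s : slots.subList(0, Math.min(n, slots.size()))) {
            String prefix = "" + (char) ('a' + s / 26) + (char) ('a' + s % 26);
            out.add(prefix + "," + totals[s] + "," + counts[s]);
        }
        return out;
    }
}
```

Usage would be to call `accept(line)` for every line returned by the reader and then print `top(10)`.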
Output (baseSalary holds the group's total annual salary, bonus holds the number of people):
Elapsed time: 2627 ms
Salary(name=nb, baseSalary=10050715, bonus=15156)
Salary(name=nd, baseSalary=10012526, bonus=15093)
Salary(name=kl, baseSalary=10009653, bonus=15064)
Salary(name=qj, baseSalary=10005223, bonus=15077)
Salary(name=qo, baseSalary=9998526, bonus=15052)
Salary(name=ug, baseSalary=9992740, bonus=14993)
Salary(name=ky, baseSalary=9991997, bonus=15013)
Salary(name=dv, baseSalary=9976088, bonus=15101)
Salary(name=zk, baseSalary=9974697, bonus=15071)
Salary(name=qb, baseSalary=9961493, bonus=14974)
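Final notes:
File chunking is essentially a preprocessing pass over the data, after which later runs can start several threads. If the file is processed only once, whether chunking pays off depends on the complexity of the business logic and on the data volume. More threads do not necessarily mean higher efficiency, so don't chase multithreading for its own sake.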
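The chunking step mentioned above could be sketched as follows. Instead of scanning the whole file, this version seeks to evenly spaced byte positions and moves each boundary forward to the next line break, so that every chunk starts at the beginning of a record. This is safe for UTF-8 because the newline byte never occurs inside a multi-byte character. `ChunkSplitter` and `lineAlignedOffsets` are hypothetical names, and the worker threads that would consume the chunks are left out.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical chunk splitter, for illustration only.
final class ChunkSplitter {

    /**
     * Returns parts + 1 byte offsets; chunk i covers [offsets[i], offsets[i + 1]).
     * Every inner boundary sits directly after a '\n' (or at the end of the file).
     */
    static long[] lineAlignedOffsets(String path, int parts) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long size = file.length();
            long[] offsets = new long[parts + 1];
            offsets[parts] = size;
            for (int i = 1; i < parts; i++) {
                // Start from an even split, but never before the previous boundary.
                long guess = Math.max(size * i / parts, offsets[i - 1]);
                file.seek(guess);
                int b;
                while ((b = file.read()) != -1 && b != '\n') {
                    // skip the remainder of the current line
                }
                offsets[i] = file.getFilePointer();
            }
            return offsets;
        }
    }
}
```

Each thread would then read only its own byte range and build a partial result, with the partial results merged at the end. Whether this beats the single-threaded loop depends on the data volume, as noted above.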