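JMH stands for Java Microbenchmark Harness, a toolkit for writing microbenchmarks in Java.
What is JMH
JMH is a benchmarking tool developed by the OpenJDK team. It is typically used for performance tuning, can resolve timings down to the nanosecond level, and works for Java as well as other JVM-based languages. Unlike Apache JMeter, JMH can target any individual method, so its granularity is much finer and it is not limited to REST APIs.
To use it, we only tell JMH through configuration which methods to test and how to test them, and JMH generates the benchmark code for us.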
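To make concrete what JMH takes off our hands, here is a minimal hand-rolled sketch of a timing loop. It is not JMH code: the class name NaiveTimer, the workload sumTo and the iteration counts are all hypothetical. A loop like this is easily distorted by JIT compilation and dead-code elimination, which is the reason to prefer a harness.

```java
import java.util.function.LongSupplier;

// Hypothetical hand-written timing loop; JMH replaces this kind of code.
final class NaiveTimer {

    // Results are stored here so the JIT cannot discard the measured work.
    static volatile long sink;

    // Returns the average nanoseconds per call over `calls` timed invocations.
    static double averageNanos(LongSupplier work, int warmupCalls, int calls) {
        for (int i = 0; i < warmupCalls; i++) {
            sink = work.getAsLong();      // warmup: executed but not timed
        }
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink = work.getAsLong();      // measured calls
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / calls;
    }

    // Example workload: sum of 0..n-1.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        double avg = averageNanos(() -> sumTo(100_000), 1_000, 1_000);
        System.out.printf("average: %.1f ns/call%n", avg);
    }
}
```

Warmup, forking, state handling and statistics are all missing from this sketch; those are the parts JMH supplies.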
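How JMH generates benchmark code
As noted above, we describe the benchmarks through configuration (mainly annotations) and JMH generates the benchmark code. How does it do that?
JMH is normally used from a Maven project; other build tools can be made to work, but Maven is the setup the JMH archetype generates. The code generation itself happens at compile time: JMH's annotation processor reads the JMH annotations and emits the benchmark classes. In the pom.xml of a JMH configuration project we also find the plugin configuration below. During the package phase, maven-shade-plugin bundles the generated classes and their dependencies into a single executable jar whose main class is org.openjdk.jmh.Main.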
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.2</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<finalName>${uberjar.name}</finalName>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>org.openjdk.jmh.Main</mainClass>
</transformer>
</transformers>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
Steps
Suppose I have a project A and want to run JMH benchmarks against some of its methods. I can proceed as follows:
- Create a separate JMH configuration project B.
Create a new standalone configuration project B that depends on A. Generating it from the archetype is recommended, since that ensures a correct configuration.
We could also use project A itself as the JMH configuration project, but that lets JMH leak into project A, so it is better avoided.
- Configure project B.
In project B, we use JMH annotations or objects to specify which methods to test, how to test them, and so on.
- Build and run.
With pom.xml configured correctly, package project B with the mvn command. JMH generates the benchmark code for us and it is packaged separately as benchmarks.jar. Run benchmarks.jar and the benchmarks start.
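Creating the JMH configuration project
To make sure the configuration is correct, generating the JMH configuration project from the archetype is recommended. Run the following command in Windows cmd: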
mvn archetype:generate ^
-DinteractiveMode=false ^
-DarchetypeGroupId=org.openjdk.jmh ^
-DarchetypeArtifactId=jmh-java-benchmark-archetype ^
-DarchetypeVersion=1.25 ^
-DgroupId=cn.zzs.jmh ^
-DartifactId=jmh-test01 ^
-Dversion=1.0.0
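Note: on Linux, replace "^" with "\".
If you do not use the archetype, add the dependencies and plugins by hand, using the pom file generated by the archetype above as a reference.
2 Examples
More examples are available on the official site, which has over 30 samples.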
2.1 JMHFirstBenchmark.java
package com.gemantic.wealth.yunmatong.service.jmh;
import lombok.extern.slf4j.Slf4j;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;
import java.util.concurrent.TimeUnit;
@Slf4j
@BenchmarkMode(Mode.AverageTime) // measure the average execution time of each benchmark method
@OutputTimeUnit(TimeUnit.MICROSECONDS) // report results in microseconds
@State(Scope.Benchmark) // one instance shared by all benchmark threads
public class JMHFirstBenchmark {
/*
* Most of the time, you need to maintain some state while the benchmark is
* running. Since JMH is heavily used to build concurrent benchmarks, we
* opted for an explicit notion of state-bearing objects.
*
* Below are two state objects. Their class names are not essential, it
* matters they are marked with @State. These objects will be instantiated
* on demand, and reused during the entire benchmark trial.
*
* The important property is that state is always instantiated by one of
* those benchmark threads which will then have the access to that state.
* That means you can initialize the fields as if you do that in worker
* threads (ThreadLocals are yours, etc).
*/
@State(Scope.Benchmark)
public static class BenchmarkState {
volatile double x = Math.PI;
}
@State(Scope.Thread)
public static class ThreadState {
volatile double x = Math.PI;
}
@Benchmark
public void measureUnshared(ThreadState state) {
// All benchmark threads will call in this method.
//
// However, since ThreadState is the Scope.Thread, each thread
// will have its own copy of the state, and this benchmark
// will measure unshared case.
state.x++;
try {
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
log.info("measureUnshared:"+ state.x);
}
@Benchmark
public void measureShared(BenchmarkState state) {
// All benchmark threads will call in this method.
//
// Since BenchmarkState is the Scope.Benchmark, all threads
// will share the state instance, and we will end up measuring
// shared case.
state.x++;
try {
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
log.info("measureShared:"+ state.x);
}
/*
* ============================== HOW TO RUN THIS TEST: ====================================
*
* You are expected to see the drastic difference in shared and unshared cases,
* because you either contend for single memory location, or not. This effect
* is more articulated on large machines.
*
* You can run this test:
*
* a) Via the command line:
* $ mvn clean install
* $ java -jar target/benchmarks.jar JMHFirstBenchmark -wi 5 -i 5 -t 4 -f 1
* (we requested 5 measurement/warmup iterations, with 4 threads, single fork)
*
* b) Via the Java API:
* (see the JMH homepage for possible caveats when running from IDE:
* http://openjdk.java.net/projects/code-tools/jmh/)
*/
public static void main(String[] args) throws RunnerException {
// The same settings can also be supplied through annotations
Options opt = new OptionsBuilder()
.include(JMHFirstBenchmark.class.getSimpleName())
.warmupIterations(3) // 3 warmup iterations
.measurementIterations(2).measurementTime(TimeValue.valueOf("1s")) // 2 measurement iterations, 1 second each
.threads(10) // 10 concurrent threads
.forks(2)
.build();
new Runner(opt).run();
}
}
2.2 SecondBenchmark.java
@Setup, @TearDown and @Param
package com.gemantic.wealth.yunmatong.service.jmh;
import com.gemantic.wealth.yunmatong.service.jmh.service.Calculator;
import com.gemantic.wealth.yunmatong.service.jmh.service.MultithreadCalculator;
import com.gemantic.wealth.yunmatong.service.jmh.service.SinglethreadCalculator;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.All)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class SecondBenchmark {
@Param({"100000"})
private int length;
private int[] numbers;
private Calculator singleThreadCalc;
private Calculator multiThreadCalc;
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
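.include(SecondBenchmark.class.getSimpleName()) // regular expressions such as .include("JMHF.*") are also supported
.forks(0) // 0 runs the benchmark inside the current JVM: handy for debugging, but see 4.2 before trusting the numbers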
.warmupIterations(2)
.measurementIterations(2).threads(10)
.build();
new Runner(opt).run();
}
@Benchmark
public long singleThreadBench() {
return singleThreadCalc.sum(numbers);
}
@Benchmark
public long multiThreadBench() {
return multiThreadCalc.sum(numbers);
}
@Setup(Level.Trial)
public void prepare() {
int n = length;
numbers =new int[n];
for (int i=0;i<n;i++){
numbers[i]=i;
}
singleThreadCalc = new SinglethreadCalculator();
multiThreadCalc = new MultithreadCalculator(Runtime.getRuntime().availableProcessors());
}
@TearDown
public void shutdown() {
singleThreadCalc.shutdown();
multiThreadCalc.shutdown();
}
}
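SecondBenchmark imports Calculator, SinglethreadCalculator and MultithreadCalculator, but their source is not shown. The sketch below is one hypothetical way they could look, inferred only from how the benchmark calls them (sum(int[]), shutdown(), and a constructor taking a thread count); the real classes may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-ins for the classes imported by SecondBenchmark.
interface Calculator {
    long sum(int[] numbers);
    void shutdown();
}

final class SinglethreadCalculator implements Calculator {
    @Override
    public long sum(int[] numbers) {
        long total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    @Override
    public void shutdown() {
        // no resources to release
    }
}

final class MultithreadCalculator implements Calculator {
    private final int nThreads;
    private final ExecutorService pool;

    MultithreadCalculator(int nThreads) {
        this.nThreads = nThreads;
        this.pool = Executors.newFixedThreadPool(nThreads);
    }

    @Override
    public long sum(int[] numbers) {
        // Split the array into at most nThreads contiguous chunks and sum them in parallel.
        int chunk = (numbers.length + nThreads - 1) / nThreads;
        List<Future<Long>> parts = new ArrayList<>();
        for (int start = 0; start < numbers.length; start += chunk) {
            final int from = start;
            final int to = Math.min(start + chunk, numbers.length);
            Callable<Long> task = () -> {
                long partial = 0;
                for (int i = from; i < to; i++) {
                    partial += numbers[i];
                }
                return partial;
            };
            parts.add(pool.submit(task));
        }
        long total = 0;
        try {
            for (Future<Long> part : parts) {
                total += part.get();   // wait for each chunk and add it up
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return total;
    }

    @Override
    public void shutdown() {
        pool.shutdown();   // this is why the benchmark needs a @TearDown method
    }
}
```

The thread pool is the reason the benchmark pairs @Setup with @TearDown: without shutdown() the pool's threads would keep running after the benchmark finishes.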
2.3 ThirdBenchmark.java
@Group and @GroupThreads
package com.gemantic.wealth.yunmatong.service.jmh;
import com.gemantic.wealth.yunmatong.service.jmh.service.Calculator;
import com.gemantic.wealth.yunmatong.service.jmh.service.MultithreadCalculator;
import com.gemantic.wealth.yunmatong.service.jmh.service.SinglethreadCalculator;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class ThirdBenchmark {
@State(Scope.Group)
public static class BenchmarkState {
volatile double x = Math.PI;
}
@Benchmark
@Group("custom")
@GroupThreads(10)
public void read(BenchmarkState state) {
state.x++;
try {
Thread.sleep(5);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("ThirdBenchmark.read: "+ state.x);
}
@Benchmark
@Group("custom")
public void book(BenchmarkState state) {
state.x++;
try {
Thread.sleep(5);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("ThirdBenchmark.book: "+ state.x);
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
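.include(ThirdBenchmark.class.getSimpleName()) // regular expressions such as .include("JMHF.*") are also supported
.forks(0) // 0 runs the benchmark inside the current JVM: handy for debugging, but see 4.2 before trusting the numbers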
.warmupIterations(0)
.measurementIterations(2).measurementTime(TimeValue.valueOf("10ms")).threads(5)
.build();
new Runner(opt).run();
}
}
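3 Common annotations
3.1 @BenchmarkMode(Mode.All)
The available modes are:
- Throughput: overall throughput, e.g. "how many calls can be executed in one second". (thrpt, see section 5)
- AverageTime: average time per call, e.g. "each call takes xxx milliseconds on average". (avgt)
- SampleTime: random sampling that reports the distribution of the samples at the end, e.g. "99% of calls finish within xxx milliseconds, 99.99% of calls within xxx milliseconds". (sample)
- SingleShotTime: the modes above all run time-based iterations (1 s each by default in older releases such as the 1.19 used in section 5; newer releases default to 10 s), whereas SingleShotTime calls the method exactly once per iteration. It is usually combined with zero warmup iterations to measure cold-start performance. (ss)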
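To show what a SampleTime report summarizes, the following sketch computes a mean and a few percentiles from a handful of latency samples. It is an illustration only: the class name SampleSummary and the sample values are made up, and the nearest-rank percentile used here is an assumption, so JMH's own statistics may be computed differently.

```java
import java.util.Arrays;

// Hypothetical helper: summarizes latency samples in the style of a SampleTime report.
final class SampleSummary {

    // Nearest-rank percentile: the smallest sample with at least a fraction p of the data at or below it.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length);
        int index = Math.max(0, Math.min(sorted.length - 1, rank - 1));
        return sorted[index];
    }

    // Arithmetic mean of the samples.
    static double mean(double[] samples) {
        double total = 0;
        for (double s : samples) {
            total += s;
        }
        return total / samples.length;
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds, with two slow outliers.
        double[] millis = {6.0, 7.0, 8.0, 8.0, 9.0, 10.0, 10.0, 11.0, 38.0, 42.0};
        System.out.println("mean  = " + mean(millis));
        System.out.println("p0.50 = " + percentile(millis, 0.50));
        System.out.println("p0.90 = " + percentile(millis, 0.90));
        System.out.println("p1.00 = " + percentile(millis, 1.00));
    }
}
```

The two outliers pull the mean well above the median, which is why the percentile rows of a sample-mode report are often more informative than a single average.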
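3.2 @OutputTimeUnit(TimeUnit.MILLISECONDS)
The time unit used in the report. Any java.util.concurrent.TimeUnit value can be used: nanoseconds, microseconds, milliseconds, seconds, minutes, hours or days.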
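3.3 @State
See JMHFirstBenchmark.java.
A class-level annotation. Any class that holds state used by the benchmarks must be annotated with @State: this covers the benchmark class itself when its benchmark methods use instance fields, and any class passed to a benchmark method as a parameter. @State defines the lifecycle of an instance, comparable to the scope of a Spring bean. Because JMH can run a benchmark on several threads at once, the scopes mean the following:
Scope.Thread: each benchmark thread gets its own instance;
Scope.Benchmark: all benchmark threads share one instance, which is useful for measuring a stateful object under multithreaded sharing;
Scope.Group: all threads of one thread group share one instance.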
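The difference between a shared and a per-thread instance can be pictured with plain threads. The sketch below is only an analogy and not how JMH is implemented; the class ScopeAnalogy and its Counter type are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;

// Plain-Java analogy for @State scopes; this is not JMH code.
final class ScopeAnalogy {

    static final class Counter {
        final AtomicLong value = new AtomicLong();
    }

    // Starts `threads` workers, each incrementing its counter `calls` times,
    // and returns the final value seen through each worker's counter.
    // shared == true  : every worker uses one Counter (like Scope.Benchmark)
    // shared == false : every worker gets its own Counter (like Scope.Thread)
    static long[] run(int threads, int calls, boolean shared) {
        Counter common = new Counter();
        Counter[] counters = new Counter[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            Counter mine = shared ? common : new Counter();
            counters[t] = mine;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < calls; i++) {
                    mine.value.incrementAndGet();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();   // wait until every worker has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long[] finalValues = new long[threads];
        for (int t = 0; t < threads; t++) {
            finalValues[t] = counters[t].value.get();
        }
        return finalValues;
    }

    public static void main(String[] args) {
        System.out.println("shared:     " + Arrays.toString(run(4, 1000, true)));
        System.out.println("per thread: " + Arrays.toString(run(4, 1000, false)));
    }
}
```

With a shared counter every worker contends for the same memory location, which is the effect the measureShared benchmark in JMHFirstBenchmark.java is meant to expose.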
3.4 @Benchmark
The key method-level annotation: it marks a method as a benchmark target, much as @Test marks a test method in JUnit.
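3.5 @Setup
A method-level annotation. The method runs before the benchmark executes and, as the name says, is mainly used for initialization.
3.6 @TearDown (Level)
A method-level annotation, the counterpart of @Setup. At the default level it runs after the benchmark has finished, and it is mainly used to release resources.
Level controls when @Setup and @TearDown are called. The default is Level.Trial.
Trial: before and after each benchmark run, i.e. once around all iterations of a benchmark in a fork;
Iteration: before and after every iteration of the benchmark;
Invocation: before and after every single call of the benchmark method. Use with care and read the javadoc warnings first.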
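The nesting of the three levels can be visualized with a small simulation. This is plain Java that only records the order of calls, not JMH's actual implementation; the class LevelOrder is hypothetical, and in a real run the number of invocations per iteration is determined by the iteration time rather than fixed in advance.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of fixture call order; not JMH's implementation.
final class LevelOrder {

    // Records the fixture and benchmark calls for one trial.
    static List<String> simulate(int iterations, int invocationsPerIteration) {
        List<String> events = new ArrayList<>();
        events.add("setup(Trial)");
        for (int it = 0; it < iterations; it++) {
            events.add("setup(Iteration)");
            for (int call = 0; call < invocationsPerIteration; call++) {
                events.add("setup(Invocation)");
                events.add("benchmark");
                events.add("teardown(Invocation)");
            }
            events.add("teardown(Iteration)");
        }
        events.add("teardown(Trial)");
        return events;
    }

    public static void main(String[] args) {
        // Two iterations of two invocations each, printed in call order.
        simulate(2, 2).forEach(System.out::println);
    }
}
```

The Invocation-level fixtures surround every single call, so their cost can easily dominate a short benchmark method, which is why that level calls for care.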
3.7 @Param
The @Param annotation takes an array of Strings
that lists the values a parameter should take. It is especially suited to measuring how a function performs with different inputs.
See SecondBenchmark.java.
4 Common Options settings
4.1 include
The name of the class that contains the benchmarks. A regular expression can be used here to match against all benchmark classes.
See SecondBenchmark.java.
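How a regular expression selects benchmarks can be sketched with java.util.regex. The helper below is hypothetical and not JMH code; it assumes the pattern only needs to be found somewhere in the fully qualified benchmark name, which is how the simple class name passed to include in the examples can match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrates regex selection of benchmark names; the matching style is an assumption.
final class IncludeFilter {

    // Keeps the names in which the pattern is found anywhere in the string.
    static List<String> select(String regex, List<String> benchmarkNames) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<>();
        for (String name : benchmarkNames) {
            if (pattern.matcher(name).find()) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        String pkg = "com.gemantic.wealth.yunmatong.service.jmh.";
        List<String> names = Arrays.asList(
                pkg + "JMHFirstBenchmark.measureShared",
                pkg + "JMHFirstBenchmark.measureUnshared",
                pkg + "SecondBenchmark.singleThreadBench",
                pkg + "SecondBenchmark.multiThreadBench");
        System.out.println(select("JMHF.*", names));
        System.out.println(select("SecondBenchmark", names));
    }
}
```

Because the match is substring-style in this sketch, a short pattern can select more benchmarks than intended; a more specific pattern narrows the selection.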
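4.2 fork
The JVM is "notorious" for its profile-guided optimization, which is very unfriendly to microbenchmarks: the profiles of different benchmark methods get mixed together and "hurt" each other's results. Running every @Benchmark method in its own process solves this problem, and it is what JMH does by default. Avoid setting forks to 0, which runs the benchmarks inside the launching JVM and gives up that isolation. Setting it to n starts n forked JVMs one after another for the same benchmark, which mainly helps to expose run-to-run variance.
The fork setting can also be given through a method annotation or a command-line argument.
4.3 warmupIterations
The number of warmup iterations. Each lasts 1 second by default in older releases such as 1.19; newer releases default to 10 seconds.
4.4 measurementIterations
The number of measured iterations, with the same default duration per iteration as the warmup.
4.5 Group
A method-level annotation. Several benchmarks assigned to the same group are executed at the same time, for example to model a producer and a consumer that read and write at different speeds.
4.6 Threads
The number of threads each forked process uses to run the benchmark methods. The default is 1; Threads.MAX requests Runtime.getRuntime().availableProcessors() threads.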
5 輸出結(jié)果
# @BenchmarkMode(Mode.All)
# JMH version: 1.19
# VM version: JDK 1.7.0_80, VM 24.80-b11
# VM invoker: C:\Program Files\Java\jdk1.7.0_80\jre\bin\java.exe
# VM options: -javaagent:D:\Program Files\JetBrains\IntelliJ IDEA 2018.1\lib\idea_rt.jar=51664:D:\Program Files\JetBrains\IntelliJ IDEA 2018.1\bin -Dfile.encoding=UTF-8
# Warmup: 2 iterations, single-shot each
# Measurement: 2 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 10 threads
# Benchmark mode: Single shot invocation time
# Benchmark: com.gemantic.wealth.yunmatong.service.jmh.SecondBenchmark.singleThreadBench
# Parameters: (length = 100000)
# Run progress: 99.98% complete, ETA 00:00:00
# Fork: 1 of 1
# Warmup Iteration 1: 34.641 ±(99.9%) 33.844 ms/op
# Warmup Iteration 2: 7.129 ±(99.9%) 9.238 ms/op
Iteration 1: 7.573 ±(99.9%) 4.581 ms/op
Iteration 2: 6.235 ±(99.9%) 4.150 ms/op
# Run complete. Total time: 00:00:36
Benchmark (length) Mode Cnt Score Error Units
SecondBenchmark.multiThreadBench 100000 thrpt 2 147.758 ops/ms
SecondBenchmark.singleThreadBench 100000 thrpt 2 0.983 ops/ms
SecondBenchmark.multiThreadBench 100000 avgt 2 0.068 ms/op
SecondBenchmark.singleThreadBench 100000 avgt 2 10.510 ms/op
SecondBenchmark.multiThreadBench 100000 sample 295532 0.068 ± 0.001 ms/op
SecondBenchmark.multiThreadBench:multiThreadBench·p0.00 100000 sample 0.010 ms/op
SecondBenchmark.multiThreadBench:multiThreadBench·p0.50 100000 sample 0.066 ms/op
SecondBenchmark.multiThreadBench:multiThreadBench·p0.90 100000 sample 0.095 ms/op
SecondBenchmark.multiThreadBench:multiThreadBench·p0.95 100000 sample 0.104 ms/op
SecondBenchmark.multiThreadBench:multiThreadBench·p0.99 100000 sample 0.126 ms/op
SecondBenchmark.multiThreadBench:multiThreadBench·p0.999 100000 sample 0.172 ms/op
SecondBenchmark.multiThreadBench:multiThreadBench·p0.9999 100000 sample 1.729 ms/op
SecondBenchmark.multiThreadBench:multiThreadBench·p1.00 100000 sample 4.309 ms/op
SecondBenchmark.singleThreadBench 100000 sample 2036 10.196 ± 0.581 ms/op
SecondBenchmark.singleThreadBench:singleThreadBench·p0.00 100000 sample 6.201 ms/op
SecondBenchmark.singleThreadBench:singleThreadBench·p0.50 100000 sample 8.020 ms/op
SecondBenchmark.singleThreadBench:singleThreadBench·p0.90 100000 sample 10.355 ms/op
SecondBenchmark.singleThreadBench:singleThreadBench·p0.95 100000 sample 38.443 ms/op
SecondBenchmark.singleThreadBench:singleThreadBench·p0.99 100000 sample 41.943 ms/op
SecondBenchmark.singleThreadBench:singleThreadBench·p0.999 100000 sample 73.498 ms/op
SecondBenchmark.singleThreadBench:singleThreadBench·p0.9999 100000 sample 74.973 ms/op
SecondBenchmark.singleThreadBench:singleThreadBench·p1.00 100000 sample 74.973 ms/op
SecondBenchmark.multiThreadBench 100000 ss 2 0.223 ms/op
SecondBenchmark.singleThreadBench 100000 ss 2 6.904 ms/op
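The thrpt and avgt rows above can be cross-checked against each other. The sketch below assumes that throughput is summed over all benchmark threads while the average time is per call, so the expected throughput is roughly the thread count divided by the average time. The class ModeArithmetic is hypothetical, and the inputs are simply the avgt scores and the thread count from the report.

```java
// Hypothetical consistency check on the reported numbers; not part of JMH.
final class ModeArithmetic {

    // Combined rate in ops/ms when `threads` threads each need avgMsPerOp per call.
    static double expectedThroughput(int threads, double avgMsPerOp) {
        return threads / avgMsPerOp;
    }

    public static void main(String[] args) {
        // 10 threads, avgt scores taken from the table above.
        System.out.printf("multiThreadBench:  %.3f ops/ms%n", expectedThroughput(10, 0.068));
        System.out.printf("singleThreadBench: %.3f ops/ms%n", expectedThroughput(10, 10.510));
    }
}
```

The results land close to the reported 147.758 and 0.983 ops/ms. The modes were measured in separate runs of only two iterations each, so an exact match is not expected.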
6 IDE support
In the plugin installation dialog of IDEA, simply search for JMH.
https://zhuanlan.zhihu.com/p/74891608
7 Going further
Take a look at the clickhouse-jdbc source code, which uses JMH, including in combination with Docker containers.