How to compute incremental coverage in Scala?
Created: June 24, 2022 10:56 AM
Last Edited Time: June 24, 2022 11:05 AM
Tag: Scala, Spark
Type: Sharing Blog
Writing unit tests in Scala
Below is an English summary I wrote earlier, pasted here as-is. You can also refer to: Spark-Scala單元測(cè)試實(shí)踐 - 碼農(nóng)教程 (manongjc.com)
Unit Tests
Unit testing is a powerful tool for software quality -- and has been for decades. Unit tests provide a fundamental check that an application meets its software design specifications and behaves as intended.
What tools are used?
UnitTest Tool: ScalaTest
Mock Tool: Mockito Scala
HTTP Mock Tool: WireMock
Code Coverage Tool: Scoverage
How to write tests?
First, you need to know that writing a unit test itself is easy. For more information, see the ScalaTest documentation.
Tips:
- You can mock functions by Mockito Scala
- You can mock http service by WireMock
- When tests share common setup or teardown, make full use of before, after and withFixture.
Now for the key part: how do we measure incremental coverage?
We use Scoverage for coverage measurement, but it does not support incremental coverage out of the box, so we need to adapt it manually. If you use JaCoCo for coverage instead, try this article: jacoco二開(kāi),支持增量代碼覆蓋率 (gitee.com).
The rough idea:
- Calculate the changed files and line ranges through git diff
- Using Scoverage's comment markers ($COVERAGE-OFF$ / $COVERAGE-ON$), wrap the unchanged regions so that only the changed lines are instrumented.
see: https://github.com/scoverage/scalac-scoverage-plugin
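To see what the first step extracts, here is a minimal, self-contained sketch of the hunk-header parsing. The sample diff lines are made up for illustration, and parse_hunks is a simplified stand-in for the getChangedLineInfoFromDiffLines function in the script below:

```python
import re

# Hypothetical sample of `git diff --unified=0` output, for illustration only.
# "@@ -10,2 +12,3 @@" means: 2 lines removed starting at old line 10,
# 3 lines added starting at new line 12.
hunk_headers = [
    "@@ -10,2 +12,3 @@ def foo",
    "@@ -20 +25 @@ def bar",  # a missing count means exactly one line
]

# Capture the "+start,count" part of the hunk header.
HUNK_RE = re.compile(r"^@@ -[0-9]+(?:,[0-9]+)? \+([0-9]+)(,[0-9]+)? @@")

def parse_hunks(lines):
    """Return half-open (start, end) line ranges on the new side of the diff."""
    ranges = []
    for line in lines:
        m = HUNK_RE.match(line)
        if m:
            start = int(m.group(1))
            count = int(m.group(2)[1:]) if m.group(2) else 1
            ranges.append((start, start + count))
    return ranges

print(parse_hunks(hunk_headers))  # [(12, 15), (25, 26)]
```

These half-open ranges are exactly what the script feeds into the annotation step.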
The Python script: HandleIncrementalCoverage.py
#!/usr/bin/env python
"""
Handle code for Incremental Coverage
Tips: only active with scoverage
Principle:
1. Calculate the number of files and lines with changes through git diff
2. Using the feature of the scoverage, comments are added before and after the number of lines that have changed.
see: <https://github.com/scoverage/scalac-scoverage-plugin>
"""
import os
import re
import subprocess
import sys
def getChangedLineInfoFromDiffLines(lines):
    """
    args   : lines, the git-diff output lines describing one file
    returns: List[(changedLineStart, changedLineEnd)], half-open intervals [start, end)
    """
    changedLineInfo = []
    # Parse hunk headers like "@@ -start,count +start,count @@".
    # Group [0]: "," + count of removed lines; [1]: start line of the added hunk;
    # [2]: "," + count of added lines (absent means exactly 1)
    reg = re.compile("^@@ -[0-9]+(,[0-9]+)? \\+([0-9]+)(,[0-9]+)? @@")
    for line in lines:
        r = reg.findall(line)
        if len(r) > 0:
            changedLineStart = int(r[0][1])
            caughtLineCountStr = r[0][2]
            if len(caughtLineCountStr) > 0:
                changedLineCount = int(caughtLineCountStr[1:])
            else:
                changedLineCount = 1
            changedLineInfo.append((changedLineStart, changedLineStart + changedLineCount))
    return changedLineInfo
def getDiffLines(baseBranch='HEAD~1', newBranch='HEAD', dir="./"):
    """Get diff line ranges between two branches, grouped by file."""
    gitCmd = f"git diff --unified=0 --diff-filter=d {baseBranch} {newBranch} {dir}"
    print("Git Cmd: ", gitCmd)
    gitDiffOutputRaw = subprocess.check_output(gitCmd.split(" "))
    outputStr = gitDiffOutputRaw.decode('utf-8')
    diffOutputLines = outputStr.splitlines()
    map = {}
    separateLineReg = re.compile("^diff --git a/\\S+ b/(\\S+)")
    currentCheckFileName = ""
    diffLinesForCurrentCheckFile = []
    for i in range(len(diffOutputLines)):
        l = diffOutputLines[i]
        separateLineMatchResult = separateLineReg.findall(l)
        if len(separateLineMatchResult) > 0:
            if len(diffLinesForCurrentCheckFile) > 0:
                a = getChangedLineInfoFromDiffLines(diffLinesForCurrentCheckFile)
                map[currentCheckFileName] = a
                diffLinesForCurrentCheckFile.clear()
            # strip the submodule name prefix from the path
            currentCheckFileName = '/'.join(separateLineMatchResult[0].split('/')[1:])
        else:
            diffLinesForCurrentCheckFile.append(l)
        if i == len(diffOutputLines) - 1:
            a = getChangedLineInfoFromDiffLines(diffLinesForCurrentCheckFile)
            map[currentCheckFileName] = a
    print("Git Diff Output: ", map)
    return map
def findAllFile(base):
    for root, ds, fs in os.walk(base):
        for f in fs:
            if f.endswith('.scala'):
                fullname = os.path.join(root, f).replace('\\', '/')
                yield fullname
def preHandleIncrementalCoverage(diffDict):
    """Insert Scoverage ON/OFF markers so only changed lines are measured."""
    for i in findAllFile("src/main/scala"):
        diffInfo = diffDict[i] if i in diffDict else []
        print("Add annotation for:", i, diffInfo)
        fileData, diffLen = [], len(diffInfo)
        with open(i, "r", encoding="utf-8") as f:
            lineNum, diffIndex, diffOffset = 0, 0, 0
            for line in f:
                # drop markers left over from a previous run
                if "auto add for Incremental Coverage" in line:
                    continue
                lineNum += 1
                if lineNum == 2:
                    # turn coverage off right after the first (package) line
                    fileData.append("// $COVERAGE-OFF$ auto add for Incremental Coverage\n")
                if diffIndex < diffLen and diffInfo[diffIndex][diffOffset] == lineNum:
                    if diffOffset == 0:
                        # a changed range starts here: measure coverage again
                        fileData.append("// $COVERAGE-ON$ auto add for Incremental Coverage\n")
                        diffOffset = 1
                    else:
                        # the changed range ended just before this line
                        fileData.append("// $COVERAGE-OFF$ auto add for Incremental Coverage\n")
                        diffOffset = 0
                        diffIndex += 1
                fileData.append(line)
        with open(i, "w", encoding="utf-8") as f:
            f.write("".join(fileData))
def cleanIncrementalCoverage():
    """Remove all previously inserted coverage markers."""
    for i in findAllFile("src/main/scala"):
        fileData = []
        with open(i, "r", encoding="utf-8") as f:
            for line in f:
                if "auto add for Incremental Coverage" in line:
                    continue
                fileData.append(line)
        with open(i, "w", encoding="utf-8") as f:
            f.write("".join(fileData))
if __name__ == '__main__':
    # os.getcwd() is the root directory of the submodule
    if len(sys.argv) == 2 and sys.argv[1] == "clean":
        cleanIncrementalCoverage()
        exit(0)
    if len(sys.argv) != 3:
        raise Exception("Argv not enough. Usage: python3 HandleIncrementalCoverage.py baseBranch newBranch")
    baseBranch = sys.argv[1]
    newBranch = sys.argv[2]
    diff = getDiffLines(baseBranch, newBranch, "./src/main/scala")
    preHandleIncrementalCoverage(diff)
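To check the annotation logic without touching real files, the transformation can be dry-run in memory. The annotate function below is a simplified, hypothetical re-implementation of the marker insertion above, and the Scala snippet is made up for illustration:

```python
# Marker lines identical to the ones the script inserts.
OFF = "// $COVERAGE-OFF$ auto add for Incremental Coverage\n"
ON = "// $COVERAGE-ON$ auto add for Incremental Coverage\n"

def annotate(lines, changed_ranges):
    """Insert Scoverage ON/OFF markers around half-open changed line ranges."""
    out, idx = [], 0
    for line_num, line in enumerate(lines, start=1):
        if line_num == 2:
            out.append(OFF)  # coverage off right after the package clause
        if idx < len(changed_ranges) and changed_ranges[idx][0] == line_num:
            out.append(ON)   # changed region starts: measure coverage again
        if idx < len(changed_ranges) and changed_ranges[idx][1] == line_num:
            out.append(OFF)  # changed region ended just before this line
            idx += 1
        out.append(line)
    return out

# Made-up Scala source for the dry run.
scala_src = [
    "package demo\n",
    "object Demo {\n",
    "  def changed(): Int = 42\n",
    "  def untouched(): Int = 0\n",
    "}\n",
]
# Pretend git diff reported line 3 as changed: half-open range [3, 4)
for l in annotate(scala_src, [(3, 4)]):
    print(l, end="")
```

After annotation, only the region between the ON and OFF markers (the changed line) remains instrumented by Scoverage.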
Then update pom.xml accordingly.
Note: the dependencies and plugins here are only those used for unit testing, not the project's own. Adapt them to your actual setup.
<properties>
<hadoop.version>3.2.3</hadoop.version>
<spark.version>3.1.2</spark.version>
<scala.binary.version>2.12</scala.binary.version>
<scala.version>2.12.10</scala.version>
<scalatest.version>3.2.0</scalatest.version>
<scalatra.version>2.5.0</scalatra.version>
<json4s.version>3.6.6</json4s.version>
<commons.httpclient.version>4.5.6</commons.httpclient.version>
<skipUT>false</skipUT>
<notIncrementalCoverage>true</notIncrementalCoverage>
<baseBranch>HEAD~1</baseBranch>
<newBranch>HEAD</newBranch>
</properties>
<dependencies>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.binary.version}</artifactId>
<version>${scalatest.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalactic</groupId>
<artifactId>scalactic_${scala.binary.version}</artifactId>
<version>3.2.12</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.pegdown</groupId>
<artifactId>pegdown</artifactId>
<version>1.4.2</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.vladsch.flexmark</groupId>
<artifactId>flexmark-all</artifactId>
<version>0.35.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-scala_${scala.binary.version}</artifactId>
<version>1.16.37</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-azure</artifactId>
<version>${hadoop.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.github.tomakehurst</groupId>
<artifactId>wiremock-jre8-standalone</artifactId>
<version>2.33.2</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<sourceDirectory>${basedir}/src/main/scala</sourceDirectory>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<outputDirectory>${project.parent.basedir}/target/</outputDirectory>
</configuration>
</plugin>
<plugin>
<groupId>org.scalatest</groupId>
<artifactId>scalatest-maven-plugin</artifactId>
<version>1.0</version>
<configuration>
<reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
<junitxml>.</junitxml>
<filereports>WDF TestResult.txt</filereports>
<htmlreporters>${project.build.directory}/site/scalatest</htmlreporters>
<testFailureIgnore>false</testFailureIgnore>
<skipTests>${skipUT}</skipTests>
</configuration>
<executions>
<execution>
<id>test</id>
<phase>test</phase>
<goals>
<goal>test</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<executions>
<execution>
<id>add annotation for incremental coverage</id>
<phase>compile</phase>
<goals>
<goal>exec</goal>
</goals>
<configuration>
<skip>${notIncrementalCoverage}</skip>
<executable>python</executable>
<commandlineArgs>HandleIncrementalCoverage.py ${baseBranch} ${newBranch}</commandlineArgs>
</configuration>
</execution>
<execution>
<id>remove annotation for incremental coverage</id>
<phase>clean</phase>
<goals>
<goal>exec</goal>
</goals>
<configuration>
<executable>python</executable>
<commandlineArgs>HandleIncrementalCoverage.py clean</commandlineArgs>
</configuration>
</execution>
</executions>
</plugin>
<!--https://github.com/scoverage/scoverage-maven-plugin-->
<plugin>
<groupId>org.scoverage</groupId>
<artifactId>scoverage-maven-plugin</artifactId>
<version>${scoverage.plugin.version}</version>
<executions>
<execution>
<id>test</id>
<phase>test</phase>
<goals>
<goal>report</goal>
</goals>
</execution>
</executions>
<configuration>
<skip>${skipUT}</skip>
<scalaVersion>${scala.version}</scalaVersion>
<aggregate>true</aggregate>
<highlighting>true</highlighting>
<encoding>${project.build.sourceEncoding}</encoding>
</configuration>
</plugin>
</plugins>
</build>
Calculate incremental coverage:
echo "calculate incremental coverage between master and HEAD"
mvn -P spark-3.1 test -DnotIncrementalCoverage=false -DbaseBranch=origin/master -DnewBranch=HEAD
Calculate full coverage:
mvn -P spark-3.1 test
This article was published via OpenWrite, a write-once-publish-everywhere blogging platform.