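Over the holiday break, let's go over some basics.
An Introduction to the corrplot Package
Introduction
The corrplot package provides a graphical display of a correlation matrix and its confidence intervals. It also contains several algorithms for reordering the matrix. In addition, corrplot is good at details: choice of colors, text labels, color labels, layout and so on.
Visualization methods
The corrplot package offers seven visualization methods (parameter method): "circle", "square", "ellipse", "number", "shade", "color" and "pie".
Positive correlations are shown in blue and negative correlations in red. Color intensity and circle size are proportional to the absolute value of the correlation coefficient.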
library(corrplot)
## corrplot 0.84 loaded
M <- cor(mtcars)
corrplot(M, method = "circle")
corrplot(M, method = "square")
corrplot(M, method = "ellipse")
corrplot(M, method = "number") # Display the correlation coefficient
corrplot(M, method = "shade")
corrplot(M, method = "color")
corrplot(M, method = "pie")
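Layout
There are three layout types (parameter type):
- "full" (default): display the full correlation matrix
- "upper": display the upper triangle of the correlation matrix
- "lower": display the lower triangle of the correlation matrix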
corrplot(M, type = "upper")
corrplot.mixed() is a wrapper function for mixing two visualization styles in one plot.
corrplot.mixed(M)
corrplot.mixed(M, lower.col = "black", number.cex = .7)
corrplot.mixed(M, lower = "ellipse", upper = "circle")
corrplot.mixed(M, lower = "square", upper = "circle", tl.col = "black")
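Reordering a correlation matrix
A correlation matrix can be reordered according to its correlation coefficients. This helps reveal hidden structure and patterns in the matrix. corrplot provides four methods (parameter order): "AOE", "FPC", "hclust" and "alphabet". More algorithms can be found in the seriation package.
You can also reorder the matrix "manually" with the function corrMatOrder().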
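- "AOE": the angular order of the eigenvectors. It is computed from the order of the angles a_i,

$$a_i = \begin{cases} \tan^{-1}(e_{i2}/e_{i1}), & \text{if } e_{i1} > 0; \\ \tan^{-1}(e_{i2}/e_{i1}) + \pi, & \text{otherwise,} \end{cases}$$

  where e_1 and e_2 are the eigenvectors belonging to the two largest eigenvalues of the correlation matrix.
- "FPC": the first principal component order.
- "hclust": the hierarchical clustering order, with "hclust.method" giving the agglomeration method to use. "hclust.method" should be one of "ward", "single", "complete", "average", "mcquitty", "median" or "centroid".
- "alphabet": alphabetical order.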
corrplot(M, order = "AOE")
corrplot(M, order = "hclust")
corrplot(M, order = "FPC")
corrplot(M, order = "alphabet")
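The "AOE" ordering can be reproduced by hand, which makes it easier to see what the angles mean. The following is a minimal base R sketch written for illustration, not a copy of corrplot's source; the variable names (vecs, angle, aoe_order, M_aoe) are made up here. The cross-check at the end only runs if corrplot is installed.

```r
# Base R sketch of the "AOE" (angular order of the eigenvectors) ordering.
M <- cor(mtcars)

# For a symmetric matrix, eigen() returns eigenvalues in decreasing order,
# so columns 1 and 2 hold the two leading eigenvectors.
vecs <- eigen(M)$vectors
e1 <- vecs[, 1]
e2 <- vecs[, 2]

# Angle of each variable in the plane spanned by e1 and e2.
angle <- ifelse(e1 > 0, atan(e2 / e1), atan(e2 / e1) + pi)

aoe_order <- order(angle)          # a permutation of 1..ncol(M)
M_aoe <- M[aoe_order, aoe_order]   # reordered correlation matrix
print(colnames(M_aoe))

# Optional cross-check against the package, if it is available.
if (requireNamespace("corrplot", quietly = TRUE)) {
  print(corrplot::corrMatOrder(M, order = "AOE"))
  print(aoe_order)
}
```

Variables whose loadings on the two leading eigenvectors point in similar directions get similar angles and therefore end up next to each other in the plot.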
With order = "hclust", corrplot() can draw rectangles around groups of variables in the plot, based on the result of the hierarchical clustering.
corrplot(M, order = "hclust", addrect = 2)
corrplot(M, order = "hclust", addrect = 3)
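To see which variables end up in which rectangle, the clustering can be approximated in base R. This sketch assumes a dissimilarity of 1 - r and complete linkage; treat both as assumptions, since corrplot's internals and defaults may differ between versions.

```r
# Sketch: group the mtcars variables into k clusters from their correlations.
M <- cor(mtcars)

# Assumption: highly correlated variables are "close" (dissimilarity 1 - r).
tree <- hclust(as.dist(1 - M), method = "complete")

# Cut the dendrogram into 3 groups, analogous to addrect = 3.
groups <- cutree(tree, k = 3)

# Variables listed per cluster, and the cluster sizes.
print(split(names(groups), groups))
print(table(groups))
```

Each cluster corresponds to one candidate rectangle along the diagonal once the matrix is arranged in dendrogram order (tree$order).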
# Change background color to lightblue
corrplot(M, type = "upper", order = "hclust",
         col = c("black", "white"), bg = "lightblue")
Using different color spectra
col1 <- colorRampPalette(c("#7F0000", "red", "#FF7F00", "yellow", "white",
                           "cyan", "#007FFF", "blue", "#00007F"))
col2 <- colorRampPalette(c("#67001F", "#B2182B", "#D6604D", "#F4A582",
                           "#FDDBC7", "#FFFFFF", "#D1E5F0", "#92C5DE",
                           "#4393C3", "#2166AC", "#053061"))
col3 <- colorRampPalette(c("red", "white", "blue"))
col4 <- colorRampPalette(c("#7F0000", "red", "#FF7F00", "yellow", "#7FFF7F",
                           "cyan", "#007FFF", "blue", "#00007F"))
whiteblack <- c("white", "black")
## using these color spectra
corrplot(M, order = "hclust", addrect = 2, col = col1(100))
corrplot(M, order = "hclust", addrect = 2, col = col2(50))
corrplot(M, order = "hclust", addrect = 2, col = col3(20))
corrplot(M, order = "hclust", addrect = 2, col = col4(10))
corrplot(M, order = "hclust", addrect = 2, col = whiteblack, bg = "gold2")
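It may help to remember that colorRampPalette() does not return colors but a function; calling that function with n yields n interpolated colors. A short sketch (pal5 and pal200 are names chosen here for illustration):

```r
# colorRampPalette() builds a palette *function* from a few anchor colors.
col3 <- colorRampPalette(c("red", "white", "blue"))

pal5 <- col3(5)      # coarse palette: 5 color steps from red to blue
pal200 <- col3(200)  # fine palette: 200 color steps

print(pal5)          # hex color strings
print(length(pal200))
```

A larger n gives smoother color transitions in the plot, while a small n (as in col4(10) above) produces visible color bands.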
You can also use the standard color palettes (package grDevices).
corrplot(M, order = "hclust", addrect = 2, col = heat.colors(100))
corrplot(M, order = "hclust", addrect = 2, col = terrain.colors(100))
corrplot(M, order = "hclust", addrect = 2, col = cm.colors(100))
corrplot(M, order = "hclust", addrect = 2, col = gray.colors(100))
Another option is the RColorBrewer package.
library(RColorBrewer)
corrplot(M, type = "upper", order = "hclust",
         col = brewer.pal(n = 8, name = "RdBu"))
corrplot(M, type = "upper", order = "hclust",
         col = brewer.pal(n = 8, name = "RdYlBu"))
corrplot(M, type = "upper", order = "hclust",
         col = brewer.pal(n = 8, name = "PuOr"))
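Changing the color and rotation of text labels and legend
Parameters named cl.* control the color legend, and parameters named tl.* control the text legend. For the text labels, tl.col (text label color) and tl.srt (text label string rotation) change the text color and rotation.
Here are some examples.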
## remove color legend and text legend
corrplot(M, order = "AOE", cl.pos = "n", tl.pos = "n")
## bottom color legend, diagonal text legend, rotate text label
corrplot(M, order = "AOE", cl.pos = "b", tl.pos = "d", tl.srt = 60)
## a wider color legend with numbers right aligned
corrplot(M, order = "AOE", cl.ratio = 0.2, cl.align = "r")
## text labels rotated 45 degrees
corrplot(M, type = "lower", order = "hclust", tl.col = "black", tl.srt = 45)
Dealing with a non-correlation matrix
corrplot(abs(M), order = "AOE", col = col3(200), cl.lim = c(0, 1))
## visualize a matrix in [-100, 100]
ran <- round(matrix(runif(225, -100,100), 15))
corrplot(ran, is.corr = FALSE, method = "square")
## a beautiful color legend
corrplot(ran, is.corr = FALSE, method = "ellipse", cl.lim = c(-100, 100))
If the matrix is rectangular, you can adjust the aspect ratio with the win.asp parameter so that the cells are rendered as squares.
ran <- matrix(rnorm(70), ncol = 7)
corrplot(ran, is.corr = FALSE, win.asp = .7, method = "circle")
Dealing with missing (NA) values
By default, corrplot renders NA values as "?" characters. The na.label parameter lets you use a different label (at most two characters are supported).
M2 <- M
diag(M2) <- NA
corrplot(M2)
corrplot(M2, na.label = "o")
corrplot(M2, na.label = "NA")
Using "plotmath" expressions in labels
Since version 0.78, plotmath expressions can be used in variable names. To activate plotmath rendering, prefix the label with one of the characters ":", "=" or "$".
M2 <- M[1:5,1:5]
colnames(M2) <- c("alpha", "beta", ":alpha+beta", ":a[0]", "=a[beta]")
rownames(M2) <- c("alpha", "beta", NA, "$a[0]", "$ a[beta]")
corrplot(M2)
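To get a feel for which labels are treated as math, the prefix rule can be mimicked in base R. This is only an illustration of the rule stated above, not corrplot's own parsing code; has_prefix, stripped and exprs are hypothetical names.

```r
# Which labels carry a plotmath prefix, and what expression do they contain?
labels <- c("alpha", "beta", ":alpha+beta", ":a[0]", "=a[beta]")

# A label qualifies if its first character is ":", "=" or "$".
has_prefix <- substr(labels, 1, 1) %in% c(":", "=", "$")

# Drop the prefix and parse the remainder into an R expression.
stripped <- substring(labels[has_prefix], 2)
exprs <- lapply(stripped, function(s) parse(text = s)[[1]])

print(data.frame(label = labels, plotmath = has_prefix))
print(exprs)
```

Labels without a prefix, such as "alpha", are drawn literally as text; the prefixed ones are typeset, so ":alpha+beta" is shown with Greek letters.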
Combining the correlogram with significance tests
res1 <- cor.mtest(mtcars, conf.level = .95)
res2 <- cor.mtest(mtcars, conf.level = .99)
## mark insignificant coefficients according to the significance level
corrplot(M, p.mat = res1$p, sig.level = .2)
corrplot(M, p.mat = res1$p, sig.level = .05)
corrplot(M, p.mat = res1$p, sig.level = .01)
## leave insignificant coefficients blank
corrplot(M, p.mat = res1$p, insig = "blank")
## print p-values on insignificant coefficients
corrplot(M, p.mat = res1$p, insig = "p-value")
## print all p-values
corrplot(M, p.mat = res1$p, insig = "p-value", sig.level = -1)
## mark insignificant coefficients with a cross
corrplot(M, p.mat = res1$p, order = "hclust", insig = "pch", addrect = 3)
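A p-value matrix such as res1$p can be thought of as one correlation test per pair of columns. The helper p_matrix() below is hypothetical and written only to illustrate that idea with base R's cor.test(); it assumes the default Pearson test and is not the package's implementation.

```r
# Hypothetical helper: pairwise p-values from cor.test() (Pearson by default).
p_matrix <- function(data, ...) {
  data <- as.matrix(data)
  n <- ncol(data)
  p <- matrix(NA_real_, n, n,
              dimnames = list(colnames(data), colnames(data)))
  diag(p) <- 0  # convention: a variable against itself gets p = 0
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # Fill both triangles so the matrix stays symmetric.
      p[i, j] <- p[j, i] <- cor.test(data[, i], data[, j], ...)$p.value
    }
  }
  p
}

p <- p_matrix(mtcars)

# Number of variable pairs that would be flagged at sig.level = 0.05.
print(sum(p[upper.tri(p)] > 0.05))
```

Cells whose p-value exceeds sig.level are the ones that the insig argument then blanks out, crosses out or annotates.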
Visualizing confidence intervals
corrplot(M, low = res1$lowCI, upp = res1$uppCI, order = "hclust",
         rect.col = "navy", plotC = "rect", cl.pos = "n")
corrplot(M, p.mat = res1$p, low = res1$lowCI, upp = res1$uppCI,
         order = "hclust", pch.col = "red", sig.level = 0.01,
         addrect = 3, rect.col = "navy", plotC = "rect", cl.pos = "n")
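The lowCI and uppCI matrices hold, for each pair of variables, the lower and upper bound of a confidence interval for the correlation. For a single pair, an interval of this kind can be inspected with base R's cor.test(); a minimal sketch, assuming the default Pearson method:

```r
# Confidence interval for one pair of variables (mpg vs. wt).
ct <- cor.test(mtcars$mpg, mtcars$wt, conf.level = 0.95)

r  <- unname(ct$estimate)    # point estimate of the correlation
ci <- as.numeric(ct$conf.int) # c(lower bound, upper bound)

cat(sprintf("r = %.3f, 95%% CI = [%.3f, %.3f]\n", r, ci[1], ci[2]))
```

A narrow interval that stays far from zero indicates a correlation that is both strong and precisely estimated; raising conf.level (as in res2) widens every interval.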
res1 <- cor.mtest(mtcars, conf.level = .95)
corrplot(M, p.mat = res1$p, insig = "label_sig",
         sig.level = c(.001, .01, .05), pch.cex = .9, pch.col = "white")
corrplot(M, p.mat = res1$p, method = "color",
         insig = "label_sig", pch.col = "white")
corrplot(M, p.mat = res1$p, method = "color", type = "upper",
         sig.level = c(.001, .01, .05), pch.cex = .9,
         insig = "label_sig", pch.col = "white", order = "AOE")
corrplot(M, p.mat = res1$p, insig = "label_sig", pch.col = "white",
         pch = "p<.05", pch.cex = .5, order = "AOE")
Customizing the correlogram
# matrix of p-values for the correlations
p.mat <- cor.mtest(mtcars)$p
head(p.mat[, 1:5])
##              [,1]         [,2]         [,3]         [,4]         [,5]
## [1,] 0.000000e+00 6.112687e-10 9.380327e-10 1.787835e-07 1.776240e-05
## [2,] 6.112687e-10 0.000000e+00 1.802838e-12 3.477861e-09 8.244636e-06
## [3,] 9.380327e-10 1.802838e-12 0.000000e+00 7.142679e-08 5.282022e-06
## [4,] 1.787835e-07 3.477861e-09 7.142679e-08 0.000000e+00 9.988772e-03
## [5,] 1.776240e-05 8.244636e-06 5.282022e-06 9.988772e-03 0.000000e+00
## [6,] 1.293959e-10 1.217567e-07 1.222320e-11 4.145827e-05 4.784260e-06
# Mark insignificant coefficients according to the significance level
corrplot(M, type = "upper", order = "hclust",
         p.mat = p.mat, sig.level = 0.01)
# Leave insignificant coefficients blank
corrplot(M, type = "upper", order = "hclust",
         p.mat = p.mat, sig.level = 0.01, insig = "blank")
In the plots above, correlations with a p-value > 0.01 are treated as insignificant. For those cells, either a cross is drawn over the coefficient or the cell is left blank.
col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(M, method = "color", col = col(200),
         type = "upper", order = "hclust", number.cex = .7,
         addCoef.col = "black", # Add coefficient of correlation
         tl.col = "black", tl.srt = 90, # Text label color and rotation
         # Combine with significance
         p.mat = p.mat, sig.level = 0.01, insig = "blank",
         # hide correlation coefficient on the principal diagonal
         diag = FALSE)
Exploring a large feature matrix
# generating large feature matrix (cols=features, rows=samples)
num_features <- 60 # how many features
num_samples <- 300 # how many samples
DATASET <- matrix(runif(num_features * num_samples),
                  nrow = num_samples, ncol = num_features)
# setting some dummy names for the features e.g. f23
colnames(DATASET) <- paste0("f", 1:ncol(DATASET))
# let's make 30% of all features to be correlated with feature "f1"
num_feat_corr <- num_features * .3
idx_correlated_features <- as.integer(seq(from = 1,
                                          to = num_features,
                                          length.out = num_feat_corr))[-1]
for (i in idx_correlated_features) {
  DATASET[, i] <- DATASET[, 1] + runif(num_samples) # adding some noise
}
corrplot(cor(DATASET), diag = FALSE, order = "FPC",
         tl.pos = "td", tl.cex = 0.5, method = "color", type = "upper")
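Before reading the plot, it can be worth checking numerically that the planted structure is really there. The sketch below regenerates the same kind of data under a fixed seed (the seed and the names X, idx, r_f1 and planted are assumptions made here for reproducibility and illustration) and compares the correlations with f1 for planted and untouched features.

```r
set.seed(42)  # assumption: fixed seed, only to make this sketch reproducible

num_features <- 60
num_samples <- 300
X <- matrix(runif(num_features * num_samples),
            nrow = num_samples, ncol = num_features)
colnames(X) <- paste0("f", seq_len(ncol(X)))

# Same construction as above: about 30% of the features follow f1 plus noise.
idx <- as.integer(seq(from = 1, to = num_features,
                      length.out = num_features * .3))[-1]
for (i in idx) {
  X[, i] <- X[, 1] + runif(num_samples)
}

# Correlation of f1 with every other feature.
r_f1 <- cor(X)[1, -1]
planted <- names(r_f1) %in% paste0("f", idx)

print(summary(r_f1[planted]))        # features built from f1
print(summary(abs(r_f1[!planted])))  # independent features
```

Because f1 and the added noise are independent with equal variance, the planted correlations are 1/sqrt(2), roughly 0.71, in expectation, while the remaining features should scatter around zero. That contrast is what shows up as the dark block in the "FPC"-ordered plot.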