非度量多維尺度法(NMDS)是一種將多維空間的研究對象(樣本或變量)簡化到低維空間進(jìn)行定位、分析和歸類,同時(shí)又保留對象間原始關(guān)系的數(shù)據(jù)分析方法。
適用于無法獲得研究對象間精確的相似性或相異性數(shù)據(jù),僅能得到他們之間等級關(guān)系數(shù)據(jù)的情形。換句話說,當(dāng)資料不適合直接進(jìn)行變量型多維尺度分析時(shí),對其進(jìn)行變量變換,再采用變量型多維尺度分析。其特點(diǎn)是根據(jù)樣品中包含的物種信息,以點(diǎn)的形式反映在多維空間上,而對不同樣品間的差異程度,則是通過點(diǎn)與點(diǎn)間的距離體現(xiàn)的,最終獲得樣品的空間定位點(diǎn)圖;
另外,針對分離度不好的樣品,spider plot 可能會(huì)有奇效。
示例

Non-metric multidimensional scaling (NMDS) plot of β-diversity based on Bray-Curtis dissimilarities in the bacteriome of the piglet GI tract. Ellipses indicate 1 standard deviation from organ centroid and spiders are drawn to GI tract region centroid. Colors indicate GI tract region, symbols indicate litter, and ellipses line types indicate specific organs of the upper and lower GI.
腳本
數(shù)據(jù)樣式:
- OTU豐度數(shù)據(jù)就是一般OTU表或注釋后的OTU豐度表,每一行為一個(gè)OTU,每一列為一個(gè)樣品。
- 分組數(shù)據(jù)為跟樣品一一對應(yīng)的分組數(shù)據(jù)。
Notice:利用OTU表做NMDS時(shí),可以獲得(樣本+物種)兩種得分信息,能夠找到更多的潛在信息。
library(vegan)
library(ggplot2)
# data --------------------------------------------------------------------
set.seed(13)
otu <- matrix(sample(c(0:200), 1200, replace = TRUE),
ncol = 12, nrow = 100,
dimnames = list(
row_names = paste0("OTU", seq(1:100)),
col_names = paste0("sample", seq(1:12))
))
groups <- data.frame(
sample = paste0("sample", seq(1:12)),
sites = rep(c("root", "soil", "rhizosphere"), 4)
)
# hellinger-transform: square root of method = "total"
otu <- t(otu)
hell_otu <- decostand(otu, "hell")
# The number of points n should be n > 2*k + 1
# default k = 2
NMDS <- metaMDS(hell_otu, k = 4, distance = "bray")
# print NMDS
# stress 應(yīng)該越小越好,通常閾值為0.2
NMDS
# Get Species or Site Scores from an Ordination
score_species <- scores(NMDS, display = "species")
score_sites <- scores(NMDS, display = "sites")
# get the stress value
stress <- round(NMDS$stress, 4)
# adds group date
# 有時(shí)groups中的sample 和score 的結(jié)果順序不一樣
# 推薦使用merge 或者 match 函數(shù)合并數(shù)據(jù)
plot_data <- cbind(as.data.frame(score_sites), groups)
# set data for spider plot ------------------------------------------------
# calculate the center of NMDS1, NMDS2, according to groups
cent <- aggregate(cbind(NMDS1, NMDS2) ~ sites,
data = plot_data, FUN = mean)
# summarise_if 有利于自動(dòng)化
plot_data %>%
group_by(sites) %>%
summarise_if(is.numeric, mean)
cent <- setNames(cent, c("sites", "oNMDS1", "oNMDS2"))
spider_data <- merge(plot_data, cent, by = "sites", sort = FALSE)
# 設(shè)置坐標(biāo)軸刻度label
theme <- function(){
list(scale_x_continuous(breaks = seq(from = -0.1,
to = 0.1,
by = 0.05),
labels = seq(from = -0.1,
to = 0.1,
by = 0.05)),
scale_y_continuous(breaks = seq(from = -0.1,
to = 0.1,
by = 0.05),
labels = seq(from = -0.1,
to = 0.1,
by = 0.05)))
}
# 可視化
ggplot(plot_data, aes(x = NMDS1, y = NMDS2,
color = sites)) +
geom_point() +
coord_fixed() +
stat_ellipse()
ggplot(spider_data, aes(x = NMDS1, y = NMDS2,
color = sites)) +
geom_segment(aes(xend = oNMDS1, yend = oNMDS2)) +
geom_point() +
geom_point()
NMDS結(jié)果評估
通常情況下我們可以根據(jù)應(yīng)力值來判斷排序模型的合理性,應(yīng)力值(Stress)最好不要大于0.2。
此外,還可以通過goodness()和stressplot() {vegan}來評估NMDS擬合優(yōu)度。
# Shepard圖: 若R2越大,則NMDS結(jié)果越合理
stressplot(NMDS, main = "Shepard Plot")
gof <- goodness(NMDS1)
plot(NMDS, display = "sites", type = "t", main = "goodness of fit statistic")
points(NMDS, cex = gof * 200, col = "red")

Reference
Arfken AM, Frey JF, Ramsay TG, Summers KL. Yeasts of Burden: Exploring the Mycobiome-Bacteriome of the Piglet GI Tract. Front Microbiol. 2019 Oct 8;10:2286. doi: 10.3389/fmicb.2019.02286