Introduction
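PCoA (principal coordinates analysis) is an unconstrained dimensionality-reduction method for studying how similar or different samples are. It resembles PCA, but where PCA starts from the raw variables, PCoA starts from the pairwise distances between samples taken as a whole. This fits the characteristics of ecological data better, and PCoA is the more widely used of the two for such data.
PCoA sorts the eigenvalues and their eigenvectors, keeps the few largest eigenvalues, and projects the samples onto the corresponding axes. The result amounts to a rotation of the distance matrix that preserves the original between-sample distances as far as possible in a low-dimensional space: similar samples plot close together and dissimilar samples plot farther apart.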
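The steps described above can be sketched by hand in base R. This is a minimal illustration on made-up data using Euclidean distances, not the script used later in this post; the matrix `x` and the choice of `k = 2` are assumptions for the example. It double-centres the squared distances, eigen-decomposes the result, keeps the leading axes, and compares the coordinates with those from `cmdscale()`.

```r
# Toy data: 5 samples (rows) x 3 features (columns); values are made up.
x <- matrix(c(1, 0, 3,
              2, 1, 3,
              8, 7, 0,
              9, 6, 1,
              5, 5, 5),
            nrow = 5, byrow = TRUE,
            dimnames = list(paste0("S", 1:5), paste0("F", 1:3)))

d <- as.matrix(dist(x))   # Euclidean distances between samples
n <- nrow(d)

# Step 1: double-centre the squared distances.
J <- diag(n) - matrix(1 / n, n, n)
B <- -0.5 * J %*% (d^2) %*% J

# Step 2: eigen-decompose; eigen() returns eigenvalues in decreasing order.
e <- eigen(B, symmetric = TRUE)

# Step 3: keep the top k axes and scale each eigenvector by sqrt(eigenvalue).
k <- 2
coords <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(coords) <- rownames(x)

# The coordinates should agree with cmdscale() up to the sign of each axis.
ref <- cmdscale(dist(x), k = k)
all.equal(abs(coords), abs(ref), check.attributes = FALSE)
```

The sign of an eigenvector is arbitrary, which is why the comparison uses absolute values; flipping an axis does not change any distance between samples.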
Example

PCoA plot showing the difference between sample groups, stratified to only samples originating from participants not receiving any topical treatment. The P value corresponds to an Adonis PERMANOVA test. Ellipses delineate the 75% prediction areas of samples from each group.
Script
Data format:
- OTU abundance data: an ordinary OTU table or an annotated OTU abundance table, with one row per OTU and one column per sample.
- Grouping data: group assignments that map one-to-one to the samples.
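As an illustration of the two inputs, here is a small sketch with made-up values. The column names `Taxonomy`, `Sample_name` and `Group3` follow the script in this post; the OTU names, sample names and abundances are hypothetical.

```r
# Hypothetical OTU table: one row per OTU, one column per sample.
otu_table <- data.frame(
  Taxonomy = c("OTU_1", "OTU_2", "OTU_3"),
  S1 = c(0.50, 0.30, 0.20),
  S2 = c(0.10, 0.60, 0.30),
  S3 = c(0.25, 0.25, 0.50)
)

# Hypothetical grouping table: one row per sample.
groups <- data.frame(
  Sample_name = c("S1", "S2", "S3"),
  Group3      = c("A", "A", "B")
)

# Distance functions expect samples in rows, so move the OTU names to the
# row names and transpose.
rownames(otu_table) <- otu_table$Taxonomy
otu <- data.frame(t(otu_table[-1]))

# Every sample in the abundance table should have exactly one group entry.
stopifnot(setequal(rownames(otu), groups$Sample_name),
          !anyDuplicated(groups$Sample_name))
```

Checking the one-to-one correspondence up front helps avoid samples silently ending up with a missing group when the two tables are merged.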
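Interpreting the ordination result (the list returned by cmdscale, which belongs to base R's stats package rather than to vegan): eig holds the eigenvalues of the ordination axes (dividing each by the sum of the eigenvalues gives the proportion explained by that axis); points holds the coordinates of each sample on each axis.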
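A short sketch of reading these two elements, on made-up data with Euclidean distances. One caveat worth noting: with non-Euclidean dissimilarities such as Bray-Curtis, some eigenvalues can be negative, so the plain sum of all eigenvalues may not be the most meaningful denominator. Counting only the positive eigenvalues, as done here, is one possible convention and is an assumption of this example, not something the script in this post does.

```r
# Toy data: 4 samples x 3 features; values are made up.
x <- matrix(c(1, 0, 3,
              2, 1, 3,
              8, 7, 0,
              9, 6, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("S", 1:4), NULL))

pcoa <- cmdscale(dist(x), k = 2, eig = TRUE)

head(pcoa$points)   # sample coordinates, one column per retained axis
pcoa$eig            # eigenvalues, in decreasing order

# Percentage explained by each axis.
# Assumption: only positive eigenvalues are counted in the total.
eig_pos   <- pmax(pcoa$eig, 0)
explained <- 100 * eig_pos / sum(eig_pos)
round(explained[1:2], 1)
```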
library(readxl)
library(ggplot2)
library(ggpubr)  # provides stat_chull(), used in the plot
library(patchwork)
library(tidyverse)
rm(list = ls())
file <- "C:\\Users\\...total_data\\"
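# Read the relative abundance table.
genes_abundance <- read.table(file = paste0(file, "otu_table_g_relative.xls"),
                              header = TRUE, stringsAsFactors = FALSE)
# Drop the last column.
genes_abundance <- genes_abundance[-ncol(genes_abundance)]
str(genes_abundance)
# Check for duplicated taxonomy labels; row names must be unique.
which(duplicated(genes_abundance$Taxonomy))
# Read the sample grouping information from the third sheet.
groups <- read_xls(path = paste0(file, "the_information_of_sample_site.xls"),
                   sheet = 3)
# Use the taxonomy labels as row names, then transpose so that samples are rows.
row.names(genes_abundance) <- genes_abundance$Taxonomy
otu <- genes_abundance[-1]
otu <- data.frame(t(otu))
head(otu)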
# Ordination (based on the OTU abundance table)
library(vegan)
distance <- vegdist(otu, method = 'bray')
pcoa <- cmdscale(distance, k = (nrow(otu) - 1), eig = TRUE)
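# Extract data for plotting ------------------------------------------------
# Sample coordinates (points holds the coordinates of each sample on each axis);
# keep the first two axes.
plot_data <- data.frame(pcoa$points)[1:2]
# Store the row names (sample names) in a column for later steps.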
plot_data$Sample_name <- rownames(plot_data)
names(plot_data)[1:2] <- c('PCoA1', 'PCoA2')
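# eig holds the eigenvalues of the ordination axes
# (dividing each by the sum of the eigenvalues gives the proportion explained by that axis)
eig <- pcoa$eig
# Add grouping information to the sample coordinates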
plot_data <- merge(plot_data, groups, by = 'Sample_name', all.x = TRUE)
# Plot principal coordinate axes 1 and 2
ggplot(data = plot_data, aes(x = PCoA1, y = PCoA2, color = Group3)) +
  geom_point(alpha = .7, size = 2) +
  stat_chull(fill = NA) +
  labs(x = paste("PCoA 1 (", format(100 * eig[1] / sum(eig), digits = 4), "%)", sep = ""),
       y = paste("PCoA 2 (", format(100 * eig[2] / sum(eig), digits = 4), "%)", sep = ""))
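The example caption mentions an Adonis PERMANOVA P value and 75% ellipses, neither of which the script above produces. The following is a rough sketch of how such annotations might be added, run on simulated counts so that it is self-contained. The simulated data, the group labels and the use of `stat_ellipse(level = 0.75)` as a stand-in for the "prediction areas" are all assumptions; the cited paper's exact procedure is not described in this post.

```r
library(vegan)
library(ggplot2)

set.seed(1)
# Simulated counts: 16 samples x 6 OTUs, two groups with shifted abundances.
lambda_a <- c(20, 15, 10, 5, 5, 5)
lambda_b <- c(5, 5, 10, 15, 20, 5)
otu <- rbind(t(replicate(8, rpois(6, lambda_a))),
             t(replicate(8, rpois(6, lambda_b))))
rownames(otu) <- paste0("S", 1:16)
meta <- data.frame(Sample_name = rownames(otu),
                   Group3 = rep(c("A", "B"), each = 8))

distance <- vegdist(otu, method = "bray")

# PERMANOVA: does group membership explain the between-sample distances?
perm <- adonis2(distance ~ Group3, data = meta, permutations = 999)
p_value <- perm$`Pr(>F)`[1]

pcoa <- cmdscale(distance, k = 2, eig = TRUE)
plot_data <- data.frame(PCoA1 = pcoa$points[, 1],
                        PCoA2 = pcoa$points[, 2],
                        Group3 = meta$Group3)

p <- ggplot(plot_data, aes(x = PCoA1, y = PCoA2, color = Group3)) +
  geom_point(alpha = 0.7, size = 2) +
  stat_ellipse(level = 0.75) +   # one 75% ellipse per group
  labs(subtitle = paste("PERMANOVA P =", format(p_value, digits = 3)))
print(p)
```

With 999 permutations the smallest P value that can be reported is 0.001, so a larger `permutations` value may be preferable when finer resolution matters.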
Reference
Ring HC, Thorsen J, Saunte DM, Lilje B, Bay L, Riis PT, Larsen N, Andersen LO, Nielsen HV, Miller IM, Bjarnsholt T, Fuursted K, Jemec GB. The Follicular Skin Microbiome in Patients With Hidradenitis Suppurativa and Healthy Controls. JAMA Dermatol. 2017 Sep 1;153(9):897-905. doi: 10.1001/jamadermatol.2017.0904.