多細(xì)胞生物的細(xì)胞與細(xì)胞之間往往會通過細(xì)胞因子和膜蛋白等進(jìn)行通訊,從而調(diào)節(jié)生命活動,保證生命體高效、有序的運(yùn)作。其中,受體-配體介導(dǎo)的細(xì)胞間通訊對協(xié)調(diào)發(fā)育、分化和疾病等多種生物學(xué)過程至關(guān)重要。細(xì)胞通訊分析主要通過統(tǒng)計不同細(xì)胞類型中受體和配體的表達(dá)及配對情況,結(jié)合分子信息數(shù)據(jù)庫推斷不同細(xì)胞之間的相互作用。因此,利用單細(xì)胞轉(zhuǎn)錄組數(shù)據(jù)分析的細(xì)胞通訊,僅限于蛋白質(zhì)配體-受體復(fù)合物介導(dǎo)的細(xì)胞間通訊,其分析的基礎(chǔ)是基因表達(dá)數(shù)據(jù)和配體-受體數(shù)據(jù)庫。例如轉(zhuǎn)錄組數(shù)據(jù)表明A細(xì)胞和B細(xì)胞分別表達(dá)了基因α和β,通過數(shù)據(jù)庫查詢α和β是配體-受體關(guān)系,則認(rèn)為A-B通過α-β途徑進(jìn)行了通訊。
一、細(xì)胞通訊分析常用方法
基于單細(xì)胞數(shù)據(jù)進(jìn)行的細(xì)胞通訊分析往往涉及到表達(dá)量矩陣,受體-配體表達(dá),受體-配體互作強(qiáng)度等信息。常見的單細(xì)胞轉(zhuǎn)錄組細(xì)胞通訊分析工具有CellPhoneDB、celltalker和iTALK等,最權(quán)威的可能屬CellPhoneDB,比較常用。接下來,本期內(nèi)容將針對這三種常見的細(xì)胞通訊工具做一簡單介紹。
1、CellPhoneDB
CellPhoneDB主要通過一種細(xì)胞類型的受體和另一種細(xì)胞類型的配體的表達(dá)信息,得到不同細(xì)胞類型之間的相互作用。CellPhoneDB獨特之處是不僅包含了數(shù)據(jù)庫注釋的受體和配體,還提供了人工注釋的參與細(xì)胞通訊的蛋白質(zhì)家族,提供受體和配體的亞基結(jié)構(gòu),這是大多數(shù)據(jù)庫和研究沒有涉及的,這點對于準(zhǔn)確研究多亞基復(fù)合物介導(dǎo)的細(xì)胞通訊很關(guān)鍵,可以對細(xì)胞間的通訊分子進(jìn)行全面、系統(tǒng)的分析。CellPhoneDB數(shù)據(jù)庫概況如下圖所示:

CellPhoneDB具體分析流程如圖:

2、Celltalker
Celltalker的分析采納的數(shù)據(jù)來自Ramilowski et al (Nature Communications, 2015)的研究成果,并且允許用戶自己添加配對信息。celltalker分析細(xì)胞通訊的特點是更傾向于分析樣本之間具有差異的細(xì)胞通訊關(guān)系,其構(gòu)建celltalker對象的函數(shù)也要求輸入分組信息。為了保證分析的可靠性,它要求一個分組要有幾個重復(fù)樣本。
celltalker認(rèn)定細(xì)胞進(jìn)行通訊的前提是配體和受體的表達(dá)值在通訊的細(xì)胞之間具有一致性。
celltalker包的數(shù)據(jù)提取和圖形調(diào)整相對復(fù)雜,有耐心的可以細(xì)調(diào),本人不建議使用此包!
library(Seurat)
library(tidyverse)
library(celltalker)
scRNA <- readRDS("scRNA.rds")
##調(diào)整scRNA的meta.data,方便滿足celltalker的數(shù)據(jù)要求
cell_meta = scRNA@meta.data[,1:9]
names(cell_meta)[which(names(cell_meta)=='orig.ident')] <- "sample.id"
names(cell_meta)[which(names(cell_meta)=='celltype_Monaco')] <- "cell.type"
cell_meta$sample.type <- cell_meta$sample.id
cell_meta[grep("^HNC[0-9]+PBMC", cell_meta$sample.type), "sample.type"] <- "HNC_PBMC"
cell_meta[grep("^HNC[0-9]+TIL", cell_meta$sample.type), "sample.type"] <- "HNC_TIL"
cell_meta[grep("^PBMC", cell_meta$sample.type), "sample.type"] <- "PBMC"
cell_meta[grep("^Tonsil", cell_meta$sample.type), "sample.type"] <- "Tonsil"
scRNA@meta.data <- cell_meta
##提取數(shù)據(jù)子集
cellsub = rownames(subset(cell_meta, sample.type=="HNC_TIL"|sample.type=="Tonsil"))
scRNA <- scRNA[,cellsub]
##鑒定scRNA中存在的配體受體
ligs <- as.character(unique(ramilowski_pairs$ligand))
recs <- as.character(unique(ramilowski_pairs$receptor))
ligs.present <- rownames(scRNA)[rownames(scRNA) %in% ligs]
recs.present <- rownames(scRNA)[rownames(scRNA) %in% recs]
genes.to.use <- union(ligs.present,recs.present)
##鑒定樣本分組之間差異表達(dá)的配體受體
Idents(scRNA) <- "sample.type"
markers <- FindAllMarkers(scRNA,assay="RNA",features=genes.to.use,only.pos=TRUE)
ligs.recs.use <- unique(markers$gene)
##保留差異表達(dá)的配體-受體列表
interactions.forward1 <- ramilowski_pairs[as.character(ramilowski_pairs$ligand) %in% ligs.recs.use,]
interactions.forward2 <- ramilowski_pairs[as.character(ramilowski_pairs$receptor) %in% ligs.recs.use,]
interact.for <- rbind(interactions.forward1,interactions.forward2)
##準(zhǔn)備celltalker的輸入文件
expr.mat <- GetAssayData(scRNA, slot="counts")
defined.clusters <- scRNA@meta.data$cell.type
defined.groups <- scRNA@meta.data$sample.type
defined.replicates <- scRNA@meta.data$sample.id
##重構(gòu)數(shù)據(jù),為每個樣本分配相應(yīng)的表達(dá)矩陣
reshaped.matrices <- reshape_matrices(count.matrix=expr.mat, clusters=defined.clusters,
groups=defined.groups, replicates=defined.replicates, ligands.and.receptors=interact.for)
##分析表達(dá)一致性的配體和受體。
# cells.reqd=10:每個分組中至少有10個細(xì)胞表達(dá)了配體/受體
# freq.pos.reqd=0.5:至少有50%的樣本表達(dá)的配體/受體
consistent.lig.recs <- create_lig_rec_tib(exp.tib=reshaped.matrices,
clusters=defined.clusters,
groups=defined.groups,
replicates=defined.replicates,
cells.reqd=10,
freq.pos.reqd=0.5,
ligands.and.receptors=interact.for)
##生成Celltalker對象,并繪制circos圈圖
# freq.group.in.cluster: 只對細(xì)胞數(shù)>總細(xì)胞數(shù)5%的細(xì)胞類型分析
put.int<-putative_interactions(ligand.receptor.tibble=consistent.lig.recs,
clusters=defined.clusters,
groups=defined.groups,
freq.group.in.cluster=0.05,
ligands.and.receptors=interact.for)
##鑒定分組間特異出現(xiàn)的配體/受體并作圖
unique.ints <- unique_interactions(put.int, group1="HNC_TIL", group2="Tonsil", interact.for)
#HNC_TIL組作圖
HNC_TIL <- pull(unique.ints[1,2])[[1]]
circos.HNC_TIL <- pull(put.int[1,2])[[1]][HNC_TIL]
par(MAR=c(1,1,1,1))
circos_plot(interactions=circos.HNC_TIL, clusters=defined.clusters)
#Tonsil組作圖
Tonsil <- pull(unique.ints[2,2])[[1]]
circos.Tonsil <- pull(put.int[2,2])[[1]][Tonsil]
circos_plot(interactions=circos.Tonsil, clusters=defined.clusters)
3、iTALK
iTALK這個包的應(yīng)用比較簡單,可以自定義定制的配體-受體數(shù)據(jù)庫。默認(rèn)數(shù)據(jù)庫分析物種為人,如果我們做的事其他物種,可以匹配人的同源基因。它的優(yōu)點還是集成了多種差異分析和可視化方法。將配體-受體注釋為細(xì)胞因子、生長因子、免疫檢查點和其他類。配體-受體分析的中間數(shù)據(jù)和最終結(jié)果都可以輕松導(dǎo)出,我沒有泡跑自己的數(shù)據(jù),也不常用這個包,所以以官方教程為主。感興趣的可以去官方查看教程教程:https://github.com/Coolgenome/iTALK/tree/master/example

# This example data is from 10x pbmc dataset. Samples are randomly selected from each cell type. And groups are randomly assigned to each sample to make the illustration.
library(iTALK)
# ====================準(zhǔn)備數(shù)據(jù)===================
data<-read.table('example_data.txt',sep='\t',header=T,stringsAsFactors = F)
## highly expressed ligand-receptor pairs
# find top 50 percent highly expressed genes
highly_exprs_genes<-rawParse(data,top_genes=50,stats='mean')
# find the ligand-receptor pairs from highly expressed genes
comm_list<-c('growth factor','other','cytokine','checkpoint')
cell_col<-structure(c('#4a84ad','#4a1dc6','#e874bf','#b79eed', '#ff636b', '#52c63b','#9ef49a'),names=unique(data$cell_type))
par(mfrow=c(1,2))
res<-NULL
for(comm_type in comm_list){
res_cat<-FindLR(highly_exprs_genes,datatype='mean count',comm_type=comm_type)
res_cat<-res_cat[order(res_cat$cell_from_mean_exprs*res_cat$cell_to_mean_exprs,decreasing=T),]
#plot by ligand category
#overall network plot
NetView(res_cat,col=cell_col,vertex.label.cex=1,arrow.width=1,edge.max.width=5)
#top 20 ligand-receptor pairs
LRPlot(res_cat[1:20,],datatype='mean count',
cell_col=cell_col,
link.arr.lwd=res_cat$cell_from_mean_exprs[1:20],
link.arr.width=res_cat$cell_to_mean_exprs[1:20])
title(comm_type)
res<-rbind(res,res_cat)
}
res<-res[order(res$cell_from_mean_exprs*res$cell_to_mean_exprs,decreasing=T),][1:20,]
NetView(res,col=cell_col,vertex.label.cex=1,arrow.width=1,edge.max.width=5)
LRPlot(res[1:20,],datatype='mean count',cell_col=cell_col,link.arr.lwd=res$cell_from_mean_exprs[1:20],link.arr.width=res$cell_to_mean_exprs[1:20])
## significant ligand-receptor pairs between compare groups
# randomly assign the compare group to each sample
data<-data %>% mutate(compare_group=sample(2,nrow(data),replace=TRUE))
# find DEGenes of regulatory T cells and NK cells between these 2 groups
deg_t<-DEG(data %>% filter(cell_type=='regulatory_t'),method='Wilcox',contrast=c(2,1))
deg_nk<-DEG(data %>% filter(cell_type=='cd56_nk'),method='Wilcox',contrast=c(2,1))
# find significant ligand-receptor pairs and do the plotting
par(mfrow=c(1,2))
res<-NULL
for(comm_type in comm_list){
res_cat<-FindLR(deg_t,deg_nk,datatype='DEG',comm_type=comm_type)
res_cat<-res_cat[order(res_cat$cell_from_logFC*res_cat$cell_to_logFC,decreasing=T),]
#plot by ligand category
if(nrow(res_cat)==0){
next
}else if(nrow(res_cat>=20)){
LRPlot(res_cat[1:20,],datatype='DEG',cell_col=cell_col,link.arr.lwd=res_cat$cell_from_logFC[1:20],link.arr.width=res_cat$cell_to_logFC[1:20])
}else{
LRPlot(res_cat,datatype='DEG',cell_col=cell_col,link.arr.lwd=res_cat$cell_from_logFC,link.arr.width=res_cat$cell_to_logFC)
}
NetView(res_cat,col=cell_col,vertex.label.cex=1,arrow.width=1,edge.max.width=5)
title(comm_type)
res<-rbind(res,res_cat)
}
if(is.null(res)){
print('No significant pairs found')
}else if(nrow(res)>=20){
res<-res[order(res$cell_from_logFC*res$cell_to_logFC,decreasing=T),][1:20,]
NetView(res,col=cell_col,vertex.label.cex=1,arrow.width=1,edge.max.width=5)
LRPlot(res[1:20,],datatype='DEG',cell_col=cell_col,link.arr.lwd=res$cell_from_logFC[1:20],link.arr.width=res$cell_to_logFC[1:20])
}else{
NetView(res,col=cell_col,vertex.label.cex=1,arrow.width=1,edge.max.width=5)
LRPlot(res,datatype='DEG',cell_col=cell_col,link.arr.lwd=res$cell_from_logFC,link.arr.width=res$cell_to_logFC)
}
# I just randomly assigned the compare group to samples which has no biological difference for showing how to use the package.
# So there should be no significant genes to be expected.

參考文獻(xiàn)
[1] Efremova, M., Vento-Tormo, M., Teichmann, S. A., & Vento-Tormo, R. (2020). CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes. Nature Protocols. doi:10.1038/s41596-020-0292-x
[2] Ramilowski, J. A., Goldberg, T., Harshbarger, J., Kloppmann, E., Lizio, M., Satagopam, V. P., … Forrest, A. R. R. (2015). A draft network of ligand–receptor-mediated multicellular signalling in human. Nature Communications, 6(1).doi:10.1038/ncomms8866
[3]iTALK: an R Package to Characterize and Illustrate Intercellular Communication