空間轉(zhuǎn)錄組分析---SPOTlight(1.0.1版本)

空間轉(zhuǎn)錄組+單細(xì)胞聯(lián)合分析---(SPOTlight最新版教程)

作者在2022年更新了SPOTlight腳本代碼,或許有伙伴在運(yùn)行以往的版本教程(https://marcelosua.github.io/)spotlight_deconvolution函數(shù)的時(shí)候會(huì)出現(xiàn)報(bào)錯(cuò):could not find function "spotlight_deconvolution",這就說明版本更新了,以往的函數(shù)代碼作者也已經(jīng)更新。

什么是SPOTlight?

由于10x Visium數(shù)據(jù)初衷是想將每個(gè)spot看作一個(gè)細(xì)胞,但是現(xiàn)實(shí)實(shí)驗(yàn)中單個(gè)spot中包含不僅是一個(gè)細(xì)胞。如何確定每個(gè)spot中包含的細(xì)胞,有利于空間轉(zhuǎn)錄組的分析。
SPOTlight 是一個(gè)工具,能夠解讀每個(gè)捕獲位置細(xì)胞混合物內(nèi)存在的細(xì)胞類型和比例,最初是為10X的Visium-空間轉(zhuǎn)錄組學(xué)技術(shù)而開發(fā)。
SPOTlight可以結(jié)合單細(xì)胞轉(zhuǎn)錄組RNA測序信息反卷積deconvolute空間數(shù)據(jù),識(shí)別每個(gè)spot中的細(xì)胞類型和比例。SPOTlight利用這兩種數(shù)據(jù)類型的優(yōu)勢,能夠?qū)T與scRNA-seq數(shù)據(jù)集成,從而推斷出復(fù)雜組織中細(xì)胞類型和狀態(tài)的位置。SPOTlight基于一個(gè)種子的非負(fù)矩陣因子分解回歸(Seeded NMF regression ),使用細(xì)胞類型標(biāo)記基因和非負(fù)最小二乘(NNLS)初始化,隨后去卷積ST捕獲位置(spot)。作者文章:Elosua-Bayes M, Nieto P, Mereu E, Gut I, Heyn H (2021): SPOTlight: seeded NMF regression to deconvolute spatial transcriptomics spots with single-cell transcriptomes. Nucleic Acids Res 49(9):e50. doi:10.1093/nar/gkab043.
研究流程圖如下:

image.png

研究代碼如下:
首先安裝SPOTlight包

install.packages("BiocManager")
BiocManager::install("SPOTlight")
# Or the devel version
BiocManager::install("SPOTlight", version = "devel")
#或者從Github中安裝
install.packages("devtools")
library(devtools)
install_github("https://github.com/MarcElosua/SPOTlight")

加載包開始分析

library(ggplot2)
library(SPOTlight)
library(SingleCellExperiment)
library(SpatialExperiment)
library(scater)
library(scran)
library(scuttle)
library(Matrix)
library(data.table)
library(Seurat)
library(SeuratData)
library(dplyr)
library(gt)
library(igraph)
library(RColorBrewer)
## Loading the spatial and single cell data:
spe <- readRDS("path:/scRNA")
sce <- readRDS("path:/scRNA.rds") ##單細(xì)胞數(shù)據(jù)經(jīng)scater處理的,對(duì)接logNormCounts
##代碼空轉(zhuǎn)和單細(xì)胞演示數(shù)據(jù)是作者從Biocondcutor packages下載
library(TENxVisiumData) ##空轉(zhuǎn)數(shù)據(jù)
spe <- MouseKidneyCoronal()
# Use symbols instead of Ensembl IDs as feature names
rownames(spe) <- rowData(spe)$symbol
##單細(xì)胞數(shù)據(jù)集來自:[Tabula Muris Sensis](https://www.nature.com/articles/s41586-020-2496-1) 
library(TabulaMurisSenisData)  ##單細(xì)胞數(shù)據(jù)
sce <- TabulaMurisSenisDroplet(tissues = "Kidney")$Kidney
##快速查看數(shù)據(jù):
table(sce$free_annotation, sce$age)
# 作者為了去除批次影響挑選特定年齡段的小鼠細(xì)胞
sce <- sce[, sce$age == "18m"]
# 挑選具有明確注釋的細(xì)胞類別,這個(gè)可以根據(jù)自身情況調(diào)整
sce <- sce[, !sce$free_annotation %in% c("nan", "CD45")]
## Preprocessing
### Feature selection(第一步獲得每種細(xì)胞類型的標(biāo)記基因,歸一化處理)
sce <- logNormCounts(sce)  
##獲得既不是核糖體也不是線粒體的基因向量
genes <- !grepl(pattern = "^Rp[l|s]|Mt", x = rownames(sce))
dec <- modelGeneVar(sce, subset.row = genes)
plot(dec$mean, dec$total, xlab = "Mean log-expression", ylab = "Variance")
curve(metadata(dec)$trend(x), col = "blue", add = TRUE)
# 獲得3000個(gè)的高變基因HVGs.
hvg <- getTopHVGs(dec, n = 3000)
colLabels(sce) <- colData(sce)$celltype
# 計(jì)算標(biāo)記基因,表明該基因?qū)υ摷?xì)胞類型的重要性
mgs <- scoreMarkers(sce, subset.row = genes)
mgs_fil <- lapply(names(mgs), function(i) {
    x <- mgs[[i]]
    # 篩選并保留相關(guān)標(biāo)記基因,即AUC>0.8的基因
    x <- x[x$mean.AUC > 0.8, ]
    # 將基因從高到低的權(quán)重排序
    x <- x[order(x$mean.AUC, decreasing = TRUE), ]
    # 將基因和聚類ID添加到數(shù)據(jù)框中
    x$gene <- rownames(x)
    x$cluster <- i
    data.frame(x)
})
mgs_df <- do.call(rbind, mgs_fil)
##Cell Downsampling(每個(gè)細(xì)胞類別最多隨機(jī)選擇100個(gè)細(xì)胞,<100個(gè)細(xì)胞都將被使用。轉(zhuǎn)錄譜明顯不同的(如B和T),可使用較少N的細(xì)胞;轉(zhuǎn)錄譜相似的細(xì)胞增加N)
# 按細(xì)胞類別分割細(xì)胞索引
idx <- split(seq(ncol(sce)), sce$celltype)
# 每個(gè)類別和亞群最多降樣到20個(gè)
# 我們?cè)谶@里使用5個(gè)來加快進(jìn)程,但對(duì)于你的實(shí)際情況,可以設(shè)置為75-100個(gè)。
# 生活模式分析
n_cells <- 5
cs_keep <- lapply(idx, function(i) {
    n <- length(i)
    if (n < n_cells)
        n_cells <- n
    sample(i, n_cells)
})
sce <- sce[, unlist(cs_keep)]
## Deconvolution(運(yùn)行 "SPOTlight 函數(shù)"來解構(gòu)spot)
res <- SPOTlight(
    x = sce,
    y = spe,
    groups = as.character(sce$free_annotation),
    mgs = mgs_df,
    hvg = hvg,
    weight_id = "mean.AUC",
    group_id = "cluster",
    gene_id = "gene")
##或者分兩步運(yùn)行`SPOTlight'得到訓(xùn)練好的模型,在其他數(shù)據(jù)集上可重復(fù)使用。
mod_ls <- trainNMF(
    x = sce,
    y = spe,
    groups = sce$type,
    mgs = mgs,
    weight_id = "weight",
    group_id = "type",
    gene_id = "gene")
 # Run deconvolution(運(yùn)行去卷積)
res <- runDeconvolution(
    x = spe,
    mod = mod_ls[["mod"]],
    ref = mod_ls[["topic"]])

##從`SPOTlight`中提取數(shù)據(jù)
# 提取去卷積矩陣
head(mat <- res$mat)[, seq_len(3)]
# 提取NMF模型擬合
mod <- res$NMF
## 可視化過程
## Topic profiles(設(shè)置`facet = FALSE`,topic profile表征細(xì)胞類別,每個(gè)細(xì)胞類別都有一個(gè)獨(dú)特的主題特征)
plotTopicProfiles(
    x = mod,
    y = sce$free_annotation,
    facet = FALSE,
    min_prop = 0.01,
    ncol = 1) +
    theme(aspect.ratio = 1)
##確保所有來自同一細(xì)胞類別的細(xì)胞都有類似的topic profile
plotTopicProfiles(
    x = mod,
    y = sce$free_annotation,
    facet = TRUE,
    min_prop = 0.01,
    ncol = 6)
##查看模型為每個(gè)主題學(xué)習(xí)了哪些基因,更高的數(shù)值表明該基因與該主題更相關(guān)
library(NMF)
sign <- basis(mod)
colnames(sign) <- paste0("Topic", seq_len(ncol(sign)))
head(sign)
# 這可以用DT動(dòng)態(tài)可視化,如下所示
# DT::datatable(sign, fillContainer = TRUE, filter = "top")
## Spatial Correlation Matrix(空間相關(guān)矩陣)
#參照(http://www.sthda.com/english/wiki/ggcorrplot-visualization-of-a-correlation-matrix-using-ggplot2)獲得參數(shù)
plotCorrelationMatrix(mat)
## Co-localization(空間共定位及相互作用)
#參照(https://www.r-graph-gallery.com/network.html) 獲取參數(shù).
plotInteractions(mat, which = "heatmap", metric = "prop")
plotInteractions(mat, which = "heatmap", metric = "jaccard")
plotInteractions(mat, which = "network")
## Scatterpie(將細(xì)胞類型的比例可視化為餅狀圖)
ct <- colnames(mat)
mat[mat < 0.1] <- 0
# 定義調(diào)色板
# 使用'colorBlindness'軟件包中的'paletteMartin'
paletteMartin <- c(
    "#000000", "#004949", "#009292", "#ff6db6", "#ffb6db", 
    "#490092", "#006ddb", "#b66dff", "#6db6ff", "#b6dbff", 
    "#920000", "#924900", "#db6d00", "#24ff24", "#ffff6d")
pal <- colorRampPalette(paletteMartin)(length(ct))
names(pal) <- ct
plotSpatialScatterpie(
    x = spe,
    y = mat,
    cell_types = colnames(mat),
    img = FALSE,
    scatterpie_alpha = 1,
    pie_scale = 0.4) +
    scale_fill_manual(
        values = pal,
        breaks = names(pal))
##在圖像下方--將其逆時(shí)針旋轉(zhuǎn)90度,并在橫軸上做鏡像,以顯示如果spot不在圖像上覆蓋該如何對(duì)齊
plotSpatialScatterpie(
    x = spe,
    y = mat,
    cell_types = colnames(mat),
    img = FALSE,
    scatterpie_alpha = 1,
    pie_scale = 0.4, 
    #  將圖像逆時(shí)針旋轉(zhuǎn)90度
    degrees = -90,
     # 將圖像在其X軸上進(jìn)行旋轉(zhuǎn)
    axis = "h") +
    scale_fill_manual(
        values = pal,
        breaks = names(pal))
## Residuals
##最后觀察模型對(duì)每個(gè)點(diǎn)的比例的預(yù)測情況,通過觀察每個(gè)斑點(diǎn)的平方之和的殘差來實(shí)現(xiàn),這表明模型不能解釋的生物信號(hào)的數(shù)量
spe$res_ss <- res[[2]][colnames(spe)]
xy <- spatialCoords(spe)
spe$x <- xy[, 1]
spe$y <- xy[, 2]
ggcells(spe, aes(x, y, color = res_ss)) +
    geom_point() +
    scale_color_viridis_c() +
    coord_fixed() +
    theme_bw()
# Session information(查看目前相關(guān)包的版本信息)
sessionInfo()

其中可能會(huì)有代碼報(bào)錯(cuò),難點(diǎn)是在單細(xì)胞數(shù)據(jù)的準(zhǔn)備和各個(gè)軟件的版本,如果報(bào)錯(cuò)請(qǐng)查看數(shù)據(jù)格式是否與作者所演示的數(shù)據(jù)集格式相符。最后附上Github網(wǎng)站參考:(https://github.com/MarcElosua/SPOTlight

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡書系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容