Single-Cell 36 Stratagems, No. 2, "Besiege Wei to Rescue Zhao": Integration Methods for Single-Cell Transcriptomes

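2. Besiege Wei to Rescue Zhao
The phrase originally meant besieging the capital of Wei in order to rescue Zhao. It now refers to the tactic of outflanking the enemy's rear to force a withdrawal.
Rather than attacking a concentrated, powerful enemy force, it is better to make that force disperse and weaken before attacking. Rather than striking where the enemy is strong, it is more effective to strike where the enemy is weak.
In other words, when the enemy is strong, avoid a head-on decisive battle. Take an indirect approach that forces the enemy to divide its forces, then seize on its weak point and attack decisively.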

I. Concepts



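Reference: the dataset obtained by integrating single-cell data generated across individuals, technologies, and modalities. In other words, datasets from different sources are combined into one shared space (the reference). Broadly speaking, this is conceptually similar to a reference assembly for genomic DNA sequences.

Query: a dataset produced by a single transcriptomic experiment.

Transfer learning: a model is trained on the reference dataset, and the information it has learned is then projected onto the query dataset.

Anchors: two cells (one from each dataset) defined by a common set of molecular features; this correspondence is called an anchor. Each such pair of cells is an anchor, and the cross-dataset cell relationships that anchors encode form the basis of all subsequent integration analyses.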
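
The idea of an anchor as a pair of cells that "choose each other" across datasets can be sketched with a toy mutual-nearest-neighbor search in base R. This is only an illustration under simplifying assumptions: the matrices `ref` and `qry` are made-up data, the helper `cross_dist` is hypothetical, and distances are taken directly on two features. Seurat's actual anchor finding is more involved (it works in a joint reduced-dimensional space and also filters and scores anchors), so this should not be read as its implementation.

```r
# Toy illustration of anchors as mutual nearest neighbors between two datasets.
# Made-up data: rows are cells, columns are two shared features.
ref <- rbind(r1 = c(0, 0), r2 = c(5, 5), r3 = c(10, 0))
qry <- rbind(q1 = c(0.5, 0.2), q2 = c(9.6, 0.3), q3 = c(4.0, 4.0), q4 = c(6.5, 6.5))

# Hypothetical helper: Euclidean distance between every cell of `a` (rows)
# and every cell of `b` (columns).
cross_dist <- function(a, b) {
  d <- as.matrix(dist(rbind(a, b)))
  d[seq_len(nrow(a)), nrow(a) + seq_len(nrow(b)), drop = FALSE]
}
d <- cross_dist(ref, qry)

nn_of_ref <- apply(d, 1, which.min)  # nearest query cell for each reference cell
nn_of_qry <- apply(d, 2, which.min)  # nearest reference cell for each query cell

# Keep a pair only if each cell is the other's nearest neighbor.
is_mutual <- nn_of_qry[nn_of_ref] == seq_len(nrow(ref))
anchors <- data.frame(
  reference = rownames(ref)[is_mutual],
  query = colnames(d)[nn_of_ref[is_mutual]],
  stringsAsFactors = FALSE
)
print(anchors)
```

In this toy data the query cell `q4` is closest to `r2`, but `r2` prefers `q3`, so `q4` ends up without an anchor. That asymmetry is the point of requiring the match to be mutual.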

II. Standard Workflow
1. Install the dataset

library(Seurat)
library(SeuratData)
InstallData("panc8")
If the download stalls for a long time at this step, try the following:

Once the download finishes, unpack the archive and copy out the marked file.



Copy it into the library directory of your R installation. On my machine, for example, that is E:\R\R-3.6.1\library\SeuratData\data

2. Data preprocessing

rm(list = ls())
options(stringsAsFactors = F)
library(Seurat)
library(SeuratData)
data("panc8")
pancreas.list <- SplitObject(panc8, split.by = "tech")
pancreas.list <- pancreas.list[c("celseq", "celseq2", "fluidigmc1", "smartseq2")]

First normalize each dataset and identify its variable features individually.

Feature selection uses the variance stabilizing transformation ("vst") method.

for (i in seq_along(pancreas.list)) {
    pancreas.list[[i]] <- NormalizeData(pancreas.list[[i]], verbose = FALSE)
    pancreas.list[[i]] <- FindVariableFeatures(pancreas.list[[i]], selection.method = "vst",
        nfeatures = 2000, verbose = FALSE)
}
3. Integrate the datasets

Integrate the pancreatic islet cell datasets produced by three sequencing technologies.

reference.list <- pancreas.list[c("celseq", "celseq2", "smartseq2")]

Identify the anchors.

The dimensionality used here is 30; the Seurat authors suggest trying values between 10 and 50.

pancreas.anchors <- FindIntegrationAnchors(object.list = reference.list, dims = 1:30)

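Integrate the datasets.

The integrated expression matrix is stored in a new assay ("integrated"), while the original, uncorrected expression matrix remains in the "RNA" assay.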

pancreas.integrated <- IntegrateData(anchorset = pancreas.anchors, dims = 1:30)
4. Visualization

library(ggplot2)
library(cowplot)
DefaultAssay(pancreas.integrated) <- "integrated"
pancreas.integrated <- ScaleData(pancreas.integrated, verbose = FALSE)
pancreas.integrated <- RunPCA(pancreas.integrated, npcs = 30, verbose = FALSE)
pancreas.integrated <- RunUMAP(pancreas.integrated, reduction = "pca", dims = 1:30)
p1 <- DimPlot(pancreas.integrated, reduction = "umap", group.by = "tech")
p2 <- DimPlot(pancreas.integrated, reduction = "umap", group.by = "celltype", label = TRUE,
    repel = TRUE) + NoLegend()
plot_grid(p1, p2)



5. Cell type classification using the assembled reference dataset
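
No code survives for this step in the text, so what follows is only a hedged sketch of how the classification might look with Seurat's label-transfer functions `FindTransferAnchors`, `TransferData`, and `AddMetaData`. It assumes the objects `pancreas.list` and `pancreas.integrated` from steps 2 and 3 already exist in the session, and it assumes that the held-out "fluidigmc1" dataset (normalized in step 2 but left out of `reference.list`) is the intended query. The variable names `pancreas.query`, `transfer.anchors`, and `predictions` are illustrative.

```r
library(Seurat)

# Assumption: the fluidigmc1 cells, which were not part of the reference,
# serve as the query dataset.
pancreas.query <- pancreas.list[["fluidigmc1"]]

# Find anchors between the integrated reference and the query.
transfer.anchors <- FindTransferAnchors(reference = pancreas.integrated,
    query = pancreas.query, dims = 1:30)

# Transfer the reference cell type labels onto the query cells.
predictions <- TransferData(anchorset = transfer.anchors,
    refdata = pancreas.integrated$celltype, dims = 1:30)
pancreas.query <- AddMetaData(pancreas.query, metadata = predictions)

# The query cells carry their own annotation in "celltype", so the
# predicted labels can be compared against it.
pancreas.query$prediction.match <- pancreas.query$predicted.id == pancreas.query$celltype
table(pancreas.query$prediction.match)
```

Because panc8 ships with a `celltype` annotation for every cell, the final table gives a quick sanity check of how often the transferred label agrees with the original one.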


III. SCTransform Workflow
rm(list = ls())
options(stringsAsFactors = FALSE)
library(Seurat)
library(SeuratData)  # provides the panc8 dataset loaded below
library(ggplot2)
options(future.globals.maxSize = 4000 * 1024^2)
data("panc8")
Data preprocessing

pancreas.list <- SplitObject(panc8, split.by = "tech")
pancreas.list <- pancreas.list[c("celseq", "celseq2", "fluidigmc1", "smartseq2")]

Run SCTransform on each object.

for (i in seq_along(pancreas.list)) {
    pancreas.list[[i]] <- SCTransform(pancreas.list[[i]], verbose = FALSE)
}

Next, select the features for downstream integration and run PrepSCTIntegration, which ensures that all the necessary Pearson residuals have been calculated.

pancreas.features <- SelectIntegrationFeatures(object.list = pancreas.list, nfeatures = 3000)
pancreas.list <- PrepSCTIntegration(object.list = pancreas.list, anchor.features = pancreas.features, verbose = FALSE)
Integrate the datasets

Here the normalization method is set to "SCT"; the other commands are the same as in the standard workflow.

pancreas.anchors <- FindIntegrationAnchors(object.list = pancreas.list, normalization.method = "SCT",
    anchor.features = pancreas.features, verbose = FALSE)
pancreas.integrated <- IntegrateData(anchorset = pancreas.anchors, normalization.method = "SCT",
    verbose = FALSE)
Cell grouping and visualization

pancreas.integrated <- RunPCA(pancreas.integrated, verbose = FALSE)
pancreas.integrated <- RunUMAP(pancreas.integrated, dims = 1:30)
plots <- DimPlot(pancreas.integrated, group.by = c("tech", "celltype"), combine = FALSE)
plots <- lapply(X = plots, FUN = function(x) {
    x + theme(legend.position = "top") +
        guides(color = guide_legend(nrow = 3, byrow = TRUE, override.aes = list(size = 3)))
})
CombinePlots(plots)



IV. Validating the Workflow on Another Dataset
1. Install the dataset

InstallData("pbmcsca")
2. Data integration

data("pbmcsca")
pbmc.list <- SplitObject(pbmcsca, split.by = "Method")
for (i in names(pbmc.list)) {
    pbmc.list[[i]] <- SCTransform(pbmc.list[[i]], verbose = FALSE)
}
pbmc.features <- SelectIntegrationFeatures(object.list = pbmc.list, nfeatures = 3000)
pbmc.list <- PrepSCTIntegration(object.list = pbmc.list, anchor.features = pbmc.features)
pbmc.anchors <- FindIntegrationAnchors(object.list = pbmc.list, normalization.method = "SCT",
    anchor.features = pbmc.features)
pbmc.integrated <- IntegrateData(anchorset = pbmc.anchors, normalization.method = "SCT")

pbmc.integrated <- RunPCA(object = pbmc.integrated, verbose = FALSE)
pbmc.integrated <- RunUMAP(object = pbmc.integrated, dims = 1:30)
plots <- DimPlot(pbmc.integrated, group.by = c("Method", "CellType"), combine = FALSE)
plots <- lapply(X = plots, FUN = function(x) {
    x + theme(legend.position = "top") +
        guides(color = guide_legend(nrow = 4, byrow = TRUE, override.aes = list(size = 2.5)))
})
CombinePlots(plots)



————————————————
Original article:
seurat: extracting the expression matrix_Seurat | Integration methods for single-cell transcriptomes
