Integrating stimulated vs. control PBMC datasets to learn cell-type-specific responses

clp

10 June, 2020

Note: switch to the working directory first (folder 5).

Reference

https://satijalab.org/seurat/v3.1/immune_alignment.html
https://satijalab.org/seurat/v3.1/integration.html#sctransform
https://github.com/hbctraining/scRNA-seq/blob/master/lessons/06_SC_SCT_and_integration.md

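This tutorial walks through the alignment of two groups of PBMCs from Kang et al., 2017. In that experiment, PBMCs were split into a stimulated group and a control group, and the stimulated group was treated with interferon beta. The response to interferon caused cell-type-specific changes in gene expression, which makes a joint analysis of all the data difficult: cells cluster both by stimulation condition and by cell type. Here we demonstrate the integration strategy described in Stuart and Butler et al., 2018, which performs an integrated analysis to help identify common cell types and enable comparative analysis. Although this example integrates two datasets (conditions), the methods extend to multiple datasets. For details, see the example workflow that integrates four pancreatic islet datasets.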

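Goals of the integrated analysis

The tutorial below gives an overview of the kinds of comparative analyses on complex cell types that are possible with the Seurat integration procedure. We address three main goals:

Identify cell types that are present in both datasets
Obtain cell-type markers that are conserved in both control and stimulated cells
Compare the datasets to find cell-type-specific responses to stimulation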

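Workflow summary

We will harmonize the Pearson residuals that SCTransform outputs. As shown below, the workflow consists of the following steps:

Create a list of Seurat objects to integrate
Normalize the data, because multiple samples are compared and library size has to be accounted for
Score the cell cycle, so that the samples can be compared more fairly
Run SCTransform on each dataset separately
Run the PrepSCTIntegration function on the object list
Integrate the datasets and carry out a joint analysis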

Download the Kang et al. 2017 Seurat raw data (raw read counts)

Kang et al. 2017
Load the required R packages

library(data.table)
library(ggplot2)
library(Seurat)

options(future.globals.maxSize = 4000 * 1024^2)

pkg <- "ifnb.SeuratData"
if( !is.element(pkg, .packages(all.available = TRUE)) ) {
    install.packages("https://seurat.nygenome.org/src/contrib/ifnb.SeuratData_3.0.0.tar.gz", repos = NULL, type = "source")
}
library(pkg,character.only = TRUE)

#load Kang data
data("ifnb")

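Preprocessing and normalization

Mitochondrial genes are not present in the read count matrix, so QC based on mitochondrial contamination is skipped.
It is advisable to check the cell cycle phase before running the sctransform method. Counts need to be comparable between cells, and each cell has a different total number of UMIs, so we first do a rough normalization by dividing by the total counts per cell and taking the natural log. This approach is not as accurate as the sctransform method that we will ultimately use to identify cell clusters, but it is sufficient for exploring the sources of variation in our data.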
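The rough normalization described above can be sketched in base R. This is a minimal illustration on a made-up count matrix, not Seurat's code; it assumes the behavior of NormalizeData's default LogNormalize method, which divides by each cell's total, multiplies by a scale factor of 10,000, and applies log(1 + x). The matrix values and variable names are hypothetical.

```r
# Toy UMI count matrix: genes in rows, cells in columns (made-up values).
counts <- matrix(c(1,  0, 3,
                   4,  2, 0,
                   5, 18, 7),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("cell1", "cell2", "cell3")))

# Total UMIs per cell (the library size); cell2 was sequenced more deeply.
lib_size <- colSums(counts)

# Divide each column by its library size, rescale, and take log(1 + x).
scale_factor <- 10000
log_norm <- log1p(sweep(counts, 2, lib_size, "/") * scale_factor)

print(lib_size)
print(round(log_norm, 2))
```

After this step every cell has the same total on the rescaled (pre-log) scale, so differences between cells reflect relative expression rather than sequencing depth.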

load('data/cycle.rda')

#split into the original samples
ifnb.list <- SplitObject(ifnb, split.by = "stim")
ifnb.list <- lapply(X = ifnb.list, FUN = function(seu) {
    message("This run will take 5+ min ...")
    seu <- NormalizeData(seu, verbose = TRUE) # log-normalized values are stored in the data slot of the RNA assay
    seu <- CellCycleScoring(seu, g2m.features = g2m_genes, s.features = s_genes) # gene lists expected from data/cycle.rda
    seu <- SCTransform(seu, verbose = FALSE)
    return(seu)
})

Feature Selection

Next, before integrating the data, select the features to use for integration and run PrepSCTIntegration, which ensures that all the required Pearson residuals have been calculated.

sc.features <- SelectIntegrationFeatures(object.list = ifnb.list)

ifnb.list <- PrepSCTIntegration(object.list = ifnb.list,
                                anchor.features = sc.features,
                                verbose=FALSE)

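Perform integration (canonical correlation analysis)

Integration is a powerful method that uses the shared sources of greatest variation to identify shared subpopulations across conditions or datasets [Stuart and Butler et al. (2018)]. The goal of integration is to ensure that the cell types of one condition/dataset align with the same cell types of the other conditions/datasets (e.g. control macrophages align with stimulated macrophages).

Specifically, this integration method expects "correspondences" or "shared" biological states among at least a subset of single cells across the groups. The steps of the integration analysis are outlined in the figure below: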

[Figure 1: steps of the integration analysis. Stuart T and Butler A, et al. Comprehensive integration of single cell data, bioRxiv 2018]

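Perform canonical correlation analysis (CCA):

CCA identifies shared sources of variation between the conditions/groups. It is a form of PCA in that it identifies the greatest sources of variation in the data, but only those that are shared or conserved across the conditions/groups (using the most variable genes from each sample: 3,000 in the referenced lesson, whereas the SelectIntegrationFeatures call in this tutorial does not set nfeatures and therefore uses the function's default).
This step roughly aligns the cells using the greatest shared sources of variation.

Note: the shared highly variable genes are used because they are the most likely to represent the genes that distinguish the different cell types.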
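To make the idea of "shared sources of variation" concrete, here is a minimal base-R sketch on simulated data. It assumes a simplified form of CCA: standardize each cell across the shared genes, then take the singular value decomposition of the cells-by-cells cross-product of the two datasets. It is an illustration of the concept only, not Seurat's implementation, and the simulated matrices and all variable names are hypothetical.

```r
set.seed(42)
n_genes <- 100

# One gene program shared by both datasets (simulated).
program <- rnorm(n_genes)

# Simulate a genes x cells matrix: shared program plus independent noise.
make_dataset <- function(n_cells) {
  loadings <- runif(n_cells, min = 0.5, max = 1.5)
  noise <- matrix(rnorm(n_genes * n_cells, sd = 0.5), nrow = n_genes)
  outer(program, loadings) + noise
}
X <- make_dataset(30)  # stands in for control cells
Y <- make_dataset(40)  # stands in for stimulated cells

# Standardize each cell (column) across the shared genes.
Xs <- scale(X)
Ys <- scale(Y)

# Cross-product of the two datasets: 30 control cells x 40 stimulated cells.
cross <- crossprod(Xs, Ys)

# The leading singular vectors give one coordinate per cell in each dataset
# along the strongest source of variation that the two datasets share.
sv <- svd(cross)
cc1_control <- sv$u[, 1]
cc1_stim <- sv$v[, 1]

print(round(sv$d[1:3], 1))
```

Because both simulated datasets carry the same gene program, the first singular value dominates the rest; variation that is present in only one dataset contributes little to the cross-product.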

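Identify anchors, or mutual nearest neighbors (MNNs), across datasets (incorrect anchors are sometimes identified): MNNs can be thought of as "best buddies". For each cell in one condition:
- The cell's closest neighbor in the other condition is identified based on gene expression values; this is its "best buddy".
- The reciprocal analysis is performed, and if the two cells are "best buddies" in both directions, they are marked as anchors that "anchor" the two datasets together.
- According to the authors, the difference in expression values between the cells of an MNN pair provides an estimate of the batch effect, which is made more precise by averaging across many such pairs. A correction vector is obtained and applied to the expression values to perform batch correction [Stuart and Butler et al. (2018)].
Filter anchors to remove incorrect anchors: assess the similarity between anchor pairs by the overlap in their local neighborhoods (incorrect anchors will have low scores). In other words, do the adjacent cells have "best buddies" that are adjacent to each other?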
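The "best buddies" idea can be sketched in a few lines of base R. This is a deliberately simplified illustration on made-up 2-D coordinates: it uses a single nearest neighbor, skips anchor scoring and filtering, and applies one global correction vector, whereas Seurat searches in a reduced space and, as the integration step below describes, weights corrections per cell. All data and variable names are hypothetical.

```r
# Toy 2-D embeddings, one row per cell (made-up coordinates).
A <- rbind(c(0, 0), c(5, 0), c(0, 5), c(5, 5), c(10, 0))
batch_shift <- c(1, 0.5)
B <- rbind(sweep(A, 2, batch_shift, "+"),  # same cells, shifted by a batch effect
           c(20, 20))                      # a cell with no counterpart in A

# Euclidean distance from every cell in A (rows) to every cell in B (columns).
n_a <- nrow(A)
n_b <- nrow(B)
dist_ab <- as.matrix(dist(rbind(A, B)))[seq_len(n_a), n_a + seq_len(n_b)]

nn_a_to_b <- unname(apply(dist_ab, 1, which.min))  # best buddy in B for each A cell
nn_b_to_a <- unname(apply(dist_ab, 2, which.min))  # best buddy in A for each B cell

# Keep only the pairs that choose each other in both directions.
is_mutual <- nn_b_to_a[nn_a_to_b] == seq_len(n_a)
anchors <- data.frame(cell_a = which(is_mutual),
                      cell_b = nn_a_to_b[is_mutual])

# Average the B - A difference over the anchor pairs to estimate the batch effect,
# then subtract that vector from every cell in B.
correction <- colMeans(B[anchors$cell_b, , drop = FALSE] -
                       A[anchors$cell_a, , drop = FALSE])
B_corrected <- sweep(B, 2, correction, "-")

print(anchors)
print(correction)
```

The sixth cell in B has a nearest neighbor in A, but that neighbor prefers a different cell in B, so the pair is not mutual and never becomes an anchor. This is why a cell type present in only one dataset does not get forced onto the other.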
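Integrate the conditions/datasets:
- Use the anchors and their scores to transform the cell expression values, which allows the datasets (different samples, datasets, modalities) to be integrated.
- Note: the transformation of each cell uses a weighted average of the two cells of each anchor across the anchors of the datasets. The weights are determined by the cell similarity score (the distance between the cell and its k nearest anchors) and by the anchor scores, so cells in the same neighborhood should receive similar correction values.
- If a cell type is present in one dataset but not in the other, those cells will still appear as a separate, sample-specific cluster.

Now, using our SCTransform objects as input, let's perform the integration across conditions.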

CCA-based integration is slow; expect this step to take 5+ min.

immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list,
                                         normalization.method = "SCT",
                                         anchor.features = sc.features,
                                         verbose=FALSE)

immune.combined <- IntegrateData(anchorset = immune.anchors,
                                 normalization.method = "SCT",
                                 verbose=FALSE)
#> Warning: Adding a command log without an assay associated with it

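Visualization of the integrated data

Perform the downstream analysis (i.e. visualization, clustering) on the integrated dataset. After integration, cells from the two conditions (control and stimulated) are aligned, so they group by cell type rather than by condition. The cluster annotations shown come from the data we downloaded.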

# Remove the objects that are no longer needed to free up memory
rm(ifnb)
rm(ifnb.list)
rm(immune.anchors)

#Make sure that your default assay is 'integrated'
DefaultAssay(immune.combined) <- "integrated"

immune.combined <- RunPCA(immune.combined, verbose = FALSE)
immune.combined <- RunUMAP(immune.combined, dims = 1:20)
#> Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
#> To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
#> This message will be shown once per session

# immune.combined <- FindNeighbors(immune.combined, reduction = "pca", dims = 1:20)
# immune.combined <- FindClusters(immune.combined, resolution = 0.5)

plots <- DimPlot(immune.combined, group.by = c("stim","seurat_annotations"), combine = FALSE)

plots <- lapply(X = plots, FUN = function(x) {
  # move the legend to the top and lay it out in 4 rows with larger keys
  p <- x + theme(legend.position = "top")
  p + guides(color = guide_legend(nrow = 4, byrow = TRUE, override.aes = list(size = 2.5)))
})

CombinePlots(plots)
#> Warning: CombinePlots is being deprecated. Plots should now be combined
#> using the patchwork system.

[Figure: UMAP plots of the integrated data, grouped by stim and by seurat_annotations]

To visualize the two conditions side by side, we can use the split.by argument to show each condition separately, colored by cell-type annotation.


DimPlot(immune.combined, reduction = "umap", split.by = "stim", group.by = "seurat_annotations", label = TRUE) + NoLegend()
#> Warning: Using `as.character()` on a quosure is deprecated as of rlang 0.3.0.
#> Please use `as_label()` or `as_name()` instead.
#> This warning is displayed once per session.

[Figure: UMAP of the integrated data split by stim, labeled by seurat_annotations]

Save the R object for later use

wkd <- "out"
if (!dir.exists(wkd)) { dir.create(wkd) }
save(immune.combined, file = file.path(wkd,'01_immune_combined.rd'), compress = TRUE)
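For the next session, the saved object has to be read back with load(). The sketch below shows the round trip on a small placeholder list rather than the real Seurat object; the temporary folder, the file name, and the object immune_demo are hypothetical and chosen only so the example runs anywhere.

```r
# Stand-in for the integrated object; any R object is saved the same way.
immune_demo <- list(note = "placeholder for immune.combined")

# Hypothetical output folder inside the session's temporary directory.
out_dir <- file.path(tempdir(), "out")
if (!dir.exists(out_dir)) dir.create(out_dir)
out_file <- file.path(out_dir, "01_immune_demo.rd")

save(immune_demo, file = out_file, compress = TRUE)

# Later: load() recreates the object under the name it was saved with
# and returns that name as a character vector.
rm(immune_demo)
restored <- load(out_file)
print(restored)
```

Because load() restores objects under their original names, it can silently overwrite an existing variable of the same name in the current session.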

Key points of this section

Important R functions and packages: lapply and ggplot2
CCA

