Seurat教程上新||Mixscape : 用多模態(tài)單細(xì)胞數(shù)據(jù)篩選免疫檢查點(diǎn)

pd - l1等抑制性免疫檢查點(diǎn)分子的表達(dá)在人類癌癥中較為常見,可導(dǎo)致T細(xì)胞介導(dǎo)的免疫應(yīng)答的抑制。在這里,我們應(yīng)用ECCITE-seq技術(shù)來探索調(diào)控pd - l1表達(dá)的分子網(wǎng)絡(luò)。ECCITE-seq技術(shù)將混合的CRISPR篩查與單細(xì)胞mRNA和表面蛋白測量相結(jié)合。我們還開發(fā)了一個(gè)計(jì)算框架,mixscape,它通過識別和去除混雜的變異源,大大提高了單細(xì)胞擾動(dòng)屏幕的信噪比。利用這些工具,我們識別和驗(yàn)證PD-L1的調(diào)控因子,并利用我們的多模態(tài)數(shù)據(jù)識別轉(zhuǎn)錄和轉(zhuǎn)錄后的調(diào)控模式。特別是,我們發(fā)現(xiàn)kelch樣蛋白keap1和轉(zhuǎn)錄激活因子NRF2介導(dǎo)了IFN刺激后pd - l1的上調(diào)。我們的結(jié)果為免疫檢查點(diǎn)的調(diào)節(jié)確定了一個(gè)新的機(jī)制,并為分析多模態(tài)單細(xì)胞perturbation screens提供了一個(gè)強(qiáng)大的分析框架 。

免疫檢查點(diǎn)(IC)分子調(diào)節(jié)免疫反應(yīng)中激活和抑制之間的關(guān)鍵平衡。在正常生理?xiàng)l件下,抑制IC分子對于維持自身耐受和防止自身免疫是必不可少的,但在人類癌癥中,抑制IC分子的表達(dá)常常被錯(cuò)誤調(diào)控,以逃避免疫監(jiān)控。例如,抑制性IC PD-L1與T細(xì)胞上的pd -1受體相互作用,抑制T細(xì)胞活化,在許多癌癥中過表達(dá),并作為患者生存和免疫治療反應(yīng)的預(yù)后因素。因此,人們不僅對識別阻斷這些相互作用的治療途徑感興趣,而且對理解癌細(xì)胞上調(diào)PD-L1等ICs的分子網(wǎng)絡(luò)也很感興趣。

之前的研究已經(jīng)初步建立了一套能夠影響PD-L1 mRNA和表面蛋白水平的分子調(diào)控因子。大量研究發(fā)現(xiàn),干擾素(IFN)的暴露可迅速誘導(dǎo)pd - l1在體外和腫瘤微環(huán)境中的表達(dá)[7-10]。因此,IFN反應(yīng)的核心成分是pd - l1表達(dá)的上游調(diào)控因子,包括轉(zhuǎn)錄因子IRF1(直接與pd - l1啟動(dòng)子結(jié)合)、JAK-STATsignal轉(zhuǎn)導(dǎo)通路和IFN受體本身。此外,還鑒定了其他IFN的阻斷信號、pd - l1啟動(dòng)子染色質(zhì)狀態(tài)或?qū)v介導(dǎo)的應(yīng)激的調(diào)節(jié)因子。

此外,最近人們特別關(guān)注pd - l1穩(wěn)定性和降解的轉(zhuǎn)錄后調(diào)節(jié)因子的特性。例如,Cullin 3-SPOPE3-ligase復(fù)合物可以細(xì)胞周期依賴的方式直接泛素化PD-L1,導(dǎo)致其降解。此外,全基因組CRISPR篩選發(fā)現(xiàn)了兩個(gè)之前未鑒定的調(diào)節(jié)因子cmtm6和CMTM4,它們通過防止溶酶體介導(dǎo)的降解來穩(wěn)定PD-L1表面表達(dá)。在這些案例中,pd - l1調(diào)節(jié)劑的干擾被證明可以調(diào)節(jié)抗腫瘤T細(xì)胞的活性,這就突出了理解抑制IC分子調(diào)控的治療興趣。

我們最近引入了擴(kuò)展的crispr兼容的CITE-seq (ECCITE-seq),它同時(shí)測量轉(zhuǎn)錄組、表面蛋白水平和在單細(xì)胞分辨率上的擾動(dòng),作為一種識別和表征分子調(diào)節(jié)劑[18]的新方法。ECCITE-seq建立在混合CRISPR屏幕的實(shí)驗(yàn)設(shè)計(jì)上,在單一實(shí)驗(yàn)中,多個(gè)擾動(dòng)被復(fù)用在一起,但是有明顯的優(yōu)勢。首先,單細(xì)胞測序讀數(shù)(即 Perturb-seq, CROP-seq, CRISP-seq)能夠測量詳細(xì)的分子表型,而不是單個(gè)表型(單個(gè)蛋白的表達(dá)或細(xì)胞活力)。其次,通過同時(shí)耦合測量mRNA、表面蛋白和直接檢測同一細(xì)胞內(nèi)的導(dǎo)聯(lián),ECCITE-seq允許對每個(gè)擾動(dòng)進(jìn)行多模態(tài)表征。因此,我們認(rèn)為ECCITE-seq將使我們能夠同時(shí)測試和識別IC分子的新調(diào)控因子,特別是區(qū)分轉(zhuǎn)錄模式和轉(zhuǎn)錄后模式。此外,豐富和高維的讀出容易促進(jìn)網(wǎng)絡(luò)和基于路徑的分析,這可以超越鑒定單個(gè)基因和產(chǎn)生對其調(diào)控機(jī)制的洞察。

在這里,我們應(yīng)用ECCITE-seq來同時(shí)擾亂和表征假定的調(diào)節(jié)抑制分子對IFN不刺激的反應(yīng)。當(dāng)分析我們的單細(xì)胞數(shù)據(jù)時(shí),我們確定了混雜的異質(zhì)性來源,包括那些接受了靶向引導(dǎo)RNA但沒有表現(xiàn)出干擾效應(yīng)的細(xì)胞,這些細(xì)胞在下游分析中引入了大量的噪聲。我們開發(fā)并驗(yàn)證了計(jì)算方法來控制這些因素,并大幅增加了我們的統(tǒng)計(jì)能力來表征多模態(tài)擾動(dòng)。

利用這些工具,我們確定了一組基因,其擾動(dòng)會影響pd - l1轉(zhuǎn)錄水平、表面蛋白水平,或兩者都影響,并確定了每個(gè)調(diào)控器所使用的潛在分子通路。特別是,我們發(fā)現(xiàn)在人類癌癥中經(jīng)常突變的kelch樣蛋白keap1和轉(zhuǎn)錄激活因子NRF2可以改變pd - l1水平。我們將這些發(fā)現(xiàn)與CUL3的一個(gè)新的調(diào)節(jié)機(jī)制聯(lián)系起來,并表明該基因通過穩(wěn)定NRF2通路作為PD-L1mRNA的間接轉(zhuǎn)錄激活劑。綜上所述,我們的發(fā)現(xiàn)確定了免疫檢查點(diǎn)調(diào)節(jié)的一個(gè)重要途徑,并提出了一個(gè)強(qiáng)大的和廣泛適用的分析框架來分析ECCITE-seq數(shù)據(jù)。

在ECCITE-seq實(shí)驗(yàn)中,我們運(yùn)行了10x Genomics 5’(Chromium Single Cell Immune Profiling Solution v1.0, #1000014, #1000020, #1000151)的8個(gè)lanes ,目標(biāo)是每個(gè)lanes 捕獲10000個(gè)細(xì)胞。在運(yùn)行之前,細(xì)胞存活率被確定和細(xì)胞數(shù)量估計(jì)之前描述。為了跟蹤每個(gè)生物學(xué)重復(fù)身份,樣本按照細(xì)胞hashing protocol 進(jìn)行。

  • mRNA
  • hashtags (Hashtag-derived oligos, HTOs)
  • protein (Antibody-derived oligos, ADTs)
  • gRNA (Guide-derived oligos, GDOs)

文庫均由10x genomics and ECCITE-seq protocols方法構(gòu)建。在NovaSeq運(yùn)行時(shí),所有庫都在兩個(gè)lanes上進(jìn)行測序。利用Cellranger (V2.1.1)將來自mRNA文庫的測序片段映射到hg19參考基因組。為了生成HTO、ADT和GDO庫的計(jì)數(shù)矩陣,使用了CITE-seq-count package (https://github.com/Hoohm/CITE-seq-Count)。然后使用計(jì)數(shù)矩陣作為Seurat R包的輸入來執(zhí)行所有的下游分析。

下面我們跟著官網(wǎng)教程來看看是如何達(dá)到目的的。

首先,我們需要安裝新的Seurat,下載示例數(shù)據(jù):

remotes::install_github(“satijalab/seurat”, ref = “mixscape”)

# Load packages.
library(Seurat)
library(SeuratData)
library(ggplot2)
library(patchwork)
library(scales)
library(dplyr)

# Download dataset using SeuratData.
InstallData(ds = "thp1.eccite")

# Setup custom theme for plotting.
custom_theme <- theme(plot.title = element_text(size = 16, hjust = 0.5), legend.key.size = unit(0.7, 
    "cm"), legend.text = element_text(size = 14))

老規(guī)矩,我們來看一下數(shù)據(jù)格式。

# Load object.
eccite <- LoadData(ds = "thp1.eccite")
eccite
An object of class Seurat 
18776 features across 20729 samples within 4 assays 
Active assay: RNA (18649 features, 0 variable features)
 3 other assays present: ADT, HTO, GDO


我們看到有4個(gè)assay: RNA , ADT, HTO, GDO,一個(gè)Seurat玩轉(zhuǎn)4套數(shù)據(jù),在才叫多模態(tài)啊。我們分別看看這四套數(shù)據(jù)的內(nèi)容:

eccite@assays$RNA@data[1:4,1:4]
4 x 4 sparse Matrix of class "dgCMatrix"
              l1_AAACCTGAGCCAGAAC l1_AAACCTGAGTGGACGT l1_AAACCTGCATGAGCGA l1_AAACCTGTCTTGTCAT
AL627309.1                      .                   .                   .                   .
AP006222.2                      .                   .                   .                   .
RP4-669L17.10                   .                   .                   .                   .
RP11-206L10.3                   .                   .                   .                   .
> eccite@assays$ADT@data[1:4,1:4]
4 x 4 sparse Matrix of class "dgCMatrix"
      l1_AAACCTGAGCCAGAAC l1_AAACCTGAGTGGACGT l1_AAACCTGCATGAGCGA l1_AAACCTGTCTTGTCAT
CD86                  118                 243                  99                 110
PDL1                  522                 139                 109                 334
PDL2                   94                 104                  56                  48
CD366                  67                  59                  80                  47
> eccite@assays$HTO@data[1:4,1:4]
4 x 4 sparse Matrix of class "dgCMatrix"
          l1_AAACCTGAGCCAGAAC l1_AAACCTGAGTGGACGT l1_AAACCTGCATGAGCGA l1_AAACCTGTCTTGTCAT
rep1-tx                    82                  18                  55                  14
rep1-ctrl                   3                   1                   4                   .
rep2-tx                    13                  11                   6                   6
rep2-ctrl                   1                   4                   .                   .
> eccite@assays$GDO@data[1:4,1:4]
4 x 4 sparse Matrix of class "dgCMatrix"
       l1_AAACCTGAGCCAGAAC l1_AAACCTGAGTGGACGT l1_AAACCTGCATGAGCGA l1_AAACCTGTCTTGTCAT
eGFPg1                   1                   1                   1                   1
CUL3g1                   1                   1                   1                   1
CUL3g2                   1                   1                   1                   1
CUL3g3                   1                   1                   1                   1

metadata的內(nèi)容:

 head(eccite@meta.data)
                    orig.ident nCount_RNA nFeature_RNA nCount_HTO nFeature_HTO nCount_GDO nFeature_GDO nCount_ADT nFeature_ADT percent.mito MULTI_ID MULTI_classification
l1_AAACCTGAGCCAGAAC      Lane1      17207         3942         99            4        576          111        801            4     2.295577  rep1-tx              rep1-tx
l1_AAACCTGAGTGGACGT      Lane1       9506         2948         35            5        190          111        545            4     4.512939  rep1-tx              rep1-tx
l1_AAACCTGCATGAGCGA      Lane1      15256         4258         66            4        212          111        344            4     4.116413  rep1-tx              rep1-tx
l1_AAACCTGTCTTGTCAT      Lane1       5135         1780         22            3        243          111        539            4     5.491723  rep1-tx              rep1-tx
l1_AAACGGGAGAACAACT      Lane1       9673         2671         99            5        198          111       1053            4     3.359868  rep1-tx              rep1-tx
l1_AAACGGGAGACAGAGA      Lane1      14941         3918         97            5        124          111        487            4     3.379961  rep1-tx              rep1-tx
                    MULTI_classification.global HTO_classification guide_ID guide_ID.global  gene con      NT    crispr replicate     S.Score   G2M.Score Phase
l1_AAACCTGAGCCAGAAC                     rep1-tx            rep1-tx  STAT2g2         Singlet STAT2  tx STAT2g2 Perturbed      rep1 -0.25271565 -0.77130934    G1
l1_AAACCTGAGTGGACGT                     rep1-tx            rep1-tx   CAV1g4         Singlet  CAV1  tx  CAV1g4 Perturbed      rep1 -0.12380192 -0.33260303    G1
l1_AAACCTGCATGAGCGA                     rep1-tx            rep1-tx  STAT1g2         Singlet STAT1  tx STAT1g2 Perturbed      rep1 -0.15463259 -0.69441836    G1
l1_AAACCTGTCTTGTCAT                     rep1-tx            rep1-tx   CD86g1         Singlet  CD86  tx  CD86g1 Perturbed      rep1 -0.06126198 -0.03781951    G1
l1_AAACGGGAGAACAACT                     rep1-tx            rep1-tx   IRF7g2         Singlet  IRF7  tx  IRF7g2 Perturbed      rep1 -0.13218850 -0.35315597    G1
l1_AAACGGGAGACAGAGA                     rep1-tx            rep1-tx     NTg1         Singlet    NT  tx      NT        NT      rep1 -0.20998403 -0.58247261    G1

禁不住用我們的桑吉圖看看分屬關(guān)系:

library(ggforce)  
eccite@meta.data %>%
  gather_set_data(17:21) %>%
  ggplot(aes(x, id = id, split = y, value = 1))  +
  geom_parallel_sets(aes(fill = NT), show.legend = FALSE, alpha = 0.3) +
  geom_parallel_sets_axes(axis.width = 0.1, color = "lightgrey", fill = "white") +
  geom_parallel_sets_labels(angle = 0) +
  theme_no_axes()

要理解這里的每一列是什么就要學(xué)習(xí)CRISPR的基本知識了。重要的一點(diǎn)是理解perturbation 的概念。

# Normalize protein.
eccite <- NormalizeData(object = eccite, assay = "ADT", normalization.method = "CLR", margin = 2)

為了獲得全局的觀點(diǎn),我們先對RNA的數(shù)據(jù)執(zhí)行Seurat的一般流程:基于rna的聚類是由混雜的變異源驅(qū)動(dòng)的。

# Prepare RNA assay for dimensionality reduction: Normalize data, find variable features and
# scale data.
DefaultAssay(object = eccite) <- "RNA"
eccite <- NormalizeData(object = eccite) %>% FindVariableFeatures() %>% ScaleData()

# Run Principle Component Analysis (PCA) to reduce the dimensionality of the data.
eccite <- RunPCA(object = eccite)

# Run Uniform Manifold Approximation and Projection (UMAP) to visualize clustering in 2-D.
eccite <- RunUMAP(object = eccite, dims = 1:40)

# Generate plots to check if clustering is driven by biological replicate ID, cell cycle phase
# or target gene class.
p1 <- DimPlot(eccite, group.by = "replicate", label = F, pt.size = 0.2, reduction = "umap") + scale_color_brewer(palette = "Dark2") + 
    ggtitle("Biological Replicate") + xlab("UMAP 1") + ylab("UMAP 2") + custom_theme

p2 <- DimPlot(eccite, group.by = "Phase", label = F, pt.size = 0.2, reduction = "umap") + ggtitle("Cell Cycle Phase") + 
    ylab("UMAP 2") + xlab("UMAP 1") + custom_theme

p3 <- DimPlot(eccite, group.by = "crispr", pt.size = 0.2, reduction = "umap", split.by = "crispr", 
    ncol = 1, cols = c("grey39", "goldenrod3")) + ggtitle("Perturbation Status") + ylab("UMAP 2") + 
    xlab("UMAP 1") + custom_theme

# Visualize plots.
((p1/p2 + plot_layout(guides = "auto")) | p3)

接下來,計(jì)算局部擾動(dòng)特征減輕混雜效應(yīng)。

 Calculate local perturbation signature.
eccite <- CalcPerturbSig(object = eccite, assay = "RNA", slot = "data", gd.class = "gene", nt.cell.class = "NT", 
    reduction = "pca", ndims = 40, num.neighbors = 20, split.by = "replicate", new.assay.name = "PRTB")

# Prepare PRTB assay for dimensionality reduction: Normalize data, find variable features and
# center data.
DefaultAssay(object = eccite) <- "PRTB"

# Use variable features from RNA assay.
VariableFeatures(object = eccite) <- VariableFeatures(object = eccite[["RNA"]])
eccite <- ScaleData(eccite, do.scale = F, do.center = T)

# Run PCA to reduce the dimensionality of the data.
eccite <- RunPCA(object = eccite, reduction.key = "prtbpca", reduction.name = "prtbpca")

# Run UMAP to visualize clustering in 2-D.
eccite <- RunUMAP(object = eccite, dims = 1:40, reduction = "prtbpca", reduction.key = "prtbumap", 
    reduction.name = "prtbumap")

# Generate plots to check if clustering is driven by biological replicate ID, cell cycle phase
# or target gene class.
q1 <- DimPlot(eccite, group.by = "replicate", reduction = "prtbumap", pt.size = 0.2) + scale_color_brewer(palette = "Dark2") + 
    ggtitle("Biological Replicate") + ylab("UMAP 2") + xlab("UMAP 1") + custom_theme

q2 <- DimPlot(eccite, group.by = "Phase", reduction = "prtbumap", pt.size = 0.2) + ggtitle("Cell Cycle Phase") + 
    ylab("UMAP 2") + xlab("UMAP 1") + custom_theme

q3 <- DimPlot(eccite, group.by = "crispr", reduction = "prtbumap", split.by = "crispr", ncol = 1, 
    pt.size = 0.2, cols = c("grey39", "goldenrod3")) + ggtitle("Perturbation Status") + ylab("UMAP 2") + 
    xlab("UMAP 1") + custom_theme

# Visualize plots.
(q1/q2 + plot_layout(guides = "auto") | q3)

Mixscape識別沒有檢測擾動(dòng)的細(xì)胞。


#install.packages('mixtools')
# Run mixscape to classify cells based on their perturbation status.
eccite <- RunMixscape(object = eccite, assay = "PRTB", slot = "scale.data", labels = "gene", nt.class.name = "NT", 
    min.de.genes = 5, iter.num = 10, de.assay = "RNA", verbose = F)

# Show that only IFNG pathway KO cells have a reduction in PD-L1 protein expression.
Idents(object = eccite) <- "gene"

VlnPlot(object = eccite, features = "adt_PDL1", idents = c("NT", "JAK2", "STAT1", "IFNGR1", "IFNGR2", 
    "IRF1"), group.by = "gene", pt.size = 0.2, sort = T, split.by = "mixscape_class.global", cols = c("coral3", 
    "grey79", "grey39")) + ggtitle("PD-L1 protein") + theme(axis.text.x = element_text(angle = 0, 
    hjust = 0.5))

KO,NP,NT 是啥意思?

  • If the cell received a NT gRNA, it retains its assignment as non-targeting (NT)
  • If the cell received a targeting gRNA, and is classified in step 8 as NT, it is assigned a non-perturbed (NP) label
  • If the cell received a targeting gRNA, and is classified in step 8 as KO, it receives a perturbed/knock-out (KO) label

用線性判別分析(LDA)可視化擾動(dòng)響應(yīng)。

# Remove non-perturbed cells and run LDA to reduce the dimensionality of the data.
Idents(eccite) <- "mixscape_class.global"
sub <- subset(eccite, idents = c("KO", "NT"))

sub <- MixscapeLDA(object = sub, assay = "RNA", pc.assay = "PRTB", labels = "gene", nt.label = "NT", 
    npcs = 10, logfc.threshold = 0.25, verbose = F)

# Use LDA results to run UMAP and visualize cells on 2-D.
sub <- RunUMAP(sub, dims = 1:11, reduction = "lda", reduction.key = "ldaumap", reduction.name = "ldaumap")

# Visualize UMAP clustering results.
Idents(sub) <- "mixscape_class"
sub$mixscape_class <- as.factor(sub$mixscape_class)
p <- DimPlot(sub, reduction = "ldaumap", label = T, repel = T, label.size = 5)

col = setNames(object = hue_pal()(12), nm = levels(sub$mixscape_class))
names(col) <- c(names(col)[1:7], "NT", names(col)[9:12])
col[8] <- "grey39"

p + scale_color_manual(values = col, drop = FALSE) + ylab("UMAP 2") + xlab("UMAP 1") + custom_theme

代碼和降維圖均來自mixscape官網(wǎng),為什么沒有自己畫,因?yàn)楫嫷揭话惆l(fā)現(xiàn)自己的PC計(jì)算資源不夠用了。對原理感興趣看一看文章:Characterizing the molecular regulation of inhibitory immune checkpoints with multi-modal single-cell screens.

在分析的過程中注意Seurat數(shù)據(jù)的assay之間的切換,這是四套數(shù)據(jù)了。大部分函數(shù)是我們熟悉的Seurat的函數(shù),幾個(gè)關(guān)鍵的函數(shù)需要我們親自查看help文檔,如RunMixscape ``CalcPerturbSig。當(dāng)然,最為關(guān)鍵的還是我們對ECCITE-seq和CRISPR技術(shù)的原理和細(xì)節(jié)。

Mixscape 的出現(xiàn)再次驗(yàn)證了:單細(xì)胞只是概念,我們可以基于此開發(fā)相應(yīng)的技術(shù)。



mixscape
基因篩查進(jìn)入單細(xì)胞時(shí)代
CRISPR-Cas9基因編輯技術(shù)簡介

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請結(jié)合常識與多方信息審慎甄別。
平臺聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡書系信息發(fā)布平臺,僅提供信息存儲服務(wù)。

友情鏈接更多精彩內(nèi)容