文章接受，終于有空再發(fā)點(diǎn)筆記了??

SCENIC_Tutorial

SCENIC_Tutorial

http://htmlpreview.github.io/?https://github.com/aertslab/SCENIC/blob/master/inst/doc/SCENIC_Running.html

image-20221106111655645.png

Setup

# http://htmlpreview.github.io/?https://github.com/aertslab/SCENIC/blob/master/inst/doc/SCENIC_Setup.html
## Required
BiocManager::install(c("RcisTarget"))
BiocManager::install(c("GENIE3")) # Optional. Can be replaced by GRNBoost
## Optional (but highly recommended):
# To score the network on cells (i.e. run AUCell):
BiocManager::install(c("zoo", "mixtools", "rbokeh"), update=F)
# For various visualizations and perform t-SNEs:
BiocManager::install(c("DT", "NMF", "ComplexHeatmap", "R2HTML", "Rtsne"), update=F)
BiocManager::install("doRNG")
devtools::install_github("aertslab/SCENIC") 
devtools::install_github("aertslab/SCopeLoomR")
devtools::install_github("aertslab/AUCell")

#if (!dir.exists("SCENIC_MouseBrain/")) dir.create("SCENIC_MouseBrain/")
if (!dir.exists("int/")) dir.create("int/")
if (!dir.exists("cisTarget_databases/")) dir.create("cisTarget_databases/")

下載相應(yīng)轉(zhuǎn)錄因子數(shù)據(jù)庫，本文用的數(shù)據(jù)來自小鼠，所以下載小鼠的數(shù)據(jù)庫

由于直接從R這里下載的話會(huì)下載不完整，所以這里使用 zsync_curl 下載數(shù)據(jù)庫。

ubuntu下zsync_curl安裝可以參考: https://github.com/AppImage/zsync-curl
相應(yīng)數(shù)據(jù)庫的下載方式：https://resources.aertslab.org/cistarget/help.html

以下為shell命令行

#!/usr/bin/bash
# Specify database name:
# for human 
feather_database_url_hg=("https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-500bp-upstream-7species.mc9nr.feather" 
"https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-7species.mc9nr.feather")
# for mouse
feather_database_url_mm=('https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.feather' 'https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-tss-centered-10kb-7species.mc9nr.feather')
# for fly
feather_database_url_dm='https://resources.aertslab.org/cistarget/databases/drosophila_melanogaster/dm6/flybase_r6.02/mc8nr/gene_based/dm6-5kb-upstream-full-tx-11species.mc8nr.feather'
# download
for url in ${feather_database_url_mm[*]}; do
    feather_database="${url##*/}"
    echo "$feather_database"
    zsync_curl "${url}.zsync"
done

下載后，將數(shù)據(jù)庫置于當(dāng)前工作目錄的 cisTarget_databases/

其他的db和相關(guān)注釋文件可從該網(wǎng)址下載https://resources.aertslab.org/cistarget/

Load Packages

library(SCopeLoomR)
library(SCENIC)
library(tidyverse)
library(AUCell)
library(foreach)

Input

Expression matrix

這里使用官方提供的小鼠大腦的單細(xì)胞表達(dá)矩陣（862 genes * 200 cells）進(jìn)行演示.

loomPath <- system.file(package="SCENIC", "examples/mouseBrain_toy.loom")
loom <- open_loom(loomPath)
exprMat <- get_dgem(loom)
cellInfo <- get_cell_annotation(loom)
close_loom(loom)

dim(exprMat)

## [1] 862 200

Cell info/phenodata

head(cellInfo)

##                        CellType nGene nUMI
## 1772066100_D04     interneurons   170  509
## 1772063062_G01 oligodendrocytes   152  443
## 1772060224_F07        microglia   218  737
## 1772071035_G09    pyramidal CA1   265 1068
## 1772067066_E12 oligodendrocytes    81  273
## 1772066100_B01    pyramidal CA1   108  191

Numbers of different celltypes

table(cellInfo$CellType)

## 
## astrocytes_ependymal         interneurons            microglia 
##                   29                   43                   14 
##     oligodendrocytes        pyramidal CA1 
##                   55                   59

save cell information

saveRDS(cellInfo, file="int/cellInfo.Rds")

設(shè)置每個(gè)celltype的顏色

# Color to assign to the variables (same format as for NMF::aheatmap)
colVars <- list(CellType=c("microglia"="forestgreen", 
                           "endothelial-mural"="darkorange", 
                           "astrocytes_ependymal"="magenta4", 
                           "oligodendrocytes"="hotpink", 
                           "interneurons"="red3", 
                           "pyramidal CA1"="skyblue", 
                           "pyramidal SS"="darkblue"))
colVars$CellType <- colVars$CellType[intersect(names(colVars$CellType), cellInfo$CellType)]
saveRDS(colVars, file="int/colVars.Rds")
plot.new(); legend(0,1, fill=colVars$CellType, legend=names(colVars$CellType))

image-20221106111730508.png

Initialize SCENIC settings

在開始分析前，我們先初始化SCENIC分析的一些參數(shù)

org <- "mgi" # or hgnc, or dmel
dbDir <- "cisTarget_databases" # RcisTarget databases location
myDatasetTitle <- "SCENIC example on Mouse brain" # choose a name for your analysis
data(defaultDbNames)
dbs <- defaultDbNames[[org]]
scenicOptions <- initializeScenic(org=org, 
                                  dbDir=dbDir, 
                                  dbs=dbs['10kb'], 
                                  datasetTitle=myDatasetTitle, 
                                  nCores=6) 

# Save to use at a later time...
saveRDS(scenicOptions, file="int/scenicOptions.Rds")

為了演示，我們只分析10kb的數(shù)據(jù)庫

Co-expression network

SCENIC流程的第一步是基于表達(dá)數(shù)據(jù)推斷潛在的轉(zhuǎn)錄因子靶標(biāo)。這可以通過GENIE3或GRNBoost完成

這里我們需要輸入的是：

過濾后的單細(xì)胞表達(dá)矩陣
TF的列表（潛在的調(diào)控因子）

這一步完成后將會(huì)輸出一個(gè)相關(guān)性矩陣，以生成基因共表達(dá)模塊

Gene filter/selection

運(yùn)行GENIE3/GRNBoost之前需要先對(duì)表達(dá)矩陣進(jìn)行過濾，過濾的條件包括：

按每個(gè)基因的所有read counts過濾，以排除極低reads的測序噪聲；
按檢測到基因的細(xì)胞數(shù)過濾；
只保留在RcisTarget數(shù)據(jù)庫中的基因進(jìn)行分析。

# (Adjust minimum values according to your dataset)
genesKept <- geneFiltering(exprMat, scenicOptions=scenicOptions,
                           minCountsPerGene=3*.01*ncol(exprMat),
                           minSamples=ncol(exprMat)*.01)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0    17.0    48.0   146.8   137.0  5868.0 
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   10.00   25.00   39.01   60.00  180.00

需要注意的是過濾的條件需要根據(jù)數(shù)據(jù)集進(jìn)行相應(yīng)調(diào)整。

應(yīng)用以上條件過濾，最終過濾后的表達(dá)矩陣為: 770 genes * 200 cells

exprMat_filtered <- exprMat[genesKept, ]
dim(exprMat_filtered)

## [1] 770 200

Correlation

計(jì)算TF和其靶標(biāo)的Spearman correlation。這一步會(huì)在 int/ 文件夾下生成 corrMat.Rds

runCorrelation(exprMat_filtered, scenicOptions)
list.files("int/", pattern="corrMat")

## [1] "1.2_corrMat.Rds"

GENIE3

GENIE3是一種基于隨機(jī)森林的算法，每次運(yùn)行的結(jié)果都有些許差別，可以通過 set.seed 設(shè)置固定的種子以獲得可重復(fù)的結(jié)果。

GENIE3使用轉(zhuǎn)錄因子的表達(dá)量作為變量預(yù)測每個(gè)靶基因的表達(dá)量，最后把訓(xùn)練輸出模型的權(quán)重作為轉(zhuǎn)錄因子-靶基因的調(diào)控強(qiáng)度。

# Optional: add log (if it is not logged/normalized already)
exprMat_filtered <- log2(exprMat_filtered+1) 

set.seed(1001)
# Run GENIE3
runGenie3(exprMat_filtered, scenicOptions)

Build and score the GRN

GENIE3/GRNBoost 完成后，就可以使用SCENIC推斷基因調(diào)控網(wǎng)路（Gene Regulatory Network, GRN）.

SCENIC的流程包括：

獲取基因共表達(dá)模塊
獲取調(diào)控子 (with RcisTarget)
對(duì)每個(gè)細(xì)胞的GRN進(jìn)行打分 (with AUCell)
根據(jù)GRN的活性對(duì)細(xì)胞進(jìn)行聚類

loom <- open_loom(loomPath)
exprMat <- get_dgem(loom)
close_loom(loom)
# Optional: log expression (for TF expression plot, it does not affect any other calculation)
exprMat_log <- log2(exprMat+1)
dim(exprMat)

## [1] 862 200

運(yùn)行SCENIC

runSCENIC_1_coexNetwork2modules: 將GENIE3/GRNBoost的結(jié)果轉(zhuǎn)換為共表達(dá)模塊
runSCENIC_2_createRegulons: 調(diào)用RcisTarget，保留通過TF-motif富集分析后的共表達(dá)模塊
runSCENIC_3_scoreCells: 調(diào)用AUCell在每個(gè)細(xì)胞中對(duì)regulon進(jìn)行打分，分?jǐn)?shù)作為regulon在該細(xì)胞的活性

scenicOptions <- runSCENIC_1_coexNetwork2modules(scenicOptions)

##            [,1]
## nTFs          8
## nTargets    770
## nGeneSets    47
## nLinks    17308

scenicOptions <- runSCENIC_2_createRegulons(scenicOptions, coexMethod=c("top5perTarget")) #** Only for toy run!!

##               [,1]
## top5perTarget    8
## [1] 22058

scenicOptions@settings$nCores <- 1 # for runSCENIC_3_scoreCells
scenicOptions <- runSCENIC_3_scoreCells(scenicOptions, exprMat_log)

##    min     1%     5%    10%    50%   100% 
##  46.00  58.99  77.00  84.80 154.00 342.00

saveRDS(scenicOptions, file="int/scenicOptions.Rds") # To save status

上述每一步都會(huì)在int/中生成若干文件。

runSCENIC_1_coexNetwork2modules 計(jì)算了TF與每個(gè)基因的相關(guān)性權(quán)重，為了避免in-direct TF-target pairs的干擾，作者使用了幾種過濾方法的組合來形成共表達(dá)基因模塊（Modules）

方法含義

w0.001 以TF為單位，保留weight > 0.01的基因形成modules

w0.005 以TF為單位，保留weight > 0.05的基因形成modules

top50 保留每個(gè)TF中weight top 50的基因

top5perTarget 以基因?yàn)閱挝?，保留weight top 5的TF得到簡化的TF-target pairs list，然后把基因分配給TF構(gòu)建共表達(dá)模塊

top10perTarget 以基因?yàn)閱挝唬Ａ魒eight top 10的TF得到簡化的TF-target pairs list，然后把基因分配給TF構(gòu)建共表達(dá)模塊

top50perTarget 以基因?yàn)閱挝?，保留weight top 50的TF得到簡化的TF-target pairs list，然后把基因分配給TF構(gòu)建共表達(dá)模塊

top1sd 保留每個(gè)TF中，weight大于 mean(weight) + sd(weight)的targets

top3sd 保留每個(gè)TF中，weight大于 mean(weight) + 3*sd(weight)的targets

runSCENIC_3_scoreCells使用AUCell計(jì)算每個(gè)regulon的活性

AUCell calculates the enrichment of the regulon as an area under the recovery curve (AUC) across the ranking of all genes in a particular cell, whereby genes are ranked by their expression value.

In brief, the scoring method is based on a recovery analysis where the x-axis (Supplementary Fig. 1c) is the ranking of all genes based on expression level (genes with the same expression value, e.g., '0', are randomly sorted); and the y-axis is the number of genes recovered from the input set.

AUCell then uses the AUC to calculate whether a critical subset of the input gene set is enriched at the top of the ranking for each cell. In this way, the AUC represents the proportion of expressed genes in the signature and their relative expression values compared to the other genes within the cell.

get regulon activity matrix

runSCENIC_3_scoreCells將regulon打分的結(jié)果保存到'int/3.4_regulonAUC.Rds'中，我們可以讀入這個(gè)數(shù)據(jù)進(jìn)行分析。

regulonAUC <- readRDS('int/3.4_regulonAUC.Rds')
regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]

regAct <- getAUC(regulonAUC)
dim(regAct);regAct[,1:3]

## [1]   7 200

##              cells
## gene sets     1772066100_D04 1772063062_G01 1772060224_F07
##   Tef (405g)     0.595558153     0.46756283    0.275862069
##   Dlx5 (35g)     0.097560976     0.00000000    0.009059233
##   Irf1 (105g)    0.000000000     0.05610754    0.382817066
##   Stat6 (27g)    0.000000000     0.01399177    0.146502058
##   Sox9 (150g)    0.210987726     0.22676797    0.049678551
##   Sox10 (88g)    0.003506721     0.31092928    0.023962595
##   Sox8 (98g)     0.042080655     0.26241964    0.026300409

這里我們獲得了7個(gè)regulon在200個(gè)細(xì)胞中的打分矩陣，其中每一行是一個(gè)regulon，每一列是一個(gè)細(xì)胞。以"Tef (405g)"為例，該regulon id表示的意思是轉(zhuǎn)錄因子Tef這個(gè)模塊當(dāng)前調(diào)控405個(gè)潛在的靶基因。

Optional steps

在獲取調(diào)控模塊（regulons）X 細(xì)胞（cells）的AUC矩陣后，SCENIC的主要分析已經(jīng)完成了，后續(xù)可以根據(jù)個(gè)人需求進(jìn)一步分析。例如以下的這些分析。

二值化網(wǎng)絡(luò)活性(regulon on/off)

這一步可以手動(dòng)查看每一個(gè)regulon的binarized threshold是否合理，AUCell自動(dòng)判斷的閾值有時(shí)候會(huì)過于嚴(yán)格，以致某一regulon在所有細(xì)胞都被判斷為off的狀態(tài)。

aucellApp <- plotTsne_AUCellApp(scenicOptions, exprMat_log)
savedSelections <- shiny::runApp(aucellApp)

image-20221106111821002.png

這一步會(huì)創(chuàng)建一個(gè)shinyapp，以探索定義基因模塊激活的閾值。

# Save the modified thresholds:
newThresholds <- savedSelections$thresholds
saveRDS(newThresholds, file=getIntName(scenicOptions, "aucell_thresholds"))
saveRDS(scenicOptions, file="int/scenicOptions.Rds")

二值化AUC

scenicOptions <- runSCENIC_4_aucell_binarize(scenicOptions)

基于調(diào)控活性的聚類與降維

我們可以根據(jù)調(diào)控模塊的活性對(duì)細(xì)胞進(jìn)行聚類。

nPcs <- c(5)
# Run t-SNE with different settings:
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50))
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50), onlyHighConf=TRUE, filePrefix="int/tSNE_oHC")
# Plot as pdf (individual files in int/):
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_", list.files("int"), value=T), value=T))

par(mfrow=c(length(nPcs), 3))
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_AUC", list.files("int"), value=T, perl = T), value=T))
plotTsne_compareSettings(fileNames, scenicOptions, showLegend=FALSE, varName="CellType", cex=.5)

image-20221106111844875.png

image-20221106111853549.png

保存tSNE的參數(shù)

scenicOptions@settings$defaultTsne$aucType <- "AUC"
scenicOptions@settings$defaultTsne$dims <- 5
scenicOptions@settings$defaultTsne$perpl <- 15
saveRDS(scenicOptions, file="int/scenicOptions.Rds")

Exploring/interpreting the results

SCENIC 分析的結(jié)果在當(dāng)前工作目錄下的 output/ 中。

將AUC和TF的表達(dá)投影到tSNE

tSNE_scenic <- readRDS(tsneFileName(scenicOptions))
aucell_regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
# Show TF expression:
par(mfrow=c(2,3))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, exprMat, aucell_regulonAUC[onlyNonDuplicatedExtended(rownames(aucell_regulonAUC))[c("Dlx5", "Sox10", "Sox9","Irf1", "Stat6")],], plots="Expression")

image-20221106111901760.png

# Save AUC as PDF:
Cairo::CairoPDF("output/Step4_BinaryRegulonActivity_tSNE_colByAUC.pdf", width=20, height=15)
par(mfrow=c(4,6))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, cellsAUC=aucell_regulonAUC, plots="AUC")
dev.off()

## png 
##   2

以密度圖的方式展示tSNE

library(KernSmooth)
library(RColorBrewer)
dens2d <- bkde2D(tSNE_scenic$Y, 1)$fhat
image(dens2d, col=brewer.pal(9, "YlOrBr"), axes=FALSE)
contour(dens2d, add=TRUE, nlevels=5, drawlabels=FALSE)

image-20221106111909826.png

展示多個(gè)調(diào)控子

這里注意原教程寫于2020，用的是 SCENIC_1.2.0 。到了2022年，SCENIC已經(jīng)更新到1.3.1.,而 plotTsne_rgb 已經(jīng)被替換為 plotEmb_rgb

#par(bg = "black")
par(mfrow=c(1,2))

regulonNames <- c( "Dlx5","Sox10")
cellCol <- SCENIC::plotEmb_rgb(scenicOptions, regulonNames, aucType="AUC", aucMaxContrast=0.6)

regulonNames <- list(red=c("Sox10", "Sox8"),
                     green=c("Irf1"),
                     blue=c( "Tef"))
cellCol <- SCENIC::plotEmb_rgb(scenicOptions, regulonNames, aucType="Binary")

image-20221106111917525.png

GRN: Regulon targets and motifs

列出當(dāng)前基因調(diào)控網(wǎng)路中某個(gè)調(diào)控模塊下的基因:

regulons <- loadInt(scenicOptions, "regulons")
regulons[c("Dlx5", "Irf1")]

## $Dlx5
##  [1] "2610203C20Rik" "Adamts17"      "AI854703"      "Arhgef10l"    
##  [5] "Bahcc1"        "Cirbp"         "Cplx1"         "Dlx1"         
##  [9] "Gad1"          "Gadd45gip1"    "Hexim2"        "Igf1"         
## [13] "Iglon5"        "Ltbp3"         "Myt1"          "Npas1"        
## [17] "Nxph1"         "Peli2"         "Plekha6"       "Prkab1"       
## [21] "Ptchd2"        "Racgap1"       "Rgs8"          "Robo1"        
## [25] "Rpl34"         "Sema3c"        "Shisa9"        "Slc12a5"      
## [29] "Slc39a6"       "Spns2"         "Stox2"         "Syt6"         
## [33] "Unc5d"         "Wnt5a"        
## 
## $Irf1
##   [1] "4930523C07Rik" "Acsl5"         "Adrb1"         "Ahdc1"        
##   [5] "Ahnak"         "Akap1"         "Anxa2"         "Arhgap31"     
##   [9] "Arhgef10l"     "Arhgef19"      "Atg14"         "B2m"          
##  [13] "Bank1"         "Bcl2a1b"       "Cald1"         "Capsl"        
##  [17] "Ccl2"          "Ccl3"          "Ccl7"          "Ccnd1"        
##  [21] "Ccnd3"         "Cd163"         "Cd48"          "Cited2"       
##  [25] "Cmtm6"         "Ctgf"          "Ctnnd1"        "Ctss"         
##  [29] "Ddr2"          "E130114P18Rik" "Egfr"          "Ehd1"         
##  [33] "Esam"          "Fcer1g"        "Fcgr1"         "Fgf14"        
##  [37] "Fstl4"         "Fyb"           "Gabrb1"        "Gadd45g"      
##  [41] "Gja1"          "Gpr34"         "Gramd3"        "H2-K1"        
##  [45] "Hfe"           "Il6ra"         "Irf1"          "Itgb5"        
##  [49] "Kcne4"         "Kcnj16"        "Kcnj2"         "Laptm5"       
##  [53] "Lhfp"          "Lrp4"          "Map3k5"        "Mdk"          
##  [57] "Med13"         "Midn"          "Mr1"           "Ms4a7"        
##  [61] "Msr1"          "Myh9"          "Nin"           "Nptx1"        
##  [65] "Osbpl6"        "P2ry13"        "Parp12"        "Peli2"        
##  [69] "Plcl2"         "Plekha6"       "Prkab1"        "Rab3il1"      
##  [73] "Rgs5"          "Rhobtb2"       "Rnase4"        "Sepp1"        
##  [77] "Sertad2"       "Sgk3"          "Slamf9"        "Slc4a4"       
##  [81] "Slc7a7"        "Slco5a1"       "Slitrk2"       "Soat1"        
##  [85] "Srgn"          "St3gal6"       "St5"           "St8sia4"      
##  [89] "Stard8"        "Stat6"         "Stk17b"        "Tapbp"        
##  [93] "Tbxas1"        "Tgfa"          "Tgfb3"         "Tmem100"      
##  [97] "Tnfaip3"       "Tnfaip8"       "Trib1"         "Trpm7"        
## [101] "Txnip"         "Usp2"          "Vgll4"         "Vps37b"       
## [105] "Zfp36l2"

一個(gè)regulon可以對(duì)應(yīng)多個(gè)靶標(biāo)，而AUC則是在靶標(biāo)大于10個(gè)的regulons中進(jìn)行運(yùn)算。

regulons <- loadInt(scenicOptions, "aucell_regulons")
head(cbind(onlyNonDuplicatedExtended(names(regulons))))

##       [,1]         
## Tef   "Tef (405g)" 
## Dlx5  "Dlx5 (35g)" 
## Sox9  "Sox9 (150g)"
## Sox8  "Sox8 (98g)" 
## Sox10 "Sox10 (88g)"
## Irf1  "Irf1 (105g)"

TF-target詳細(xì)的對(duì)應(yīng)關(guān)系在 output/Step2_regulonTargetsInfo.tsv 中

regulonTargetsInfo <- loadInt(scenicOptions, "regulonTargetsInfo")
tableSubset <- regulonTargetsInfo[TF=="Stat6" & highConfAnnot==TRUE]
head(tableSubset)

##       TF          gene highConfAnnot nMotifs         bestMotif  NES motifDb
## 1: Stat6 4930523C07Rik          TRUE      14 tfdimers__MD00051 3.39    10kb
## 2: Stat6       Bcl2a1b          TRUE      11 tfdimers__MD00051 3.39    10kb
## 3: Stat6          Ccl3          TRUE       8 tfdimers__MD00051 3.39    10kb
## 4: Stat6          Cd14          TRUE      14 tfdimers__MD00051 3.39    10kb
## 5: Stat6       Clec4a2          TRUE       1 tfdimers__MD00051 3.39    10kb
## 6: Stat6       Col27a1          TRUE       9 tfdimers__MD00051 3.39    10kb
##       coexModule   spearCor CoexWeight
## 1: top5perTarget 0.17259814 0.06908293
## 2: top5perTarget 0.08241394 0.15894490
## 3: top5perTarget 0.13977515 0.07461879
## 4: top5perTarget 0.07996444 0.15006861
## 5: top5perTarget 0.10529877 0.24615271
## 6: top5perTarget 0.17040821 0.18443827

Regulators for known cell types or clusters

每個(gè)cluster的平均Regulon活性

regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]
regulonActivity_byCellType <- sapply(split(rownames(cellInfo), cellInfo$CellType),
                                     function(cells) rowMeans(getAUC(regulonAUC)[,cells]))
regulonActivity_byCellType_Scaled <- t(scale(t(regulonActivity_byCellType), center = T, scale=T))

ComplexHeatmap::Heatmap(regulonActivity_byCellType_Scaled, name="Regulon activity")

image-20221106111929359.png

二值化版本（具有調(diào)節(jié)子活性的細(xì)胞類型/簇的細(xì)胞百分比）

minPerc <- .7
binaryRegulonActivity <- loadInt(scenicOptions, "aucell_binary_nonDupl")
cellInfo_binarizedCells <- cellInfo[which(rownames(cellInfo)%in% colnames(binaryRegulonActivity)),, drop=FALSE]
regulonActivity_byCellType_Binarized <- sapply(split(rownames(cellInfo_binarizedCells), cellInfo_binarizedCells$CellType), 
                                               function(cells) rowMeans(binaryRegulonActivity[,cells, drop=FALSE]))
binaryActPerc_subset <- regulonActivity_byCellType_Binarized[which(rowSums(regulonActivity_byCellType_Binarized>minPerc)>0),]
ComplexHeatmap::Heatmap(binaryActPerc_subset, name="Regulon activity (%)", col = c("white","pink","red"))

image-20221106111937144.png

細(xì)胞特異性調(diào)控因子

我們還可以計(jì)算每個(gè)regulon在細(xì)胞中的特異性系數(shù)（Regulon Specificity Score, RSS）來衡量regulon在不同細(xì)胞類型間的特異程度。

RSS = 1-sqrt(JSD)

JSD: Jensen-Shannon Divergence

In probability theory and statistics, the Jensen--Shannon divergence is a method of measuring the similarity between two probability distributions.

https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence

# regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
rss <- calcRSS(AUC=getAUC(regulonAUC), cellAnnotation=cellInfo[colnames(regulonAUC), "CellType"])
rssPlot <- plotRSS(rss)
print(rssPlot$plot)

image-20221106111946554.png

以上就是SECENIC的分析流程，實(shí)際上計(jì)算AUC矩陣后就結(jié)束了，后續(xù)的分析都是根據(jù)課題需求進(jìn)行的各種探索與可視化。

ref:

https://scenic.aertslab.org/tutorials/

http://htmlpreview.github.io/?https://github.com/aertslab/SCENIC/blob/master/inst/doc/SCENIC_Running.html

https://github.com/aertslab/SCENIC/blob/master/R/runSCENIC_1_coexNetwork2modules.R

https://cloud.tencent.com/developer/article/1692240

http://www.itdecent.cn/p/093a835ab056

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

R-SCENIC 單細(xì)胞轉(zhuǎn)錄因子調(diào)控網(wǎng)絡(luò)分析

R-SCENIC 單細(xì)胞轉(zhuǎn)錄因子調(diào)控網(wǎng)絡(luò)分析

SCENIC_Tutorial

Setup

Load Packages

Input

Expression matrix

Cell info/phenodata

Initialize SCENIC settings

Co-expression network

Gene filter/selection

Correlation

GENIE3

Build and score the GRN

get regulon activity matrix

Optional steps

二值化網(wǎng)絡(luò)活性(regulon on/off)

基于調(diào)控活性的聚類與降維

Exploring/interpreting the results

將AUC和TF的表達(dá)投影到tSNE

GRN: Regulon targets and motifs

Regulators for known cell types or clusters

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

R-SCENIC 單細(xì)胞轉(zhuǎn)錄因子調(diào)控網(wǎng)絡(luò)分析

SCENIC_Tutorial

Setup

Load Packages

Input

Expression matrix

Cell info/phenodata

Initialize SCENIC settings

Co-expression network

Gene filter/selection

Correlation

GENIE3

Build and score the GRN

get regulon activity matrix

Optional steps

二值化網(wǎng)絡(luò)活性(regulon on/off)

基于調(diào)控活性的聚類與降維

Exploring/interpreting the results

將AUC和TF的表達(dá)投影到tSNE

GRN: Regulon targets and motifs

Regulators for known cell types or clusters

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av