文章接受,終于有空再發(fā)點(diǎn)筆記了??
SCENIC_Tutorial

Setup
# http://htmlpreview.github.io/?https://github.com/aertslab/SCENIC/blob/master/inst/doc/SCENIC_Setup.html
## Required
BiocManager::install(c("RcisTarget"))
BiocManager::install(c("GENIE3")) # Optional. Can be replaced by GRNBoost
## Optional (but highly recommended):
# To score the network on cells (i.e. run AUCell):
BiocManager::install(c("zoo", "mixtools", "rbokeh"), update=F)
# For various visualizations and perform t-SNEs:
BiocManager::install(c("DT", "NMF", "ComplexHeatmap", "R2HTML", "Rtsne"), update=F)
BiocManager::install("doRNG")
devtools::install_github("aertslab/SCENIC")
devtools::install_github("aertslab/SCopeLoomR")
devtools::install_github("aertslab/AUCell")
#if (!dir.exists("SCENIC_MouseBrain/")) dir.create("SCENIC_MouseBrain/")
if (!dir.exists("int/")) dir.create("int/")
if (!dir.exists("cisTarget_databases/")) dir.create("cisTarget_databases/")
下載相應(yīng)轉(zhuǎn)錄因子數(shù)據(jù)庫,本文用的數(shù)據(jù)來自小鼠,所以下載小鼠的數(shù)據(jù)庫
由于直接從R這里下載的話會(huì)下載不完整,所以這里使用 zsync_curl 下載數(shù)據(jù)庫。
ubuntu下zsync_curl安裝可以參考: https://github.com/AppImage/zsync-curl
相應(yīng)數(shù)據(jù)庫的下載方式:https://resources.aertslab.org/cistarget/help.html
以下為shell命令行
#!/usr/bin/bash
# Specify database name:
# for human
feather_database_url_hg=("https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-500bp-upstream-7species.mc9nr.feather"
"https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-7species.mc9nr.feather")
# for mouse
feather_database_url_mm=('https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-500bp-upstream-7species.mc9nr.feather' 'https://resources.aertslab.org/cistarget/databases/mus_musculus/mm9/refseq_r45/mc9nr/gene_based/mm9-tss-centered-10kb-7species.mc9nr.feather')
# for fly
feather_database_url_dm='https://resources.aertslab.org/cistarget/databases/drosophila_melanogaster/dm6/flybase_r6.02/mc8nr/gene_based/dm6-5kb-upstream-full-tx-11species.mc8nr.feather'
# download
for url in ${feather_database_url_mm[*]}; do
feather_database="${url##*/}"
echo "$feather_database"
zsync_curl "${url}.zsync"
done
下載后,將數(shù)據(jù)庫置于當(dāng)前工作目錄的 cisTarget_databases/
其他的db和相關(guān)注釋文件可從該網(wǎng)址下載https://resources.aertslab.org/cistarget/
Load Packages
library(SCopeLoomR)
library(SCENIC)
library(tidyverse)
library(AUCell)
library(foreach)
Input
Expression matrix
這里使用官方提供的小鼠大腦的單細(xì)胞表達(dá)矩陣(862 genes * 200 cells)進(jìn)行演示.
loomPath <- system.file(package="SCENIC", "examples/mouseBrain_toy.loom")
loom <- open_loom(loomPath)
exprMat <- get_dgem(loom)
cellInfo <- get_cell_annotation(loom)
close_loom(loom)
dim(exprMat)
## [1] 862 200
Cell info/phenodata
head(cellInfo)
## CellType nGene nUMI
## 1772066100_D04 interneurons 170 509
## 1772063062_G01 oligodendrocytes 152 443
## 1772060224_F07 microglia 218 737
## 1772071035_G09 pyramidal CA1 265 1068
## 1772067066_E12 oligodendrocytes 81 273
## 1772066100_B01 pyramidal CA1 108 191
Numbers of different celltypes
table(cellInfo$CellType)
##
## astrocytes_ependymal interneurons microglia
## 29 43 14
## oligodendrocytes pyramidal CA1
## 55 59
save cell information
saveRDS(cellInfo, file="int/cellInfo.Rds")
設(shè)置每個(gè)celltype的顏色
# Color to assign to the variables (same format as for NMF::aheatmap)
colVars <- list(CellType=c("microglia"="forestgreen",
"endothelial-mural"="darkorange",
"astrocytes_ependymal"="magenta4",
"oligodendrocytes"="hotpink",
"interneurons"="red3",
"pyramidal CA1"="skyblue",
"pyramidal SS"="darkblue"))
colVars$CellType <- colVars$CellType[intersect(names(colVars$CellType), cellInfo$CellType)]
saveRDS(colVars, file="int/colVars.Rds")
plot.new(); legend(0,1, fill=colVars$CellType, legend=names(colVars$CellType))

Initialize SCENIC settings
在開始分析前,我們先初始化SCENIC分析的一些參數(shù)
org <- "mgi" # or hgnc, or dmel
dbDir <- "cisTarget_databases" # RcisTarget databases location
myDatasetTitle <- "SCENIC example on Mouse brain" # choose a name for your analysis
data(defaultDbNames)
dbs <- defaultDbNames[[org]]
scenicOptions <- initializeScenic(org=org,
dbDir=dbDir,
dbs=dbs['10kb'],
datasetTitle=myDatasetTitle,
nCores=6)
# Save to use at a later time...
saveRDS(scenicOptions, file="int/scenicOptions.Rds")
為了演示,我們只分析10kb的數(shù)據(jù)庫
Co-expression network
SCENIC流程的第一步是基于表達(dá)數(shù)據(jù)推斷潛在的轉(zhuǎn)錄因子靶標(biāo)。這可以通過GENIE3或GRNBoost完成
這里我們需要輸入的是:
- 過濾后的單細(xì)胞表達(dá)矩陣
- TF的列表(潛在的調(diào)控因子)
這一步完成后將會(huì)輸出一個(gè)相關(guān)性矩陣,以生成基因共表達(dá)模塊
Gene filter/selection
運(yùn)行GENIE3/GRNBoost之前需要先對(duì)表達(dá)矩陣進(jìn)行過濾,過濾的條件包括:
- 按每個(gè)基因的所有read counts過濾,以排除極低reads的測序噪聲;
- 按檢測到基因的細(xì)胞數(shù)過濾;
- 只保留在RcisTarget數(shù)據(jù)庫中的基因進(jìn)行分析。
# (Adjust minimum values according to your dataset)
genesKept <- geneFiltering(exprMat, scenicOptions=scenicOptions,
minCountsPerGene=3*.01*ncol(exprMat),
minSamples=ncol(exprMat)*.01)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0 17.0 48.0 146.8 137.0 5868.0
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 10.00 25.00 39.01 60.00 180.00
需要注意的是過濾的條件需要根據(jù)數(shù)據(jù)集進(jìn)行相應(yīng)調(diào)整。
應(yīng)用以上條件過濾,最終過濾后的表達(dá)矩陣為: 770 genes * 200 cells
exprMat_filtered <- exprMat[genesKept, ]
dim(exprMat_filtered)
## [1] 770 200
Correlation
計(jì)算TF和其靶標(biāo)的Spearman correlation。這一步會(huì)在 int/ 文件夾下生成 corrMat.Rds
runCorrelation(exprMat_filtered, scenicOptions)
list.files("int/", pattern="corrMat")
## [1] "1.2_corrMat.Rds"
GENIE3
GENIE3是一種基于隨機(jī)森林的算法,每次運(yùn)行的結(jié)果都有些許差別,可以通過 set.seed 設(shè)置固定的種子以獲得可重復(fù)的結(jié)果。
GENIE3使用轉(zhuǎn)錄因子的表達(dá)量作為變量預(yù)測每個(gè)靶基因的表達(dá)量,最后把訓(xùn)練輸出模型的權(quán)重作為轉(zhuǎn)錄因子-靶基因的調(diào)控強(qiáng)度。
# Optional: add log (if it is not logged/normalized already)
exprMat_filtered <- log2(exprMat_filtered+1)
set.seed(1001)
# Run GENIE3
runGenie3(exprMat_filtered, scenicOptions)
Build and score the GRN
GENIE3/GRNBoost 完成后,就可以使用SCENIC推斷基因調(diào)控網(wǎng)路(Gene Regulatory Network, GRN).
SCENIC的流程包括:
- 獲取基因共表達(dá)模塊
- 獲取調(diào)控子 (with RcisTarget)
- 對(duì)每個(gè)細(xì)胞的GRN進(jìn)行打分 (with AUCell)
- 根據(jù)GRN的活性對(duì)細(xì)胞進(jìn)行聚類
loom <- open_loom(loomPath)
exprMat <- get_dgem(loom)
close_loom(loom)
# Optional: log expression (for TF expression plot, it does not affect any other calculation)
exprMat_log <- log2(exprMat+1)
dim(exprMat)
## [1] 862 200
運(yùn)行SCENIC
-
runSCENIC_1_coexNetwork2modules: 將GENIE3/GRNBoost的結(jié)果轉(zhuǎn)換為共表達(dá)模塊 -
runSCENIC_2_createRegulons: 調(diào)用RcisTarget,保留通過TF-motif富集分析后的共表達(dá)模塊 -
runSCENIC_3_scoreCells: 調(diào)用AUCell在每個(gè)細(xì)胞中對(duì)regulon進(jìn)行打分,分?jǐn)?shù)作為regulon在該細(xì)胞的活性
scenicOptions <- runSCENIC_1_coexNetwork2modules(scenicOptions)
## [,1]
## nTFs 8
## nTargets 770
## nGeneSets 47
## nLinks 17308
scenicOptions <- runSCENIC_2_createRegulons(scenicOptions, coexMethod=c("top5perTarget")) #** Only for toy run!!
## [,1]
## top5perTarget 8
## [1] 22058
scenicOptions@settings$nCores <- 1 # for runSCENIC_3_scoreCells
scenicOptions <- runSCENIC_3_scoreCells(scenicOptions, exprMat_log)
## min 1% 5% 10% 50% 100%
## 46.00 58.99 77.00 84.80 154.00 342.00
saveRDS(scenicOptions, file="int/scenicOptions.Rds") # To save status
上述每一步都會(huì)在int/中生成若干文件。
runSCENIC_1_coexNetwork2modules 計(jì)算了TF與每個(gè)基因的相關(guān)性權(quán)重,為了避免in-direct TF-target pairs的干擾,作者使用了幾種過濾方法的組合來形成共表達(dá)基因模塊(Modules)
方法 含義
w0.001 以TF為單位,保留weight > 0.01的基因形成modules
w0.005 以TF為單位,保留weight > 0.05的基因形成modules
top50 保留每個(gè)TF中weight top 50的基因
top5perTarget 以基因?yàn)閱挝?,保留weight top 5的TF得到簡化的TF-target pairs list,然后把基因分配給TF構(gòu)建共表達(dá)模塊
top10perTarget 以基因?yàn)閱挝唬A魒eight top 10的TF得到簡化的TF-target pairs list,然后把基因分配給TF構(gòu)建共表達(dá)模塊
top50perTarget 以基因?yàn)閱挝?,保留weight top 50的TF得到簡化的TF-target pairs list,然后把基因分配給TF構(gòu)建共表達(dá)模塊
top1sd 保留每個(gè)TF中,weight大于 mean(weight) + sd(weight)的targets
top3sd 保留每個(gè)TF中,weight大于 mean(weight) + 3*sd(weight)的targets
runSCENIC_3_scoreCells使用AUCell計(jì)算每個(gè)regulon的活性
AUCell calculates the enrichment of the regulon as an area under the recovery curve (AUC) across the ranking of all genes in a particular cell, whereby genes are ranked by their expression value.
In brief, the scoring method is based on a recovery analysis where the x-axis (Supplementary Fig. 1c) is the ranking of all genes based on expression level (genes with the same expression value, e.g., '0', are randomly sorted); and the y-axis is the number of genes recovered from the input set.
AUCell then uses the AUC to calculate whether a critical subset of the input gene set is enriched at the top of the ranking for each cell. In this way, the AUC represents the proportion of expressed genes in the signature and their relative expression values compared to the other genes within the cell.
get regulon activity matrix
runSCENIC_3_scoreCells將regulon打分的結(jié)果保存到'int/3.4_regulonAUC.Rds'中,我們可以讀入這個(gè)數(shù)據(jù)進(jìn)行分析。
regulonAUC <- readRDS('int/3.4_regulonAUC.Rds')
regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]
regAct <- getAUC(regulonAUC)
dim(regAct);regAct[,1:3]
## [1] 7 200
## cells
## gene sets 1772066100_D04 1772063062_G01 1772060224_F07
## Tef (405g) 0.595558153 0.46756283 0.275862069
## Dlx5 (35g) 0.097560976 0.00000000 0.009059233
## Irf1 (105g) 0.000000000 0.05610754 0.382817066
## Stat6 (27g) 0.000000000 0.01399177 0.146502058
## Sox9 (150g) 0.210987726 0.22676797 0.049678551
## Sox10 (88g) 0.003506721 0.31092928 0.023962595
## Sox8 (98g) 0.042080655 0.26241964 0.026300409
這里我們獲得了7個(gè)regulon在200個(gè)細(xì)胞中的打分矩陣,其中每一行是一個(gè)regulon,每一列是一個(gè)細(xì)胞。以"Tef (405g)"為例,該regulon id表示的意思是轉(zhuǎn)錄因子Tef這個(gè)模塊當(dāng)前調(diào)控405個(gè)潛在的靶基因。
Optional steps
在獲取調(diào)控模塊(regulons)X 細(xì)胞(cells)的AUC矩陣后,SCENIC的主要分析已經(jīng)完成了,后續(xù)可以根據(jù)個(gè)人需求進(jìn)一步分析。例如以下的這些分析。
二值化網(wǎng)絡(luò)活性(regulon on/off)
這一步可以手動(dòng)查看每一個(gè)regulon的binarized threshold是否合理,AUCell自動(dòng)判斷的閾值有時(shí)候會(huì)過于嚴(yán)格,以致某一regulon在所有細(xì)胞都被判斷為off的狀態(tài)。
aucellApp <- plotTsne_AUCellApp(scenicOptions, exprMat_log)
savedSelections <- shiny::runApp(aucellApp)

這一步會(huì)創(chuàng)建一個(gè)shinyapp,以探索定義基因模塊激活的閾值。
# Save the modified thresholds:
newThresholds <- savedSelections$thresholds
saveRDS(newThresholds, file=getIntName(scenicOptions, "aucell_thresholds"))
saveRDS(scenicOptions, file="int/scenicOptions.Rds")
二值化AUC
scenicOptions <- runSCENIC_4_aucell_binarize(scenicOptions)
基于調(diào)控活性的聚類與降維
我們可以根據(jù)調(diào)控模塊的活性對(duì)細(xì)胞進(jìn)行聚類。
nPcs <- c(5)
# Run t-SNE with different settings:
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50))
fileNames <- tsneAUC(scenicOptions, aucType="AUC", nPcs=nPcs, perpl=c(5,15,50), onlyHighConf=TRUE, filePrefix="int/tSNE_oHC")
# Plot as pdf (individual files in int/):
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_", list.files("int"), value=T), value=T))
par(mfrow=c(length(nPcs), 3))
fileNames <- paste0("int/",grep(".Rds", grep("tSNE_AUC", list.files("int"), value=T, perl = T), value=T))
plotTsne_compareSettings(fileNames, scenicOptions, showLegend=FALSE, varName="CellType", cex=.5)


保存tSNE的參數(shù)
scenicOptions@settings$defaultTsne$aucType <- "AUC"
scenicOptions@settings$defaultTsne$dims <- 5
scenicOptions@settings$defaultTsne$perpl <- 15
saveRDS(scenicOptions, file="int/scenicOptions.Rds")
Exploring/interpreting the results
SCENIC 分析的結(jié)果在當(dāng)前工作目錄下的 output/ 中。
將AUC和TF的表達(dá)投影到tSNE
tSNE_scenic <- readRDS(tsneFileName(scenicOptions))
aucell_regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
# Show TF expression:
par(mfrow=c(2,3))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, exprMat, aucell_regulonAUC[onlyNonDuplicatedExtended(rownames(aucell_regulonAUC))[c("Dlx5", "Sox10", "Sox9","Irf1", "Stat6")],], plots="Expression")

# Save AUC as PDF:
Cairo::CairoPDF("output/Step4_BinaryRegulonActivity_tSNE_colByAUC.pdf", width=20, height=15)
par(mfrow=c(4,6))
AUCell::AUCell_plotTSNE(tSNE_scenic$Y, cellsAUC=aucell_regulonAUC, plots="AUC")
dev.off()
## png
## 2
以密度圖的方式展示tSNE
library(KernSmooth)
library(RColorBrewer)
dens2d <- bkde2D(tSNE_scenic$Y, 1)$fhat
image(dens2d, col=brewer.pal(9, "YlOrBr"), axes=FALSE)
contour(dens2d, add=TRUE, nlevels=5, drawlabels=FALSE)

展示多個(gè)調(diào)控子
這里注意原教程寫于2020,用的是
SCENIC_1.2.0。到了2022年,SCENIC已經(jīng)更新到1.3.1.,而plotTsne_rgb已經(jīng)被替換為plotEmb_rgb
#par(bg = "black")
par(mfrow=c(1,2))
regulonNames <- c( "Dlx5","Sox10")
cellCol <- SCENIC::plotEmb_rgb(scenicOptions, regulonNames, aucType="AUC", aucMaxContrast=0.6)
regulonNames <- list(red=c("Sox10", "Sox8"),
green=c("Irf1"),
blue=c( "Tef"))
cellCol <- SCENIC::plotEmb_rgb(scenicOptions, regulonNames, aucType="Binary")

GRN: Regulon targets and motifs
列出當(dāng)前基因調(diào)控網(wǎng)路中某個(gè)調(diào)控模塊下的基因:
regulons <- loadInt(scenicOptions, "regulons")
regulons[c("Dlx5", "Irf1")]
## $Dlx5
## [1] "2610203C20Rik" "Adamts17" "AI854703" "Arhgef10l"
## [5] "Bahcc1" "Cirbp" "Cplx1" "Dlx1"
## [9] "Gad1" "Gadd45gip1" "Hexim2" "Igf1"
## [13] "Iglon5" "Ltbp3" "Myt1" "Npas1"
## [17] "Nxph1" "Peli2" "Plekha6" "Prkab1"
## [21] "Ptchd2" "Racgap1" "Rgs8" "Robo1"
## [25] "Rpl34" "Sema3c" "Shisa9" "Slc12a5"
## [29] "Slc39a6" "Spns2" "Stox2" "Syt6"
## [33] "Unc5d" "Wnt5a"
##
## $Irf1
## [1] "4930523C07Rik" "Acsl5" "Adrb1" "Ahdc1"
## [5] "Ahnak" "Akap1" "Anxa2" "Arhgap31"
## [9] "Arhgef10l" "Arhgef19" "Atg14" "B2m"
## [13] "Bank1" "Bcl2a1b" "Cald1" "Capsl"
## [17] "Ccl2" "Ccl3" "Ccl7" "Ccnd1"
## [21] "Ccnd3" "Cd163" "Cd48" "Cited2"
## [25] "Cmtm6" "Ctgf" "Ctnnd1" "Ctss"
## [29] "Ddr2" "E130114P18Rik" "Egfr" "Ehd1"
## [33] "Esam" "Fcer1g" "Fcgr1" "Fgf14"
## [37] "Fstl4" "Fyb" "Gabrb1" "Gadd45g"
## [41] "Gja1" "Gpr34" "Gramd3" "H2-K1"
## [45] "Hfe" "Il6ra" "Irf1" "Itgb5"
## [49] "Kcne4" "Kcnj16" "Kcnj2" "Laptm5"
## [53] "Lhfp" "Lrp4" "Map3k5" "Mdk"
## [57] "Med13" "Midn" "Mr1" "Ms4a7"
## [61] "Msr1" "Myh9" "Nin" "Nptx1"
## [65] "Osbpl6" "P2ry13" "Parp12" "Peli2"
## [69] "Plcl2" "Plekha6" "Prkab1" "Rab3il1"
## [73] "Rgs5" "Rhobtb2" "Rnase4" "Sepp1"
## [77] "Sertad2" "Sgk3" "Slamf9" "Slc4a4"
## [81] "Slc7a7" "Slco5a1" "Slitrk2" "Soat1"
## [85] "Srgn" "St3gal6" "St5" "St8sia4"
## [89] "Stard8" "Stat6" "Stk17b" "Tapbp"
## [93] "Tbxas1" "Tgfa" "Tgfb3" "Tmem100"
## [97] "Tnfaip3" "Tnfaip8" "Trib1" "Trpm7"
## [101] "Txnip" "Usp2" "Vgll4" "Vps37b"
## [105] "Zfp36l2"
一個(gè)regulon可以對(duì)應(yīng)多個(gè)靶標(biāo),而AUC則是在靶標(biāo)大于10個(gè)的regulons中進(jìn)行運(yùn)算。
regulons <- loadInt(scenicOptions, "aucell_regulons")
head(cbind(onlyNonDuplicatedExtended(names(regulons))))
## [,1]
## Tef "Tef (405g)"
## Dlx5 "Dlx5 (35g)"
## Sox9 "Sox9 (150g)"
## Sox8 "Sox8 (98g)"
## Sox10 "Sox10 (88g)"
## Irf1 "Irf1 (105g)"
TF-target詳細(xì)的對(duì)應(yīng)關(guān)系在 output/Step2_regulonTargetsInfo.tsv 中
regulonTargetsInfo <- loadInt(scenicOptions, "regulonTargetsInfo")
tableSubset <- regulonTargetsInfo[TF=="Stat6" & highConfAnnot==TRUE]
head(tableSubset)
## TF gene highConfAnnot nMotifs bestMotif NES motifDb
## 1: Stat6 4930523C07Rik TRUE 14 tfdimers__MD00051 3.39 10kb
## 2: Stat6 Bcl2a1b TRUE 11 tfdimers__MD00051 3.39 10kb
## 3: Stat6 Ccl3 TRUE 8 tfdimers__MD00051 3.39 10kb
## 4: Stat6 Cd14 TRUE 14 tfdimers__MD00051 3.39 10kb
## 5: Stat6 Clec4a2 TRUE 1 tfdimers__MD00051 3.39 10kb
## 6: Stat6 Col27a1 TRUE 9 tfdimers__MD00051 3.39 10kb
## coexModule spearCor CoexWeight
## 1: top5perTarget 0.17259814 0.06908293
## 2: top5perTarget 0.08241394 0.15894490
## 3: top5perTarget 0.13977515 0.07461879
## 4: top5perTarget 0.07996444 0.15006861
## 5: top5perTarget 0.10529877 0.24615271
## 6: top5perTarget 0.17040821 0.18443827
Regulators for known cell types or clusters
- 每個(gè)cluster的平均Regulon活性
regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
regulonAUC <- regulonAUC[onlyNonDuplicatedExtended(rownames(regulonAUC)),]
regulonActivity_byCellType <- sapply(split(rownames(cellInfo), cellInfo$CellType),
function(cells) rowMeans(getAUC(regulonAUC)[,cells]))
regulonActivity_byCellType_Scaled <- t(scale(t(regulonActivity_byCellType), center = T, scale=T))
ComplexHeatmap::Heatmap(regulonActivity_byCellType_Scaled, name="Regulon activity")

- 二值化版本(具有調(diào)節(jié)子活性的細(xì)胞類型/簇的細(xì)胞百分比)
minPerc <- .7
binaryRegulonActivity <- loadInt(scenicOptions, "aucell_binary_nonDupl")
cellInfo_binarizedCells <- cellInfo[which(rownames(cellInfo)%in% colnames(binaryRegulonActivity)),, drop=FALSE]
regulonActivity_byCellType_Binarized <- sapply(split(rownames(cellInfo_binarizedCells), cellInfo_binarizedCells$CellType),
function(cells) rowMeans(binaryRegulonActivity[,cells, drop=FALSE]))
binaryActPerc_subset <- regulonActivity_byCellType_Binarized[which(rowSums(regulonActivity_byCellType_Binarized>minPerc)>0),]
ComplexHeatmap::Heatmap(binaryActPerc_subset, name="Regulon activity (%)", col = c("white","pink","red"))

- 細(xì)胞特異性調(diào)控因子
我們還可以計(jì)算每個(gè)regulon在細(xì)胞中的特異性系數(shù)(Regulon Specificity Score, RSS)來衡量regulon在不同細(xì)胞類型間的特異程度。
RSS = 1-sqrt(JSD)
JSD: Jensen-Shannon Divergence
In probability theory and statistics, the Jensen--Shannon divergence is a method of measuring the similarity between two probability distributions.
https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence
# regulonAUC <- loadInt(scenicOptions, "aucell_regulonAUC")
rss <- calcRSS(AUC=getAUC(regulonAUC), cellAnnotation=cellInfo[colnames(regulonAUC), "CellType"])
rssPlot <- plotRSS(rss)
print(rssPlot$plot)

以上就是SECENIC的分析流程,實(shí)際上計(jì)算AUC矩陣后就結(jié)束了,后續(xù)的分析都是根據(jù)課題需求進(jìn)行的各種探索與可視化。
ref:
https://scenic.aertslab.org/tutorials/
https://github.com/aertslab/SCENIC/blob/master/R/runSCENIC_1_coexNetwork2modules.R