Introduction:
- https://www.nature.com/articles/s41592-020-0748-5
- https://www.biorxiv.org/content/10.1101/519660v1.full.pdf
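In short: if the groupings produced by your current clustering software are unsatisfying, this tool clusters the cells and visualizes the groups as a branching tree.
Detailed tutorials
- 生信技能樹 (Bioinformatics Skill Tree): https://cloud.tencent.com/developer/article/1605955
- Original GitHub repository: https://github.com/GregorySchwartz/too-many-cells
Installation
To avoid installing the many dependencies, everything below is run through Docker.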
docker pull gregoryschwartz/too-many-cells:0.2.2.0
Start the Docker container
docker run -it --rm -v "/home/luohb:/share/nas1/Data/Users/luohb/Personalization/20191206/TooManyCells" gregoryschwartz/too-many-cells:0.2.2.0 -h
too-many-cells, Gregory W. Schwartz. Clusters and analyzes single cell data.
Usage: too-many-cells (make-tree | interactive | differential | diversity |
paths)
Available options:
-h,--help Show this help text
Available commands:
make-tree
interactive
differential
diversity
paths
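Building the input files
The input can be either a folder (containing the 3 files produced by 10X cellranger) or an ordinary expression matrix in CSV format.
1. Matrix:
Note: in a count matrix file, the first field of the header row is left empty, so the header effectively starts with a comma (the `""` in the example below). The row and column labels do not need double quotes.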
"","A22.D042044.3_9_M.1.1","C5.D042044.3_9_M.1.1","D10.D042044.3_9_M.1.1","E13.D042044.3_9_M.1.1","F19.D042044.3_9_M.1.1","H2.D042044.3_9_M.1.1","I9.D042044.3_9_M.1.1",...
"0610005C13Rik",0,0,0,0,0,0,0,...
"0610007C21Rik",0,112,185,54,0,96,42,...
"0610007L01Rik",0,0,0,0,0,153,170,...
"0610007N19Rik",0,0,0,0,0,0,0,...
"0610007P08Rik",0,0,0,0,0,19,0,...
"0610007P14Rik",0,58,0,0,255,60,0,...
"0610007P22Rik",0,0,0,0,0,65,0,...
"0610008F07Rik",0,0,0,0,0,0,0,...
"0610009B14Rik",0,0,0,0,0,0,0,...
...
2. Labels file
item,label
AAACCTGCAGTAACGG-1,Marrow
AAACGGGAGACCGGAT-1,Marrow
AAACGGGAGCGCTCCA-1,Marrow
AAACGGGAGGACGAAA-1,Marrow
AAACGGGAGGTACTCT-1,Marrow
...
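The labels file can hold each cell's sample of origin or manually assigned group labels. It is used only to color the final figure and does not affect the branching structure of the tree.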
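A labels file in the `item,label` layout above can be generated from the header row of the count matrix. The following is a minimal sketch, assuming a CSV matrix like the one shown earlier, where the first header field is the empty corner cell. The function name `make_labels` and the single shared label are illustrative assumptions, not part of too-many-cells.

```shell
# Hypothetical helper: print an item,label table for every cell in the
# header row of a CSV count matrix, giving all cells the same label.
make_labels() {
    # $1 = matrix CSV, $2 = label to assign to every cell
    echo "item,label"
    head -n 1 "$1" |
        tr -d '"\r' |      # drop quotes and any carriage return
        tr ',' '\n' |      # one header field per line
        tail -n +2 |       # skip the first (corner) field
        awk -v label="$2" 'NF { print $0 "," label }'
}
```

For example, `make_labels count.csv Marrow > labels.csv` would label every cell as Marrow. In practice each sample would get its own label and the per-sample tables would be concatenated under a single header line.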
Running
docker run -it --rm -v /share/nas1/Data/Users/luohb/Personalization/TooManyCells/test:/test \
gregoryschwartz/too-many-cells:0.2.2.0 make-tree \
--matrix-path /test/count.csv \
--labels-file /test/OrigIdent.labels.csv \
--draw-collection "PieRing" \
--output /test/LabelsBySamples > log
The result looks something like this:

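"Pruning" branches
With default parameters the branches are too fine-grained. There are two ways to adjust this:
- Set --min-size directly: it specifies the minimum number of cells in a branch. For example, --min-size 100 sets the minimum leaf size to 100 cells.
- Set --smart-cutoff, which changes the number of leaves on the tree using n × the median absolute deviation (MAD). It can be combined with --min-size, --max-proportion, --min-distance, or --min-distance-search.
Also, the whole tree does not need to be recomputed. The --prior argument supplies the results of a previous run. (When using --prior, --matrix-path can also be dropped to speed things up, although some features may then be unavailable.)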
# "PieChart" draws the leaves as pie charts
docker run -it --rm -v /share/nas1/Data/Users/luohb/Personalization/TooManyCells/test:/test \
gregoryschwartz/too-many-cells:0.2.2.0 make-tree \
--prior /test/LabelsBySamples --labels-file /test/OrigIdent.labels.csv \
--smart-cutoff 1 --min-size 1 \
--draw-collection "PieChart" \
--output /test/pruned_LabelsBySamples > log1_2
The final result looks something like this:

Extracting subsets
cp log3_2 clusters_pruned.csv
vi clusters_pruned.csv
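# In vim (type ^M as Ctrl-V followed by Ctrl-M)
:%s/^M$//g
The per-node results come out of Docker with stray carriage returns (shown as ^M), so the file has to be cleaned by hand into the following form: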
$ head clusters_pruned.csv
cell,cluster,path
AAACGGGAGGTGTTAA.1,9,9/8/7/6/5/4/3/2/1/0
AACACGTTCGGCGGTT.1,9,9/8/7/6/5/4/3/2/1/0
AACCGCGGTATATGAG.1,9,9/8/7/6/5/4/3/2/1/0
ACACCCTTCTGGTTCC.1,9,9/8/7/6/5/4/3/2/1/0
ACCTTTAAGGTGTTAA.1,9,9/8/7/6/5/4/3/2/1/0
ACGAGGACACGTTGGC.1,9,9/8/7/6/5/4/3/2/1/0
AGGGAGTCAGGCTCAC.1,9,9/8/7/6/5/4/3/2/1/0
AGGGATGAGCGATAGC.1,9,9/8/7/6/5/4/3/2/1/0
AGTGGGAAGATGTAAC.1,9,9/8/7/6/5/4/3/2/1/0
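The same cleanup can be done without opening vim. This is a minimal sketch, assuming the only problem in the log is the carriage returns; `strip_cr` is a hypothetical helper name. Note that it removes every carriage return in the file, not only those at line ends.

```shell
# Hypothetical helper: copy a log to a new file with all carriage
# returns removed.
strip_cr() {
    # $1 = input log, $2 = cleaned output file
    tr -d '\r' < "$1" > "$2"
}
```

With the file names used above, this would be called as `strip_cr log3_2 clusters_pruned.csv`.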
Annotating node numbers
# --draw-node-number adds the node numbers to the figure
docker run -it --rm -v /share/nas1/Data/Users/luohb/Personalization/TooManyCells/test:/test \
gregoryschwartz/too-many-cells:0.2.2.0 make-tree \
--prior /test/LabelsBySplitGroup \
--labels-file /test/SplitGroup.labels.csv --smart-cutoff 1 --min-size 1 --draw-collection "PieChart" \
--draw-node-number \
--output /test/number_pruned_LabelsBySplitGroup > log7

Cell subsets can then be extracted using the barcodes that belong to each node.
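One way to pull out the barcodes under a given node is to filter the cleaned `cell,cluster,path` table. This is a minimal sketch, assuming the `path` column lists every node from the leaf up to the root separated by `/`, as the sample output suggests; `cells_in_node` is a hypothetical helper name.

```shell
# Hypothetical helper: print the barcodes of all cells whose path
# passes through the given node number.
cells_in_node() {
    # $1 = cleaned clusters CSV (cell,cluster,path), $2 = node number
    awk -F, -v node="$2" '
        NR > 1 {                          # skip the header line
            n = split($3, parts, "/")     # path components, leaf to root
            for (i = 1; i <= n; i++)
                if (parts[i] == node) { print $1; break }
        }' "$1"
}
```

For example, `cells_in_node clusters_pruned.csv 8 > node8_barcodes.txt` would collect every cell in the subtree below node 8. Matching whole path components avoids confusing node 1 with node 12.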
Gene expression
docker run -it --rm -v /share/nas1/Data/Users/luohb/Personalization/TooManyCells/test:/test \
gregoryschwartz/too-many-cells:0.2.2.0 make-tree \
--prior /test/LabelsBySplitGroup \
--matrix-path /test/count.csv \
--labels-file /test/SplitGroup.labels.csv \
--smart-cutoff 1 \
--min-size 1 \
--draw-leaf "DrawItem (DrawThresholdContinuous [(\"gene1\", 0), (\"gene2\", 0)])" \
--draw-colors "[\"#e41a1c\", \"#377eb8\", \"#4daf4a\", \"#eaeaea\"]" \
--draw-scale-saturation 10 \
--output /test/out_gene_expression \
> clusters_pruned_gene_expression.csv
The result looks something like this:

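Differential expression analysis
Differential expression analysis is performed according to the supplied labels.
Differential analysis between two nodes
$ docker run -it --rm -v /share/nas1/Data/Users/luohb/TooManyCells/test:/test \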
gregoryschwartz/too-many-cells:0.2.2.0 differential \
--prior /test/LabelsBySplitGroup \
--matrix-path /test/count.csv \
--labels-file /test/SplitGroup.labels.csv \
-n "([70, 3, 105, 166], [45])" \
> clusters_pruned_gene_expression.csv
Finding marker genes for all nodes
$ cat run12.sh
# -t 5: limits the node level
# +RTS -N26 is placed last so the runtime does not consume the other options
docker run -it --rm -v /share/nas1/Data/Users/luohb/TooManyCells/test:/test \
gregoryschwartz/too-many-cells:0.2.2.0 differential \
--prior /test/LabelsBySplitGroup \
--matrix-path /test/count.csv \
-n "([], [])" \
--normalization "UQNorm" \
--plot-output /test/plot.pdf \
-t 5 \
+RTS -N26
$ sh run12.sh > FindAllMarker.txt