99re2在线精品,久久久久久久香蕉视频

軟件簡介

Celescope可從二代測序下機的原始fastq數(shù)據(jù)開始處理，經(jīng)過細胞標簽的提取、質(zhì)控與校正，測序數(shù)據(jù)質(zhì)控，參考基因組比對，基因定量，UMI糾錯與計數(shù)后確定細胞數(shù)，最終得到數(shù)據(jù)的質(zhì)控報告和細胞的表達矩陣，用于后續(xù)分析，具有靈活、準確、全面的特點，是非常有力的單細胞轉(zhuǎn)錄組測序數(shù)據(jù)處理軟件。

環(huán)境配置

conda

linux

minimum 32GB RAM(to run STAR aligner)

下載安裝celescope

編寫運行如下代碼進行下載安裝：

git clone https://github.com/zhouyiqi91/CeleScope.git

cd CeleScope

source setup.sh

如果沒有報錯，就說明celescope安裝成功。

下載參考基因組并生成index文件

不管用什么軟件，做什么分析，參考基因組都是必不可少的。

從ensembl官網(wǎng)下載人類基因組的參考序列文件（.fa）和基因組注釋文件（.gtf）：

wgetftp://ftp.ensembl.org/pub/release-99/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz

wgetftp://ftp.ensembl.org/pub/release-99/gtf/homo_sapiens/Homo_sapiens.GRCh38.99.gtf.gz

解壓參考基因組文件到指定文件夾：

mkdir -preferences/Homo_sapiens/Ensembl/GRCh38

gzip -c -d Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz> references/Homo_sapiens/Ensembl/GRCh38/Homo_sapiens.GRCh38.fa

gzip -c -d Homo_sapiens.GRCh38.99.gtf.gz >

references/Homo_sapiens/Ensembl/GRCh38/Homo_sapiens.GRCh38.99.gtf

Note：運行celescope要激活conda環(huán)境。

調(diào)用STAR生成參考基因組的index文件。

conda activate celescope

gtfToGenePred -genePredExt -geneNameAsName2references/Homo_sapiens/Ensembl/GRCh38/Homo_sapiens.GRCh38.99.gtf /dev/stdout |\

???awk '{print$12"\t"$1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6"\t"$7"\t"$8"\t"$9"\t"$10}'> references/Homo_sapiens/Ensembl/GRCh38/Homo_sapiens.GRCh38.99.refFlat

STAR --runMode genomeGenerate\

--runThreadN 6\? ? ?

--genomeDir references/Homo_sapiens/Ensembl/GRCh38 \ ???

--genomeFastaFiles references/Homo_sapiens/Ensembl/GRCh38/Homo_sapiens.GRCh38.fa \ ???

--sjdbGTFfile references/Homo_sapiens/Ensembl/GRCh38/Homo_sapiens.GRCh38.99.gtf \

--sjdbOverhang 100?

STAR結(jié)果中會生成reference文件夾，里面包含人類基因組的index信息，如染色體、外顯子等。

$ ls celescope_test/references/Homo_sapiens/Ensembl/GRCh38/

chrLength.txt? ? ? exonGeTrInfo.tab? genomeParameters.txt? ? ? ? ? ? SA? ? ? ? ? ? ? ? ? ? ? ? sjdbList.out.tab

chrNameLength.txt? exonInfo.tab? ? ? Homo_sapiens.GRCh38.99.gtf? ? ? SAindex? ? ? ? ? ? ? ? ? transcriptInfo.tab

chrName.txt? ? ? ? geneInfo.tab? ? ? Homo_sapiens.GRCh38.99.refFlat? sjdbInfo.txt

chrStart.txt? ? ? Genome? ? ? ? ? ? Homo_sapiens.GRCh38.fa? ? ? ? ? sjdbList.fromGTF.out.tab

小鼠及其他物種的參考基因組下載和index文件生成方法同理。

至此celescope分析的前期準備工作已經(jīng)差不多完成了，下面開始正式分析。

Celescope可以用于Single cell RNA-seq，Single cell VDJ和Single cell Multiplexing。

Single cell RNA-seq（單細胞轉(zhuǎn)錄組分析）

激活conda環(huán)境：

conda activate celescope

編寫如下腳本進行單樣本分析：

celescope rna run\??

?--fq1/SGRNJ/DATA_PROJ/2003/20200710/LC20062911_2_L1_1.fq.gz\

?--fq2/SGRNJ/DATA_PROJ/2003/20200710/LC20062911_2_L1_2.fq.gz\

?--genomeDir /SGR/references/Homo_sapiens/Ensembl/GRCh38\

?--sample BEPM\

?--thread 4\

--chemistry auto

NOTE：運行之前要先下載好fastqc軟件

輸入：

--fq1 雙端測序FASTQ read 1的路徑

--fq2 雙端測序FASTQ read 2的路徑

--genomeDIR 參考基因組的路徑

--sample 樣本名

--thread 分析使用的線程數(shù)。在RNA-seq分析中最好不要超過8個，否則容易報錯

Single cell RNA-seq還支持多樣本運行模式，接口為multi_{assay}。

編寫并運行如下腳本：

multi_rna\

?--mapfile /SGRNJ02/RandD4/test/20200713.mapfile\

?--chemistry scopeV2.1.1\

?--genomeDir/SGRNJ/Public/Database/genome/homo_mus\

?--thread 4\

?--modshell

輸入：

--mapfile：包含三列, 每列之間用tab分割；每一行是一個樣本。

第一列：fastq前綴

第二列：fastq所在文件夾

第三列：{sample}（即生成文件的前綴）

第四列：可選，期望細胞數(shù)（scRNA-Seq）或者match_dir(scVDJ) 注意：當一個樣本有多個fastq，且這些fastq不在同一個文件夾下時，每個fastq占一行，sample名稱相同即可。

mapfile示例：

$ cat /SGRNJ02/RandD4/test/20200713.mapfile

LC20062911????? /SGRNJ/DATA_PROJ/2003/20200710? S062907-3

$ ll/SGRNJ/DATA_PROJ/2003/20200710

total 26181688

-rw-r--r--. 1download ssh.bioinfo 3056870860 Jul 10 13:52 LC20062911_2_L1_1.fq.gz

-rw-r--r--. 1download ssh.bioinfo 3105319350 Jul 10 14:04 LC20062911_2_L1_2.fq.gz

運行后會在當前目錄下生成一個shell文件夾，里面包含一個與sample名相同的shell腳本

$ ls -l

-rw-r-----. 1zhouxin ssh.randd 1504 Dec? 9 14:20S062907-3.sh

在根目錄下運行該腳本即可開始RNA-seq分析：

$ shshell/S062907-3.sh

輸出：

Single cell RNA-seq輸出文件包括每一步的結(jié)果文件和HTML報告。HTML報告可以查看每一步分析結(jié)果的統(tǒng)計，點擊標題旁的？可以查看每個分析指標的具體含義。基因表達矩陣在05.count目錄下。

$ ll S062907-3

total 5028

drwxr-xr-x. 2zhouxin ssh.randd????? 30 Dec? 9 14:53 00.sample

drwxr-xr-x. 2zhouxin ssh.randd???? 128 Dec? 9 15:31 01.barcode

drwxr-xr-x. 2zhouxin ssh.randd????? 89 Dec? 9 15:54 02.cutadapt

drwxr-xr-x. 3zhouxin ssh.randd??? 4096 Dec? 9 17:50 03.STAR

drwxr-xr-x. 2zhouxin ssh.randd???? 186 Dec? 9 17:54 04.featureCounts

drwxr-xr-x. 3zhouxin ssh.randd???? 243 Dec? 9 18:12 05.count

drwxr-xr-x. 2zhouxin ssh.randd????? 99 Dec? 9 18:17 06.analysis

-rw-r--r--. 1zhouxin ssh.randd 5141357 Dec? 9 18:17S062907-3_report.html

$ ll 05.count/

total 367216

-rw-r--r--. 1 zhouxin ssh.randd???? 20154 Dec?9 18:02 barcode_filter_magnitude.pdf

-rw-r--r--. 1 zhouxin ssh.randd 355985586Dec? 9 17:57 S062907-3_count_detail.txt

-rw-r--r--. 1 zhouxin ssh.randd? 17730119 Dec?9 18:02 S062907-3_counts.txt

-rw-r--r--. 1 zhouxin ssh.randd?????? 229 Dec?9 18:12 S062907-3_downsample.txt

drwxr-xr-x. 2 zhouxin ssh.randd??????? 77 Dec?9 18:03 S062907-3_matrix_10X

-rw-r--r--. 1 zhouxin ssh.randd? ?2279564 Dec?9 18:05 S062907-3_matrix.tsv.gz

-rw-r--r--. 1 zhouxin ssh.randd?????? 176 Dec?9 18:12 stat.txt

NOTE：

S062907-3_matrix.tsv.gz：tab分隔的表達矩陣，行名是HGNC gene symbol, 列名是cell barcode。

S062907-3_matrix_10X：10X格式的表達矩陣，可以用Seurat的Read10X函數(shù)讀入。

Single cell VDJ（單細胞免疫組庫分析）

激活conda環(huán)境：

conda activatecelescope

編寫并運行如下腳本：

celescope vdjrun??

--fq1/SGRNJ/DATA_PROJ/2003/20200710/LC20062911_2_L1_1.fq.gz\

?--fq2/SGRNJ/DATA_PROJ/2003/20200710/LC20062911_2_L1_2.fq.gz\

?--sample S062907-3\

?--thread 8\

?--type TCR\

?--chemistry auto

輸入：

--fq1 FASTQ read1文件

--fq2 FASTQ read2文件

-- sample 樣本名

-- thread 線程數(shù)

--type T細胞或B細胞受體

--match_dir 與RNA-seq目錄匹配

輸出：

$ ll S062907-3/

total 2888

drwxr-xr-x. 2zhouxin ssh.randd????? 30 Dec? 9 15:08 00.sample

drwxr-xr-x. 2zhouxin ssh.randd???? 128 Dec? 9 16:46 01.barcode

drwxr-xr-x. 2zhouxin ssh.randd????? 89 Dec? 9 17:05 02.cutadapt

drwxr-xr-x. 2zhouxin ssh.randd???? 235 Dec? 9 18:41 03.mapping_vdj

drwxr-xr-x. 2zhouxin ssh.randd???? 194 Dec? 9 18:41 04.count_vdj

-rw-r--r--. 1zhouxin ssh.randd 2956241 Dec? 9 18:41S062907-3_report.html

04.count_vdj目錄包含如下文件：

S062907-3_cell_confident.tsv：VDJ cell barcode的克隆型，每一條鏈（TRA和TRB）分別占一行。

S062907-3_cell_confident_count.tsv：VDJ cell barcode的克隆型，每個細胞占一行。

S062907-3_clonetypes.tsv：VDJ cell barcode每種克隆型的計數(shù)和百分比。

S062907-3_match_clonetypes.tsv：VDJ cell barcode與sc-RNA-Seq cell

barcode交集的每種克隆型的計數(shù)和百分比。當提供了match_dir參數(shù)時，才會生產(chǎn)該文件。

Single cell Multiplexing（拆分）

激活conda環(huán)境：

conda activatecelescope

編寫并運行如下代碼：

celescope smkrun\??

?--fq1 {smk fq1.gz}\

?--fq2 {smk fq2.gz}\

?--sample {sample name}\

?--SMK_pattern L25C45\

?--SMK_barcode {SMK barcode fasta}\

?--SMK_linker {SMK linker fasta}\

?--match_dir {match_dir}\

?--dim 2\

?--combine_cluster {combine_cluster.tsv}

輸入：

--SMK_pattern 必需參數(shù)。L25C45 指25 bp linker + 45 bpcell barcode

C: cell barcode

U: UMI

T: polyT

L: linker

--SMK_barcode 必需參數(shù)，標簽的fa文件

--SMK_linker 必需參數(shù)，linker的fa文件

--match_dir 必需參數(shù)，運行完celescope后與scRNA-seq目錄進行匹配

--dim 必需參數(shù)，規(guī)定緯度

--combine_cluster 可選參數(shù)，整合cluster文件

第一列：原始cluster數(shù)

第二列：整合后的cluster數(shù)

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

celescope使用說明

celescope使用說明

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

celescope使用說明

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av