- highly efficient assembler of RNA-Seq alignments into potential transcripts
1. Basic usage
stringtie [-o <output.gtf>] [other_options] <read_alignments.bam>
Input: SAM, BAM or CRAM file with RNA-Seq read alignments sorted by their genomic location
Output: GTF
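As a minimal sketch, the synopsis above can be assembled programmatically. The helper name build_stringtie_cmd and all file names are hypothetical; only flags listed in these notes (-o, -p, -G) are used, and the command is built but not executed.

```python
import shlex


def build_stringtie_cmd(bam, out_gtf, ref_gtf=None, threads=1):
    """Build (but do not run) a StringTie assembly command as an argv list."""
    cmd = ["stringtie", "-o", out_gtf, "-p", str(threads)]
    if ref_gtf is not None:
        # Optional reference annotation passed with -G.
        cmd += ["-G", ref_gtf]
    # The coordinate-sorted alignment file is the positional argument.
    cmd.append(bam)
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_stringtie_cmd("sample1.sorted.bam", "sample1.gtf",
                          ref_gtf="ref.gtf", threads=4)
print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) once StringTie is installed and the input files exist.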
2. Main options:
-h/--help: Prints help message and exits
--version: Prints version and exits
-L: long reads processing mode
--mix: mixed reads processing mode
-e: expression estimation mode; only estimates the expression of the reference transcripts given with -G
-v: verbose mode
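-o: name of the output GTF file
-p: number of processing threads
-G: reference annotation (reference transcripts) used to guide the assembly
--rf: assumes a stranded library of type fr-firststrand
--fr: assumes a stranded library of type fr-secondstrand
--ptf: point-feature file; point features include transcription start sites (TSS) and polyadenylation sites
-l: prefix for the names of output transcripts, default STRG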
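-f: minimum isoform abundance, as a fraction of the most abundant transcript at the locus; the lower the fraction, the more likely the isoform is an artifact
-m: minimum length of predicted transcripts
-A: output file for gene abundances (tab-delimited)
-C: outputs a GTF file with the reference transcripts (given with -G) that are fully covered by reads
-a: spliced reads must align with at least this many bases on both sides of a junction; junctions without such reads are filtered out
-j: minimum number of spliced reads that must align across a junction
-t: disables trimming at the ends of assembled transcripts
-c: minimum read coverage for predicted transcripts, default 1
-s: minimum read coverage for single-exon transcripts, default 4.75
--conservative: conservative mode, equivalent to -t -c 1.5 -f 0.05
-g: minimum locus gap separation; reads mapped closer than this distance are merged into the same processing bundle, default 50 bp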
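-B: enables output of Ballgown input table files (*.ctab) containing coverage data for the reference transcripts given with the -G option
-b: like -B, but takes the path where the ctab files are written; best used together with -e, otherwise predicted transcripts are produced as well
-M: maximum fraction of multi-mapped reads allowed at a given locus
-x: ignores reads aligned to the specified reference sequences; takes a reference sequence name or a comma-separated list of names
-u: disables multi-mapping correction; by default, a read aligned to n locations contributes 1/n to each of them
--ref/--cram-ref: reference genome FASTA file for CRAM input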
--merge: transcript merge mode, distinct from the assembly mode above. It takes GTF/GFF files as input and produces a uniform set of transcripts for all samples. The output is a merged GTF file with all merged gene models, but without any numeric results on coverage, FPKM, and TPM. With this merged GTF, StringTie can then re-estimate abundances by running again with the -e option on the original set of alignment files. A reference annotation can be supplied with -G.
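The assemble, merge, re-estimate sequence described for --merge can be sketched as a list of command lines. This is an illustration under assumptions: the function name merge_workflow, the sample names, and the output layout (one directory per sample under ballgown/) are hypothetical, and the commands are only generated, not run.

```python
def merge_workflow(samples, ref_gtf, merged_gtf="merged.gtf"):
    """Return argv lists for: per-sample assembly, merge, re-estimation.

    samples maps a (hypothetical) sample name to its sorted BAM file.
    """
    cmds = []
    # Step 1: assemble each sample separately, guided by the reference.
    for name, bam in samples.items():
        cmds.append(["stringtie", "-G", ref_gtf, "-o", f"{name}.gtf", bam])
    # Step 2: merge the per-sample GTFs into one uniform transcript set.
    per_sample_gtfs = [f"{name}.gtf" for name in samples]
    cmds.append(["stringtie", "--merge", "-G", ref_gtf, "-o", merged_gtf]
                + per_sample_gtfs)
    # Step 3: re-estimate abundances against the merged GTF with -e;
    # -B additionally writes the Ballgown *.ctab tables next to each GTF.
    for name, bam in samples.items():
        out_gtf = f"ballgown/{name}/{name}.gtf"
        cmds.append(["stringtie", "-e", "-B", "-G", merged_gtf,
                     "-o", out_gtf, bam])
    return cmds


for argv in merge_workflow({"s1": "s1.bam", "s2": "s2.bam"}, "ref.gtf"):
    print(" ".join(argv))
```

Writing each sample's re-estimated GTF into its own directory keeps the ctab files of different samples from overwriting each other.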
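3. Input
3.1. The input must be sorted by coordinate. TopHat output is already sorted; output from other aligners is not and can be sorted with samtools sort
3.2. Spliced alignments need a tag indicating the transcription strand (XS). TopHat and HISAT2 add it automatically; STAR needs --outSAMstrandField intronMotif. Long reads aligned with minimap2 need -ax splice
3.3. The main input-related options are -L, --mix, -G and -e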
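The sorting requirement in 3.1 can be checked from the SAM header before running StringTie. The sketch below assumes the header text has already been obtained (for example with samtools view -H) and only inspects the SO field of the @HD line; it does not verify the order of the records themselves. The function name is hypothetical.

```python
def is_coordinate_sorted(sam_header):
    """Return True if the @HD header line declares SO:coordinate."""
    for line in sam_header.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            fields = line.split("\t")[1:]
            tags = dict(f.split(":", 1) for f in fields if ":" in f)
            return tags.get("SO") == "coordinate"
    # No @HD line: the sort order is not declared.
    return False


# Hypothetical header snippets for illustration.
sorted_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000"
unsorted_header = "@HD\tVN:1.6\tSO:unsorted\n@SQ\tSN:chr1\tLN:1000"
print(is_coordinate_sorted(sorted_header))
print(is_coordinate_sorted(unsorted_header))
```

If the check fails, the file can be sorted with samtools sort before it is passed to StringTie.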
4. Output
4.1. GTF file: the main output, containing the assembled transcripts in 9 columns
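A minimal sketch of reading one line of this 9-column output. The column names follow the standard GTF layout; the example line and its values are made up for illustration, and the function name is hypothetical.

```python
import re

GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attributes"]


def parse_gtf_line(line):
    """Split one GTF line into its 9 columns and parse the attribute column."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Attributes look like: key "value"; key "value"; ...
    record["attributes"] = dict(
        re.findall(r'(\S+) "([^"]*)";', record["attributes"]))
    return record


# Made-up transcript line; the values are not real results.
example = ('chr1\tStringTie\ttranscript\t1000\t5000\t1000\t+\t.\t'
           'gene_id "STRG.1"; transcript_id "STRG.1.1"; '
           'cov "12.5"; FPKM "3.2"; TPM "4.1";')
rec = parse_gtf_line(example)
print(rec["feature"], rec["start"], rec["end"], rec["attributes"]["TPM"])
```

Attribute values stay as strings here; convert cov, FPKM and TPM with float() when numeric values are needed.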

4.2. Gene abundance file: written with the -A option

4.3. Fully covered transcripts: written with the -C option (requires -G)
A GTF file with all the transcripts in the reference annotation that are fully covered, end to end, by reads
4.4. Ballgown Input Table Files
4.5. Merged GTF