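Building a reference for the single-cell software cellranger requires a GTF file, but some genomes are only distributed with a GFF file, so the annotation has to be converted first.
1. GTF format versions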
GTF3 (9 feature types accepted): gene, transcript, exon, CDS, Selenocysteine, start_codon, stop_codon, three_prime_utr and five_prime_utr
GTF2.5 (8 feature types accepted): gene, transcript, exon, CDS, UTR, start_codon, stop_codon, Selenocysteine
GTF2.2 (9 feature types accepted): CDS, start_codon, stop_codon, 5UTR, 3UTR, inter, inter_CNS, intron_CNS and exon
GTF2.1 (6 feature types accepted): CDS, start_codon, stop_codon, exon, 5UTR, 3UTR
GTF2 (4 feature types accepted): CDS, start_codon, stop_codon, exon
GTF1 (5 feature types accepted): CDS, start_codon, stop_codon, exon, intron
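To see which of these versions a given file might conform to, one quick check is to tally the feature types in column 3 and compare them with the lists above. The following is a minimal sketch using awk; `gtf_feature_counts` is a hypothetical helper name and not part of any tool mentioned in this post.

```shell
# Hypothetical helper: count each feature type (column 3) in a GTF/GFF file.
# Comment lines (starting with '#') and lines with fewer than 9
# tab-separated columns are skipped.
gtf_feature_counts() {
    awk -F'\t' '
        !/^#/ && NF >= 9 { n[$3]++ }
        END { for (f in n) print f "\t" n[f] }
    ' "$1" | LC_ALL=C sort
}
```

Calling `gtf_feature_counts test.gtf` prints one line per feature type with its count. A file that contains gene and transcript records, for example, cannot be a plain GTF2 file.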
2. Format conversion
1) gffread
The version used here is gffread v0.12.7. The -T flag makes gffread write GTF instead of GFF3:
gffread test.gff3 -T -o test.gtf
The GTF produced this way is version 2.2, and the result looks roughly like this:

2) agat_convert_sp_gff2gtf.pl from AGAT
# Download: https://github.com/NBISweden/AGAT
# Here I pulled the container image and used it directly
singularity pull docker://quay.io/biocontainers/agat:1.0.0--pl5321hdfd78af_0
singularity exec /path/to/Software/agat_1.0.0--pl5321hdfd78af_0.sif agat_convert_sp_gff2gtf.pl --gff test.gff3 -o test.gtf

By default the output is in GTF3 format; the result looks like this:

3) Other tools (not tried)
GenomeTools
http://genometools.org/tools/gt_gff3_to_gtf.html
ea-utils
https://github.com/ExpressionAnalysis/ea-utils/blob/master/clipper/gff2gtf
PASA
https://github.com/PASApipeline/PASApipeline/blob/master/misc_utilities/gff3_to_gtf_format.pl
Kent utils
http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/
gff3ToGenePred followed by genePredToGtf
GFFtools-GX
https://github.com/vipints/GFFtools-GX/blob/master/gff_to_gtf.py
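Whichever converter is used, it may be worth checking the resulting GTF before handing it to cellranger. As far as I understand, cellranger builds its reference from exon records that carry both gene_id and transcript_id attributes. Treat that as an assumption and confirm it against the 10x documentation. The sketch below flags exon lines missing either attribute; `gtf_check_exon_attrs` is a hypothetical helper name, not part of any of the tools above.

```shell
# Hypothetical helper: print exon records that lack a gene_id or a
# transcript_id attribute in column 9. Returns non-zero if any such
# record is found, zero otherwise.
gtf_check_exon_attrs() {
    awk -F'\t' '
        !/^#/ && $3 == "exon" {
            if ($9 !~ /gene_id "[^"]+"/ || $9 !~ /transcript_id "[^"]+"/) {
                print "line " NR ": " $0
                bad++
            }
        }
        END { exit (bad > 0) }
    ' "$1"
}
```

For example, `gtf_check_exon_attrs test.gtf && echo "exon attributes look fine"` prints the message only when no offending line was found. The check says nothing about whether the file contains any exon records at all, so it complements a feature-type tally and does not replace it.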