1. Download (releases): https://github.com/ZhengXia/dapars/releases
2. GitHub home page: https://github.com/ZhengXia/dapars
3. DaPars Google Groups forum: https://groups.google.com/forum/#!forum/DaPars
Usage:
In a Python 2.7 environment:
refbed=/media/pc/disk1/sun/refdata/ensembl_GRCm38/mm10.gencode-vm18.compre.fine.bed
genesymbol=/media/pc/disk1/sun/refdata/ensembl_GRCm38/Dapars_gene.symbol
python /home/pc/biosoft/dapars/src/DaPars_Extract_Anno.py -b $refbed -s $genesymbol -o Dapars_extracted_3UTR.bed
I hit a problem here that stumped me for a long time. It turned out to be the gene symbol file: the delimiter was wrong!
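A quick way to catch this kind of delimiter problem before running DaPars_Extract_Anno.py is to scan the gene symbol file first. The sketch below assumes the file should have exactly two tab-separated columns (transcript ID, gene symbol). That layout is an assumption, since these notes do not record which delimiter was wrong, and the function name is made up for illustration.

```python
def find_bad_delimiter_lines(lines, expected_columns=2):
    """Return (line_number, text) pairs for lines that do not split into
    the expected number of non-empty tab-separated columns.

    The two-column, tab-delimited layout is an assumption about the
    gene symbol file, not something confirmed by these notes.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not text:
            continue  # ignore blank lines
        fields = text.split("\t")
        if len(fields) != expected_columns or any(f == "" for f in fields):
            bad.append((number, text))
    return bad
```

Usage would be along the lines of `find_bad_delimiter_lines(open(genesymbol_path))`, where `genesymbol_path` is a placeholder. An empty list means every line splits cleanly on tabs.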



Method 1 for finding APA sites:
'/home/pc/biosoft/APAtrap/predictAPA' -i OHT1.sorted.bedgraph OHT2.sorted.bedgraph DMSO1.sorted.bedgraph DMSO2.sorted.bedgraph -g 2 -n 2 2 -u Dapars_extracted_3UTR.bed -o APAtrap-UTR_Daprars_APA.txt
This simply continues from step 2 of APAtrap (predictAPA), using the 3' UTR annotation extracted by DaPars.
Method 2:
python /home/pc/biosoft/dapars/src/DaPars_main.py configure_file
# configure_file:
Annotated_3UTR=Dapars_extracted_3UTR.bed
Group1_Tophat_aligned_Wig=OHT1.sorted.bedgraph,OHT2.sorted.bedgraph
Group2_Tophat_aligned_Wig=DMSO1.sorted.bedgraph,DMSO2.sorted.bedgraph
Output_directory=DaPars_APAresult/
Output_result_file=DaPars_APAresult
Num_least_in_group1=1
Num_least_in_group2=1
Coverage_cutoff=30
FDR_cutoff=0.05
PDUI_cutoff=0.5
Fold_change_cutoff=0.50
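Since a typo in the configure file only shows up once DaPars is already running, it can help to check the file beforehand. The following is a minimal sketch that reads the key=value lines shown above and lists input files that do not exist. The function names are hypothetical helpers and are not part of DaPars.

```python
import os


def parse_dapars_config(text):
    """Parse key=value lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("not a key=value line: %r" % line)
        settings[key.strip()] = value.strip()
    return settings


def missing_input_files(settings):
    """Return the annotation/coverage paths from the config that do not exist."""
    paths = [settings.get("Annotated_3UTR", "")]
    for key in ("Group1_Tophat_aligned_Wig", "Group2_Tophat_aligned_Wig"):
        # Each group is a comma-separated list of coverage files.
        paths.extend(settings.get(key, "").split(","))
    return [p for p in paths if p and not os.path.exists(p)]
```

For example, `missing_input_files(parse_dapars_config(open("configure_file").read()))` should return an empty list when every listed file is in place.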
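This failed with an error:

After a lot of fiddling, I have only myself to blame for not reading the error message properly: it is clearly related to rpy2!
Install it under Python 2.7:
pip install rpy2 fails with a message saying that rpy2 no longer supports Python versions below 3.
Installing rpy2 under Python 3.6 works, but then the DaPars script itself cannot run.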

ref: the official rpy2 documentation: rpy2.readthedocs
What finally worked: pip install rpy2==1.8.4
Run under Python 2.7:
python DaPars_main.py configure_file
The bedGraph files need to be converted to bigWig (bw) files:
Download the tool from UCSC:
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/bedGraphToBigWig
Usage:
bedGraphToBigWig in.bedGraph chrom.sizes out.bw
Download the chromosome sizes:
wget http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/mm10.chrom.sizes
Error:
chrMT is not found in chromosome sizes file
Because of chrMT, build the chrom sizes from the index (.fai) of the Ensembl FASTA instead:
awk 'BEGIN{OFS="\t"}{print "chr"$1,$2}' Mus_musculus.GRCm38.dna_sm.toplevel.fa.fai > Mus_musculus.GRCm38.dna_sm.toplevel.fa.chr.size
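For reference, the same transformation as the awk one-liner can be written in Python: take the first two columns of the .fai index (sequence name and length) and prepend "chr" to the name. This is only an equivalent sketch, and the function name is made up for illustration.

```python
def fai_to_chrom_sizes(fai_lines, prefix="chr"):
    """Turn samtools .fai lines into 'name<TAB>size' rows with a name prefix.

    Only the first two columns of the index (sequence name, length) are
    used; the remaining offset columns are ignored.
    """
    rows = []
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip malformed or empty lines
        rows.append("%s%s\t%s" % (prefix, fields[0], fields[1]))
    return rows
```

Writing `"\n".join(fai_to_chrom_sizes(open(fai_path)))` to a file gives the same content as the awk output, where `fai_path` is a placeholder for the .fai file.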
Run bedGraph to bigWig:
bedGraphToBigWig OHT1.sorted.bedgraph Mus_musculus.GRCm38.dna_sm.toplevel.fa.chr.size OHT1.sorted.wig
Error: the bedGraph is not sorted
sort -k1,1 -k2,2n unsorted.bedGraph > sorted.bedGraph
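To confirm that a file is sorted before handing it to bedGraphToBigWig, a small check can mirror the sort key above (chromosome name, then numeric start). This sketch compares chromosome names byte-wise, which corresponds to sort under the C locale. That locale is an assumption, and the function name is hypothetical.

```python
def first_unsorted_line(lines):
    """Return the 1-based number of the first line that is out of order
    with respect to (chromosome, start), or None if the input is sorted.

    Chromosome names are compared as plain strings (byte-wise order),
    which is an assumption about how the file was sorted.
    """
    previous = None
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) < 3 or line.startswith(("#", "track", "browser")):
            continue  # skip headers, comments and short lines
        key = (fields[0], int(fields[1]))
        if previous is not None and key < previous:
            return number
        previous = key
    return None
```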
Run bedGraph to bigWig again:
bedGraphToBigWig OHT1.sorted.fine.bedgraph Mus_musculus.GRCm38.dna_sm.toplevel.fa.chr.size OHT1.wig
bigwig轉(zhuǎn)為wig:
下載ucsc:
wget ftp://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/bigWigToWig
Usage: bigWigToWig in.bigWig out.wig
bigWigToWig OHT1.bigwig OHT1.wig
OK!



Using the wig files as input still gives an error:

Remove the lines starting with #
Remove chrGL and similar contigs
sed -i '/^#/d' xxx.wig
sed -i '/^chrGL/d' xxx.wig
sed -i '/^chrJH/d' xxx.wig
sed -i '/^chrMT/d' xxx.wig
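The four sed calls can also be combined into a single pass. The sketch below drops every line that starts with one of the same prefixes. Unlike `sed -i` it returns a new list rather than editing the file in place, and the names are made up for illustration.

```python
DROP_PREFIXES = ("#", "chrGL", "chrJH", "chrMT")


def filter_wig_lines(lines, drop_prefixes=DROP_PREFIXES):
    """Keep only the lines that do not start with any of the given prefixes."""
    return [line for line in lines if not line.startswith(drop_prefixes)]
```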
wig files generated by another method: not tried yet!

Switched to bedGraph files generated by another method and used them directly as input:

genomeCoverageBed -bga -ibam m6a/DMSO1.sorted.bam > DMSO1.bga.bedgraph
genomeCoverageBed -bga -ibam m6a/DMSO2.sorted.bam > DMSO2.bga.bedgraph
genomeCoverageBed -bga -ibam m6a/OHT1.sorted.bam > OHT1.bga.bedgraph
genomeCoverageBed -bga -ibam m6a/OHT2.sorted.bam > OHT2.bga.bedgraph

OK!
The main difference is whether the bedGraph is contiguous, with no gaps between intervals; see: bamtobed/bedgraph
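Whether a given bedGraph is contiguous in this sense can be checked directly. The sketch below reports every place where an interval starts after the previous interval on the same chromosome ended. It assumes the file is already sorted by chromosome and start, and the function name is hypothetical.

```python
def find_gaps(lines):
    """Return (chrom, gap_start, gap_end) for uncovered stretches between
    consecutive intervals on the same chromosome.

    Assumes the input is sorted by chromosome and start position.
    """
    gaps = []
    last_chrom, last_end = None, None
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or line.startswith(("#", "track", "browser")):
            continue  # skip headers, comments and short lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom == last_chrom and start > last_end:
            gaps.append((chrom, last_end, start))
        last_chrom, last_end = chrom, end
    return gaps
```

An empty result means there are no holes between consecutive intervals on any chromosome.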

In Python, the value can be read in with float

最最終結(jié)論:修改main的py腳本:第507行改為:int(float(fields[-1])) 即可。

It now runs normally:
python raw_DaPars_main.py configure_file.txt
=======================================================
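So, for a fresh DaPars installation:
1. Under Python 2.7, install an old rpy2 release with pip install rpy2==1.8.4
2. When converting BAM to bedGraph with bedtools, use the -bga (and -split) mode
3. Edit line 507 of the script to add float, matching the floating-point output of bedtools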