ChAR-seq quality control

genome:

ChAR-seq1_R1.fq

ChAR-seq1_R2.fq

1. Trim the paired-end reads with trimmomatic -> fq

trimmomatic PE -threads 20 -phred33 ChAR-seq1_R1.fq ChAR-seq1_R2.fq sample.R1.clean.fq sample.R1.unpaired.fq sample.R2.clean.fq sample.R2.unpaired.fq ILLUMINACLIP:/home/pc/miniconda3/share/trimmomatic-0.38-0/adapters/TruSeq3-PE.fa:2:30:10:8:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 AVGQUAL:20 MINLEN:50

[Reverse-complement R2 so that it reads in the R1 orientation]

seqkit seq sample.R2.clean.fq -r -p > sample.R2.fine.fq
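To make the effect of -r -p concrete, here is a minimal sketch of a reverse complement in plain shell on a toy sequence. The revcomp helper is hypothetical and only for illustration: it handles a bare base string, not a FASTQ record with its quality line, so it is not a replacement for seqkit.

```shell
# Hypothetical helper: reverse-complement one DNA string (bases only).
# rev reverses the characters, tr swaps each base for its complement.
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

revcomp AACGTN   # prints NACGTT (N has no complement and is left as-is)
```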

2. Split each read end separately -> DNA and RNA fq

gzip sample.R1.clean.fq &&

gzip sample.R2.fine.fq &&

python /home/pc/biosoft/flypipe-master/char_bridge_trackall.py --FASTQGZ sample.R1.clean.fq.gz --NAME sample.R1 --minRNA 18 --minDNA 18 &&

python /home/pc/biosoft/flypipe-master/char_bridge_trackall.py --FASTQGZ sample.R2.fine.fq.gz --NAME sample.R2 --minRNA 18 --minDNA 18

3. Map DNA and RNA separately -> bam

gndx=/media/pc/disk1/sun/refdata/gencode_GRCh37/03_bowtie2_index/GRCh37

bowtie2 -p 20 -D 20 -R 3 -N 1 -L 16 -i S,1,0.50 -x $gndx -U sample.R1.dna.bridgePE.fastq.gz | samtools view -bS -o sample.R1.dna.bam &&

bowtie2 -p 20 -D 20 -R 3 -N 1 -L 16 -i S,1,0.50 -x $gndx -U sample.R1.rna.bridgePE.fastq.gz | samtools view -bS -o sample.R1.rna.bam &&

bowtie2 -p 20 -D 20 -R 3 -N 1 -L 16 -i S,1,0.50 -x $gndx -U sample.R2.dna.bridgePE.fastq.gz | samtools view -bS -o sample.R2.dna.bam &&

bowtie2 -p 20 -D 20 -R 3 -N 1 -L 16 -i S,1,0.50 -x $gndx -U sample.R2.rna.bridgePE.fastq.gz | samtools view -bS -o sample.R2.rna.bam
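The four mapping commands differ only in the read end (R1/R2) and the molecule (dna/rna), so they can be generated from a loop. The sketch below only prints the commands as a dry run and does not call bowtie2 or samtools; print_map_cmds is a hypothetical helper name, and the options are copied unchanged from the commands above.

```shell
gndx=/media/pc/disk1/sun/refdata/gencode_GRCh37/03_bowtie2_index/GRCh37

# Hypothetical dry-run helper: print (do not execute) one mapping command
# per read end and molecule, using the same options as the commands above.
print_map_cmds() {
    for end in R1 R2; do
        for mol in dna rna; do
            echo "bowtie2 -p 20 -D 20 -R 3 -N 1 -L 16 -i S,1,0.50 -x $gndx -U sample.$end.$mol.bridgePE.fastq.gz | samtools view -bS -o sample.$end.$mol.bam"
        done
    done
}

print_map_cmds
```

Once the printed commands look right, the echo line can be swapped for the real pipeline.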

4. Filter out alignments with MAPQ < 20 -> filtered.bam

samtools view -b -q 20 sample.R1.dna.bam > sample.R1.dna.filtered.bam &&

samtools view -b -q 20 sample.R1.rna.bam > sample.R1.rna.filtered.bam &&

samtools view -b -q 20 sample.R2.dna.bam > sample.R2.dna.filtered.bam &&

samtools view -b -q 20 sample.R2.rna.bam > sample.R2.rna.filtered.bam
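As an illustration of what -q 20 keeps, the sketch below applies the same threshold to a few toy SAM records with awk. The records and the toy_sam and mapq_filter helpers are made up for this example. It works on plain SAM text only, so it shows the rule rather than replacing samtools.

```shell
# Toy SAM records (tab-separated); names, positions and MAPQ values are made up.
toy_sam() {
    printf 'r1\t0\tchr1\t100\t42\t5M\t*\t0\t0\tACGTA\tIIIII\n'
    printf 'r2\t16\tchr1\t200\t3\t5M\t*\t0\t0\tACGTA\tIIIII\n'
    printf 'r3\t0\tchr2\t300\t20\t5M\t*\t0\t0\tACGTA\tIIIII\n'
}

# Hypothetical helper: keep records whose MAPQ (column 5) is at least $1,
# i.e. drop alignments with MAPQ below the threshold.
mapq_filter() {
    awk -F'\t' -v min="$1" '$5 >= min'
}

toy_sam | mapq_filter 20 | cut -f1   # prints r1 and r3; r2 (MAPQ 3) is dropped
```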

5. Remove duplicates -> clean.bam

java -jar /home/pc/biosoft/picard.jar MarkDuplicates REMOVE_DUPLICATES=true I=sample.R1.dna.filtered.bam O=sample.R1.dna.clean.bam M=picard.sample.R1.dna.txt &&

java -jar /home/pc/biosoft/picard.jar MarkDuplicates REMOVE_DUPLICATES=true I=sample.R1.rna.filtered.bam O=sample.R1.rna.clean.bam M=picard.sample.R1.rna.txt &&

java -jar /home/pc/biosoft/picard.jar MarkDuplicates REMOVE_DUPLICATES=true I=sample.R2.dna.filtered.bam O=sample.R2.dna.clean.bam M=picard.sample.R2.dna.txt &&

java -jar /home/pc/biosoft/picard.jar MarkDuplicates REMOVE_DUPLICATES=true I=sample.R2.rna.filtered.bam O=sample.R2.rna.clean.bam M=picard.sample.R2.rna.txt

6. Extract the mapped reads as SAM -> clean.sam

samtools view -F 4 sample.R1.dna.clean.bam > sample.R1.dna.clean.sam &&

samtools view -F 4 sample.R1.rna.clean.bam > sample.R1.rna.clean.sam &&

samtools view -F 4 sample.R2.dna.clean.bam > sample.R2.dna.clean.sam &&

samtools view -F 4 sample.R2.rna.clean.bam > sample.R2.rna.clean.sam
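The -F 4 option drops every record whose FLAG has the 0x4 (unmapped) bit set. The bit test itself can be sketched with shell arithmetic; is_mapped is a hypothetical helper used only to show the rule on a few example flag values.

```shell
# Hypothetical helper: succeed when the unmapped bit (0x4) is NOT set in a SAM FLAG.
is_mapped() {
    [ $(( $1 & 4 )) -eq 0 ]
}

# 0 = mapped forward, 4 = unmapped, 16 = mapped reverse, 20 = unmapped + reverse bit
for flag in 0 4 16 20; do
    if is_mapped "$flag"; then
        echo "$flag kept"
    else
        echo "$flag dropped"
    fi
done
```

Flags 0 and 16 are kept; 4 and 20 are dropped.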

7. Convert to a simple txt -> txt

awk '{if($2 == 0) sign="+"; else if ($2 == 16) sign="-"; else sign="."; print $1,$3,$4,$5,$6,length($10),sign}' sample.R1.dna.clean.sam > sample.R1.dna.txt &&

awk '{if($2 == 0) sign="+"; else if ($2 == 16) sign="-"; else sign="."; print $1,$3,$4,$5,$6,length($10),sign}' sample.R1.rna.clean.sam > sample.R1.rna.txt &&

awk '{if($2 == 0) sign="+"; else if ($2 == 16) sign="-"; else sign="."; print $1,$3,$4,$5,$6,length($10),sign}' sample.R2.dna.clean.sam > sample.R2.dna.txt &&

awk '{if($2 == 0) sign="+"; else if ($2 == 16) sign="-"; else sign="."; print $1,$3,$4,$5,$6,length($10),sign}' sample.R2.rna.clean.sam > sample.R2.rna.txt
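To show what the resulting txt looks like, the sketch below runs the same awk program on two toy SAM records. The records are made up, and sam_to_txt is a hypothetical wrapper around the awk command above. The output columns are read name, reference name, position, MAPQ, CIGAR, read length and strand, separated by spaces.

```shell
# Hypothetical wrapper around the awk program used in this step.
# FLAG 0 -> "+", FLAG 16 -> "-", any other FLAG value -> "."
sam_to_txt() {
    awk '{if($2 == 0) sign="+"; else if ($2 == 16) sign="-"; else sign="."; print $1,$3,$4,$5,$6,length($10),sign}'
}

# Toy records (tab-separated); names and coordinates are made up.
printf 'r1\t0\tchr1\t100\t42\t5M\t*\t0\t0\tACGTA\tIIIII\n' | sam_to_txt
# prints: r1 chr1 100 42 5M 5 +
printf 'r2\t16\tchr1\t200\t30\t5M\t*\t0\t0\tACGTA\tIIIII\n' | sam_to_txt
# prints: r2 chr1 200 30 5M 5 -
```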

transcript:

1. Map the RNA fq -> bam

2. Filter out alignments with MAPQ < 20 -> filtered.bam

3. Remove duplicates -> clean.bam

4. Extract the mapped reads as SAM -> sam

5. Convert to a simple txt -> txt
