Bioinformatics Steps | Batch Prediction of Pathogen Effectors with EffectorP

EffectorP applies machine learning: it is trained on a set of previously collected, known effectors and then predicts effectors in pathogenic fungi and oomycetes^[1].

Figure: Schematic of what EffectorP does

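History of EffectorP^[2]:
Version 1.0 was published in New Phytologist in 2016 and provided the first machine-learning-based prediction of effectors.
Version 2.0 was published in Molecular Plant Pathology (MPP) in 2018; it used a larger training set and an ensemble of models, which improved accuracy.
Version 3.0 was published in MPMI in 2022 and is currently the latest version. It predicts where an effector is located (apoplastic or cytoplasmic) and adds prediction of oomycete effectors.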

Figure: How EffectorP predicts from its positive and negative training sets

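The positive training set of EffectorP 3.0 consists of 64 apoplastic effectors (50 fungal, 14 oomycete) and 112 cytoplasmic effectors (77 fungal, 35 oomycete). The negative training set consists of five classes of proteins that are unlikely to be effectors. Homologous, redundant proteins were removed from both sets. In short, EffectorP 3.0 builds a database of known effectors, trains machine-learning models on it, and uses them to predict effectors and infer their localization.

EffectorP also has a web server at https://effectorp.csiro.au/, which is a convenient alternative for readers who prefer not to work with code.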

0. Download and install EffectorP

First, download the latest version of the software from the EffectorP GitHub repository:

$ git clone https://github.com/JanaSperschneider/EffectorP-3.0.git

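EffectorP requires a Python 3 environment and WEKA 3.8.4, both of which need to be set up on the server beforehand. A new Python 3 environment can be created with conda.

# Create a new Python 3 environment
$ conda create -y -n effector python=3
$ conda activate effector

# Set up WEKA 3.8.4; unzipping the bundled archive is enough.
# git clone creates the directory EffectorP-3.0 (a GitHub zip download unpacks to EffectorP-3.0-main instead).
$ cd EffectorP-3.0 && unzip weka-3-8-4.zip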

Check that everything runs by using the test protein sequences bundled with the software, Effectors.fasta:

$ python EffectorP.py -i Effectors.fasta
# On success, the prediction results are printed to the screen

For comparison, here is the example output given on the official site:

python EffectorP.py -i Effectors.fasta
-----------------

EffectorP 3.0 is running for 9 proteins given in FASTA file Effectors.fasta

Ensemble classification
25 percent done...
50 percent done...
75 percent done...
All done.

# Identifier                                    Cytoplasmic effector    Apoplastic effector     Non-effector            Prediction
AvrM Melampsora lini                            Y (1.0)                 -                       -                       Cytoplasmic effector
Avr1-CO39 Magnaporthe oryzae                    Y (0.945)               Y (0.667)               -                       Cytoplasmic/apoplastic effector
ToxA Parastagonospora nodorum                   Y (0.551)               Y (0.767)               -                       Apoplastic/cytoplasmic effector
AVR3a Phytophthora infestans                    Y (0.985)               -                       -                       Cytoplasmic effector
Pit2 Ustilago maydis                            Y (0.779)               -                       -                       Cytoplasmic effector
Zt6 Zymoseptoria tritici                        -                       Y (0.944)               -                       Apoplastic effector
INF1 Phytophthora infestans                     -                       Y (0.837)               -                       Apoplastic effector
Zinc transporter 3 Arabidopsis thaliana         -                       -                       Y (0.737)               Non-effector
GPI-anchored protein 13 Candida albicans        -                       -                       Y (0.708)               Non-effector

-----------------
9 proteins were provided as input in the following file: Effectors.fasta
-----------------
Number of predicted effectors: 7
Number of predicted cytoplasmic effectors: 4
Number of predicted apoplastic effectors: 3
-----------------
77.8 percent are predicted effectors.
44.4 percent are predicted cytoplasmic effectors.
33.3 percent are predicted apoplastic effectors.
-----------------

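As shown, EffectorP reports clearly both the predicted class of each protein (effector or non-effector) and the likely location of each effector (cytoplasmic or apoplastic).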
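To work with these predictions programmatically, the result table can be parsed into records. The following is a minimal sketch, not part of EffectorP itself: the function names `parse_effectorp_table` and `is_effector` are hypothetical, and it assumes that columns are separated by a tab or by two or more spaces, as in the sample output above.

```python
import re

# Assumption: columns are separated by a tab or by runs of two or more
# spaces, as in the sample output; identifiers may contain single spaces.
_COLUMN_SPLIT = re.compile(r"\s{2,}|\t")


def parse_effectorp_table(text):
    """Return one dict per protein row of an EffectorP 3.0 result table."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header row
        fields = _COLUMN_SPLIT.split(line)
        # Protein rows have five columns and end with a prediction label;
        # progress messages and summary lines do not.
        if len(fields) != 5 or "effector" not in fields[4].lower():
            continue
        rows.append({
            "identifier": fields[0],
            "cytoplasmic": fields[1],
            "apoplastic": fields[2],
            "non_effector": fields[3],
            "prediction": fields[4],
        })
    return rows


def is_effector(row):
    """True unless the final prediction is 'Non-effector'."""
    return row["prediction"] != "Non-effector"


sample = """\
# Identifier    Cytoplasmic effector    Apoplastic effector    Non-effector    Prediction
AvrM Melampsora lini                            Y (1.0)                 -                       -                       Cytoplasmic effector
Zt6 Zymoseptoria tritici                        -                       Y (0.944)               -                       Apoplastic effector
Zinc transporter 3 Arabidopsis thaliana         -                       -                       Y (0.737)               Non-effector
"""

rows = parse_effectorp_table(sample)
print([row["identifier"] for row in rows if is_effector(row)])
```

If the real output uses a different column separator, only `_COLUMN_SPLIT` needs to change.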

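Next, we use rice blast fungus (Magnaporthe oryzae) RNA-seq data deposited at NCBI as an example and predict effectors in batch. The SRA accessions are SRR081552, SRR081553, SRR081554, SRR081555 and SRR081556.

1. Batch download and extraction of the example SRA data

First, save the SRA accessions of the example data in a new text file, sra.txt, then download and extract them in batch.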

$ vim sra.txt
-----
SRR081552
SRR081553
SRR081554
SRR081555
SRR081556
-----
# Download every run listed in sra.txt
$ prefetch --option-file sra.txt

The downloaded SRA files must be converted further to obtain FASTQ files:

# Create the batch extraction script
$ vim step1_fastdump.sh
----------
#!/bin/sh
for i in `tail -n+1 sra.txt|cut -f1`;do
fastq-dump ${i} --split-3 --gzip -O ./
done
---------

# Run the batch extraction
$ sh step1_fastdump.sh
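Before moving on, it can help to confirm that every accession produced its FASTQ files. The sketch below is an illustration only; the helper names are hypothetical. It assumes the naming used by `fastq-dump --split-3 --gzip`, which writes `<accession>_1.fastq.gz` and `<accession>_2.fastq.gz` for paired-end runs and `<accession>.fastq.gz` for single-end runs.

```python
from pathlib import Path


def read_accessions(path):
    """Read one accession per line (first tab-separated field), skipping blanks."""
    lines = Path(path).read_text().splitlines()
    return [line.split("\t")[0].strip() for line in lines if line.strip()]


def expected_fastq_files(accession, paired=True):
    """File names expected from `fastq-dump --split-3 --gzip` for one run."""
    if paired:
        return [f"{accession}_1.fastq.gz", f"{accession}_2.fastq.gz"]
    return [f"{accession}.fastq.gz"]


def missing_fastq_files(accessions, directory=".", paired=True):
    """Return the expected FASTQ files that are absent from `directory`."""
    base = Path(directory)
    return [
        name
        for accession in accessions
        for name in expected_fastq_files(accession, paired)
        if not (base / name).exists()
    ]


print(expected_fastq_files("SRR081552"))
```

A typical call would be `missing_fastq_files(read_accessions("sra.txt"))`; an empty list means all expected files are present.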

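2. Batch alignment of the RNA-seq data

Align the paired-end or single-end reads to the reference genome, assemble transcripts with stringtie, and extract the transcript sequences with gffread. The example data are paired-end, so the paired-end workflow applies; the individual steps are chained into a batch script here.
Batch script for paired-end data.

# Create the batch script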
$ vim step2_hisat2_pair.sh
----------
#!/bin/bash
for i in `tail -n+1 sra.txt|cut -f1`;do
  {
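        # -x takes the basename of an index built with hisat2-build.
        # The *.fastq.cleandata.gz inputs are presumably cleaned reads from a QC step not shown here; adjust the names to your files.
        hisat2 -p 8 --dta --no-mixed --no-discordant  -x 70-15.BAC.fa -1 ${i}_1.fastq.cleandata.gz -2 ${i}_2.fastq.cleandata.gz --no-unal -S ${i}.sam 2>${i}.summary.txt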
        samtools view -bS ${i}.sam -o ${i}.bam
        # Current samtools syntax; the legacy form "samtools sort in.bam prefix" wrote prefix.bam, i.e. ${i}.sorted.bam.bam
        samtools sort -o ${i}.sorted.bam ${i}.bam
        stringtie ${i}.sorted.bam -p 20 -o ${i}.gtf
        gffread -w ${i}.fa -g 70-15.BAC.fa ${i}.gtf
  }
done
---------

# Run the batch script
$ sh step2_hisat2_pair.sh

Batch script for single-end data.

# Create the batch script
$ vim step2_hisat2_single.sh
----------
#!/bin/bash
for i in `tail -n+1 sra.txt|cut -f1`;do
  {
        hisat2 -p 8 --dta --no-mixed --no-discordant  -x 70-15.BAC.fa -U ${i}.fastq.cleandata.gz --no-unal -S ${i}.sam 2>${i}.summary.txt
        samtools view -bS ${i}.sam -o ${i}.bam
        samtools sort ${i}.bam -o ${i}.sorted.bam
        stringtie ${i}.sorted.bam -p 20 -o ${i}.gtf
        gffread -w ${i}.fa -g 70-15.BAC.fa ${i}.gtf
  }
done
----------

# Run the batch script
$ sh step2_hisat2_single.sh
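The two scripts differ only in how the reads are passed to hisat2. One way to see this is to build the command lines for a single sample and print them as a dry run. This is a sketch under assumptions: the function `build_alignment_commands` is hypothetical, it uses only the tools and flags that appear in the scripts above, it uses the current `samtools sort -o` form, and it leaves out the `2>` redirect that stores the hisat2 summary.

```python
import shlex


def build_alignment_commands(accession, index="70-15.BAC.fa",
                             genome="70-15.BAC.fa", paired=True, threads=8):
    """Return the per-sample commands as argument lists (nothing is executed)."""
    sam = f"{accession}.sam"
    bam = f"{accession}.bam"
    sorted_bam = f"{accession}.sorted.bam"
    gtf = f"{accession}.gtf"
    fasta = f"{accession}.fa"

    hisat2 = ["hisat2", "-p", str(threads), "--dta", "--no-unal", "-x", index]
    if paired:
        # --no-mixed and --no-discordant only affect paired-end alignment.
        hisat2 += ["--no-mixed", "--no-discordant",
                   "-1", f"{accession}_1.fastq.cleandata.gz",
                   "-2", f"{accession}_2.fastq.cleandata.gz"]
    else:
        hisat2 += ["-U", f"{accession}.fastq.cleandata.gz"]
    hisat2 += ["-S", sam]

    return [
        hisat2,
        ["samtools", "view", "-bS", sam, "-o", bam],
        ["samtools", "sort", "-o", sorted_bam, bam],
        ["stringtie", sorted_bam, "-p", "20", "-o", gtf],
        ["gffread", "-w", fasta, "-g", genome, gtf],
    ]


# Dry run: print the commands for one paired-end sample.
for command in build_alignment_commands("SRR081552"):
    print(shlex.join(command))
```

Each argument list could be passed to `subprocess.run(command, check=True)` to execute the step and stop on the first failure.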

3. Protein translation and filtering

Translate the extracted transcript sequences and keep proteins of at least 100 amino acids as candidates.

# Create the batch script
$ vim step3_translate.sh
----------
#!/bin/bash
for i in `tail -n+1 sra.txt|cut -f1`;do
  {
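        # Translate frame 1 (the seqkit default); --trim strips trailing '*' and 'X'
        seqkit translate ${i}.fa --trim > ${i}.pro.fa
        # Keep sequences of at least 100 residues; -g removes gap characters
        seqkit seq -m 100 -g ${i}.pro.fa > ${i}.pro.filter.fa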
  }
done
----------

# Run the batch script
$ sh step3_translate.sh
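To make the translate-then-filter logic concrete, here is a small pure-Python sketch of the same idea. It is an approximation for illustration, not a reimplementation of seqkit: the function names are hypothetical, and it assumes the standard genetic code, frame 1 only, unknown codons written as `X`, and trailing `*` and `X` trimmed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate_frame1(nucleotides, trim=True):
    """Translate frame 1; unknown codons become 'X', stops become '*'."""
    seq = nucleotides.upper().replace("U", "T")
    protein = "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)  # ignore an incomplete last codon
    )
    return protein.rstrip("*X") if trim else protein


def filter_min_length(proteins, min_len=100):
    """Keep entries of a {name: protein} dict with at least `min_len` residues."""
    return {name: p for name, p in proteins.items() if len(p) >= min_len}


print(translate_frame1("ATGGCCTAA"))
```

Because only frame 1 is translated, a transcript whose coding region starts in another frame or after a 5' UTR will not yield its real protein; this limitation applies to the sketch and is worth keeping in mind when interpreting the candidates.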

4. Effector prediction

Finally, run effector prediction on all candidate proteins.

# Create the batch script
$ vim step4_effectorP.sh
----------
#!/bin/bash
for i in `tail -n+1 sra.txt|cut -f1`;do
  {
        # Adjust the path to EffectorP.py to match your own installation
        python /mnt/zhou/hangyuan/biosoft/EffectorP-3.0-main/EffectorP.py -i ${i}.pro.filter.fa > ${i}.predict_effector.txt
  }
done
----------

# Run the batch script
$ sh step4_effectorP.sh

Once all of the code above has run, the predictions are saved in text files ending in .predict_effector.txt, one per RNA-seq run. Open a file to view the predictions for that run.
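With several result files, a quick overview can be assembled from the summary lines at the end of each one. The sketch below is an illustration with hypothetical function names. It assumes that each file contains lines of the form `Number of predicted effectors: 7`, as in the example output from the official site shown earlier.

```python
import re
from pathlib import Path

# Matches summary lines such as "Number of predicted apoplastic effectors: 3".
_COUNT_LINE = re.compile(r"^Number of predicted (.+?): (\d+)$")


def summary_counts(text):
    """Extract the 'Number of predicted ...' counts from one result file."""
    counts = {}
    for line in text.splitlines():
        match = _COUNT_LINE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def summarize_directory(directory="."):
    """Map each *.predict_effector.txt file name to its summary counts."""
    return {
        path.name: summary_counts(path.read_text())
        for path in sorted(Path(directory).glob("*.predict_effector.txt"))
    }


example = """\
Number of predicted effectors: 7
Number of predicted cytoplasmic effectors: 4
Number of predicted apoplastic effectors: 3
"""
print(summary_counts(example))
```

Calling `summarize_directory()` in the working directory gives one entry per RNA-seq run, which makes it easy to compare the number of predicted effectors across samples.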

References:

EffectorP GitHub: https://github.com/JanaSperschneider/EffectorP-3.0
Sperschneider J, Dodds P. EffectorP 3.0: prediction of apoplastic and cytoplasmic effectors in fungi and oomycetes. Mol Plant Microbe Interact. 2021. doi: 10.1094/MPMI-08-21-0201-R
