利用核酸序列估算分歧時(shí)間

需要的文件

1.帶有話是標(biāo)定點(diǎn)的物種樹(shù)
2.比對(duì)好的phylip格式序列文件

化石標(biāo)定點(diǎn)物種樹(shù)(刪除不需要的枝長(zhǎng)等信息)

 86 1

(((((((((((((((((Distichlis_spicata,Distichlis_littoralis),Distichlis_bajaensis),((Bouteloua_dactyloides,Bouteloua_curtipendula),Bouteloua_gracilis)),(Hilaria_cenchroides,Hilaria_rigida)),(Muhlenbergia_huegelii,Muhlenbergia_japonica)),Tragus_berteronianus),Tridens_brasiliensis),((((((Oropetium_thomaeum,Oropetium_aristatum),Tripogon_chinensis),Tripogonella_loliiformis),Melanocenchris_abyssinica),Desmostachya_bipinnata),Halopyrum_mucronatum)),(((((Perotis_indica,Perotis_rara),Perotis_hildebrandtii),(Trichoneura_grandiglumis,Trichoneura_ciliata)),Vaseyochloa_multinervosa),((Dactyloctenium_radulans,Dactyloctenium_aegyptium),Odyssea_paucinervis))),((Orinus_thoroldii,Triodia_rigidissima),Cleistogenes_squarrosa)),(((((((((((Cynodon_radiatus,Cynodon_dactylon),Eustachys_glauca),(Microchloa_indica,Oxychloris_scariosa)),Lepturus_repens),((((Chloris_virgata,Enteropogon_dolichostachyus),Chloris_truncata),Chloris_barbata),Enteropogon_ramosus)),Astrebla_pectinata),(Eleusine_coracana,Eleusine_indica)),((Dinebra_retroflexa,Dinebra_panicea),Dinebra_chinensis)),Acrachne_racemosa),Diplachne_fusca),((Aeluropus_lagopoides,Aeluropus_littoralis),Aeluropus_sinensis))),((((((((Sporobolus_alterniflorus,Sporobolus_maritimus),Sporobolus_michauxianus),Sporobolus_heterolepis),Sporobolus_maximus),((Sporobolus_virginicus,Sporobolus_helvolus),Sporobolus_aculeatus)),(Sporobolus_fertilis,Sporobolus_diandrus)),Urochondra_setulosa),(((Zoysia_matrella,Zoysia_pacifica),(Zoysia_japonica,Zoysia_sinica)),(Zoysia_macrostachya,Zoysia_macrantha)))‘>29.49<32.28’),((((((Eragrostis_cilianensis,Eragrostis_pilosa),Eragrostis_ferruginea),(Eragrostis_minor,Eragrostis_autumnalis)),(Eragrostis_atrovirens,Harpachne_harpachnoides)),(Tetrachne_dregei,Uniola_paniculata)),(Enneapogon_desvauxii,Schmidtia_pappophoroides))'>32.76<35.29'),(Triraphis_mollis,Neyraudia_reynaudiana)),Centropodia_glauca)'>42.87<43',Coelachyrum_piercei),Cortaderia_selloana)‘>54.44<63.35’;

準(zhǔn)備phylip格式序列文件

例如:Zoysia_sinica.fasta序列內(nèi)部名字為
>gi|1642520764|ref|NC_042187.1| Zoysia sinica chloroplast, complete genome
GAAATACCCAATATCCTGTTGGAACAAGATATTGGGTATTTCTGGCTTTCCTTCCTTTAAAAATTCCTAT
ATTTTAGGAGAAAAACCTTATCCATTAAGAGATGGAACTTCAAGAGCAGCTAAGTCTAGAGGGAAGTTGT
GAGCATTACGTTCGTGCATTACTTCCATACCAAGATTAGCACGGTTGATGATATCAGCCCAAGTATTAAT

修改fa文件內(nèi)部序列名字和外部名字不統(tǒng)一

cat *.fasta| sed 's/.fasta//g' >species.list
for species in $(cat species.list); do cat ./$species.fasta | seqkit seq -n | awk '{print $1}' | sed "s/gi.*/$species/g" > t1; cat ./$species.fasta | seqkit seq -s -w 0 > t2; paste t1 t2 | seqkit tab2fx | seqkit seq -w 0 > $species.fas; rm t1 t2; done

結(jié)果

less Zoysia_sinica.fas
>Zoysia_sinica
GAAATACCCAATATCCTGTTGGAACAAGATATTGGGTATTTCTGGCTTTCCTTCCTTTAAAAATTCCTATATTTTAGGAGAAAAACCTTATCCATTAAGAGATGGAACTTCAAGAGCAGCTAAGTCTAGAGGGAAGTTGTGAGCATTACGTTCGTGCATTACTTCCATACCAAGATTAGCACGGTTGATGATATCAGCCCAAGTATTAATAACGCGACCTTGGCTATCAACTACAGATTGGTTGAAATTGAAACCATTTAGGTTGAATGCCATAGTACTAATACCTAAAGCAGTGAACCAGATCCCTACTACAGGCCAAGCAGCCAAGAAGAAGTGTAAAGAACGAGAGTTGTTGAAACTAGCATATTGGAAGATTAATCGACCAAAATAACCGTGAGCAGCCACAATGTTATAAGTCTCTTCCTCTTGACCAAATTTGTAACCCTCATTAGCAGATTCATTTTCAGTGGTTTCCCTGATCAAACTAGAGGTTACCAAGGAACCATGCATAGCACTGAATAGGGAACCGCCGAATACACCAGCTACACCTAACATGTGAAATGGATGCATAAGGATGTTGTGCTCTGCCTGGAATACAATCATAAAGTTGAAAGTACCAGAGATTCCTAAAGGCATACCATCAGAGAAACTTCCTTGACCAATAGGGTAAATCAAGAAAACAGCAGTAGCAGCTGCAACAGGAGCTGAATATGCAACAGCAATCCAAGGACGCATACCCAGACGGAAACTAAGTTCCCACTCACGACCCATATAACAAGCTACACCAAGTAAGAAGTGTAGAACAATTAGCTCATAAGGACCACCAT

比對(duì)

/home/lx_sky6/yt/soft/miniconda3/bin/mafft --thread 30 86.fas > 86.mafft.fas

裁剪保存為phylip_paml格式

trimal -in 86.mafft.fas -out 86.trimal.fas -automated1 -phylip_paml

運(yùn)行mcmctree(一共運(yùn)行3次,第一次輸出out.BV文件)

mcmctree mcmctree.ctl

          seed = -1
       seqfile = /home/lx_sky6/yt/0105_xianyu/0105_mcmc/86.trimal.phy
      treefile = /home/lx_sky6/yt/0105_xianyu/0105_mcmc/input.tree
       outfile = out.txt

         ndata = 1
       seqtype = 0  * 0: nucleotides; 1:codons; 2:AAs
       usedata = 3    * 0: no data; 1:seq like; 2:use in.BV; 3: out.BV
         clock = 2    * 1: global clock; 2: independent rates; 3: correlated rates
       RootAge =   * safe constraint on root age, used if no fossil for root.

         model = 7    * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85
         alpha = 0.5    * alpha for gamma rates at sites
         ncatG = 5    * No. categories in discrete gamma

     cleandata = 0    * remove sites with ambiguity data (1:yes, 0:no)?

       BDparas = 1 1 0    * birth, death, sampling
   kappa_gamma = 6 2      * gamma prior for kappa
   alpha_gamma = 1 1      * gamma prior for alpha

   rgene_gamma = 2 2   * gamma prior for overall rates for genes
  sigma2_gamma = 1 10   * gamma prior for sigma^2     (for clock=2 or 3)

      finetune = 1: .1  .1  .1  .1 .01 .5  * auto (0 or 1) : times, musigma2, rates, mixing, paras, FossilErr

         print = 1
        burnin = 10000
      sampfreq = 5
       nsample = 30000

*** Note: Make your window wider (100 columns) before running the program.

再次運(yùn)行mcmctree(第二次修改out.BV為in.BV作為輸入,即修改mcmctree.ctl文件中usedata = 2為usedata = 3)

mv out.BV in.BV
mcmctree mcmctree.ctl
          seed = -1
       seqfile = /home/lx_sky6/yt/0105_xianyu/0105_mcmc/86.trimal.phy
      treefile = /home/lx_sky6/yt/0105_xianyu/0105_mcmc/input.tree
       outfile = out.txt

         ndata = 1
       seqtype = 0  * 0: nucleotides; 1:codons; 2:AAs
       usedata = 2    * 0: no data; 1:seq like; 2:use in.BV; 3: out.BV
         clock = 2    * 1: global clock; 2: independent rates; 3: correlated rates
       RootAge =   * safe constraint on root age, used if no fossil for root.

         model = 7    * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85
         alpha = 0.5    * alpha for gamma rates at sites
         ncatG = 5    * No. categories in discrete gamma

     cleandata = 0    * remove sites with ambiguity data (1:yes, 0:no)?

       BDparas = 1 1 0    * birth, death, sampling
   kappa_gamma = 6 2      * gamma prior for kappa
   alpha_gamma = 1 1      * gamma prior for alpha

   rgene_gamma = 2 2   * gamma prior for overall rates for genes
  sigma2_gamma = 1 10   * gamma prior for sigma^2     (for clock=2 or 3)

      finetune = 1: .1  .1  .1  .1 .01 .5  * auto (0 or 1) : times, musigma2, rates, mixing, paras, FossilErr

         print = 1
        burnin = 10000
      sampfreq = 5
       nsample = 30000

*** Note: Make your window wider (100 columns) before running the program.

再次運(yùn)行mcmctree(第3次相對(duì)于第二次不做修改)

mcmctree mcmctree.ctl
          seed = -1
       seqfile = /home/lx_sky6/yt/0105_xianyu/0105_mcmc/86.trimal.phy
      treefile = /home/lx_sky6/yt/0105_xianyu/0105_mcmc/input.tree
       outfile = out.txt

         ndata = 1
       seqtype = 0  * 0: nucleotides; 1:codons; 2:AAs
       usedata = 2    * 0: no data; 1:seq like; 2:use in.BV; 3: out.BV
         clock = 2    * 1: global clock; 2: independent rates; 3: correlated rates
       RootAge =   * safe constraint on root age, used if no fossil for root.

         model = 7    * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85
         alpha = 0.5    * alpha for gamma rates at sites
         ncatG = 5    * No. categories in discrete gamma

     cleandata = 0    * remove sites with ambiguity data (1:yes, 0:no)?

       BDparas = 1 1 0    * birth, death, sampling
   kappa_gamma = 6 2      * gamma prior for kappa
   alpha_gamma = 1 1      * gamma prior for alpha

   rgene_gamma = 2 2   * gamma prior for overall rates for genes
  sigma2_gamma = 1 10   * gamma prior for sigma^2     (for clock=2 or 3)

      finetune = 1: .1  .1  .1  .1 .01 .5  * auto (0 or 1) : times, musigma2, rates, mixing, paras, FossilErr

         print = 1
        burnin = 10000
      sampfreq = 5
       nsample = 30000

*** Note: Make your window wider (100 columns) before running the program.

最后將第二次和第三次運(yùn)行結(jié)果的mcmc.txt文件導(dǎo)入tracer軟件,如果ess值均大于200,且兩次結(jié)果差異不大,則認(rèn)為樹(shù)可信。

image.png

如果小于200,則需要增加代數(shù)從新運(yùn)行第二次和第三次,直到ESS>200。


image.png
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡(jiǎn)書(shū)系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容