1 What is motif analysis

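In homologous DNA or protein sequences, different positions are conserved to different degrees. In general, positions that strongly affect the function and structure of the DNA or protein tend to be well conserved, while other positions are less so. These conserved positions are called "motifs". Motifs were first discovered experimentally. The word motif describes a recurring pattern, and a sequence motif is usually a recurring pattern in DNA that is presumed to have a biological function. Such motifs are often the binding sites of sequence-specific proteins (for example, transcription factors) or are involved in important biological processes (for example, RNA initiation, RNA termination, and RNA splicing). The number of identified motifs keeps growing; databases such as TRANSFAC and JASPAR hold large collections of transcription factor motifs.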
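To make "conserved position" concrete, here is a minimal Python sketch that counts letters column by column in a handful of aligned sites and reports the most frequent letter at each position. The sequences are invented for illustration and the function names `column_frequencies` and `consensus` are hypothetical. It only measures conservation in sites that are already aligned; it is not how MEME discovers motifs.

```python
from collections import Counter

def column_frequencies(sites):
    """Return one {letter: frequency} dict per column of the aligned sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    freqs = []
    for i in range(width):
        counts = Counter(s[i] for s in sites)
        freqs.append({letter: n / len(sites) for letter, n in counts.items()})
    return freqs

def consensus(freqs):
    """Pick the most frequent letter in every column."""
    return "".join(max(col, key=col.get) for col in freqs)

# Toy, made-up aligned DNA sites (not real data).
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
freqs = column_frequencies(sites)

print("consensus:", consensus(freqs))
for position, col in enumerate(freqs, start=1):
    # A top frequency of 1.0 means the column is fully conserved.
    print(position, max(col.values()))
```

Columns whose top frequency is close to 1.0 are the strongly conserved positions that the paragraph above describes.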

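2 Software for motif analysis

Many tools exist for motif analysis; common ones include motif-x, MochiView, and CisGenome. Most of them are web-based, however, so they cannot process data in batches and are hard to automate. MEME is a classic motif analysis tool. Besides the online version, MEME also has a Linux version, and it works on DNA, RNA, and protein sequences. The suite covers several functions, including motif discovery, motif enrichment analysis, and motif comparison.
MEME website: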

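2.1 How MEME works

MEME is a toolkit made up of several programs. MEME itself performs motif discovery and does not allow gaps within a motif. MAST takes a motif found by MEME and searches for it in other sequences; it is a follow-up analysis to MEME and can be launched from the hyperlink shown when a MEME run finishes, or from a saved MEME text-format output file. GLAM2 is similar to MEME but allows gaps in the motif. GLAM2SCAN is similar to MAST: MAST does not allow gaps in the motif, whereas GLAM2SCAN does. MEME is available in a web version and a Linux version. The overall design of the toolkit is as follows: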

image.png
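The idea of "searching other sequences for a known motif" can be illustrated with a simple log-odds scan. The sketch below is a deliberately simplified stand-in, not MAST's actual scoring or statistics: it assumes a uniform background, adds a pseudocount of 1, and uses made-up sites and a made-up target sequence. The names `log_odds_matrix` and `best_hit` are hypothetical.

```python
import math

def log_odds_matrix(sites, alphabet="ACGT", pseudocount=1.0):
    """Build a per-column log2-odds table from aligned, ungapped sites."""
    width = len(sites[0])
    background = 1.0 / len(alphabet)  # assumption: uniform background
    total = len(sites) + pseudocount * len(alphabet)
    matrix = []
    for i in range(width):
        column = [s[i] for s in sites]
        matrix.append({
            a: math.log2(((column.count(a) + pseudocount) / total) / background)
            for a in alphabet
        })
    return matrix

def best_hit(matrix, sequence):
    """Slide the matrix along the sequence; return (start, score, window)."""
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][ch] for i, ch in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score, window)
    return best

# Toy motif sites and a toy target sequence (both invented).
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
pwm = log_odds_matrix(sites)
start, score, window = best_hit(pwm, "GGCGTATAATCCGG")
print(start, window, round(score, 2))
```

Because every window has the same fixed width, a scan like this cannot place gaps inside the motif, which is the restriction the text attributes to MEME and MAST.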

2.2 Running MEME

2.2.1 Usage example

meme test.fa -protein -oc result -nostatus -time 1800000 -mod zoops -nmotifs 3 -minw 6 -maxw 13 -objfun classic -markov_order 0
(These are the same parameters as the web version uses.)

2.2.2 Option descriptions

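-protein the input sequences are protein sequences
-oc result output directory
-nostatus do not print progress reports to the screen
-time 1800000 stop once the CPU time consumed reaches <time> seconds
-mod zoops how motif sites are distributed across the sequences
· oops each motif occurs exactly once in every sequence. This mode is the fastest and the most sensitive, but it can give incorrect results if some sequences do not contain the motif.
· zoops each motif occurs at most once in each sequence and may be absent. This mode is fairly fast and slightly less sensitive.
· anr each motif may occur any number of times in each sequence. This mode is the slowest and may take ten times longer or more, but it can help when nothing is known about how the motif is distributed.
-nmotifs 3 maximum number of motifs to find
-minw 6 minimum motif width
-maxw 13 maximum motif width
-objfun classic objective function used for motif discovery
-markov_order 0 order of the background Markov model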
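Since the main advantage of the command-line version is automation, the options above can be assembled programmatically. The following Python sketch builds the argument list for a `meme` run and checks a few obvious constraints first. The helper name `build_meme_command` and its defaults are assumptions made for illustration; only flags that appear in the example above are used, and the command is printed rather than executed.

```python
import shlex

def build_meme_command(fasta, outdir, alphabet="protein", mod="zoops",
                       nmotifs=3, minw=6, maxw=13):
    """Return the meme command as a list of arguments (hypothetical helper)."""
    if alphabet not in ("dna", "rna", "protein"):
        raise ValueError("alphabet must be dna, rna or protein")
    if mod not in ("oops", "zoops", "anr"):
        raise ValueError("mod must be oops, zoops or anr")
    if minw > maxw:
        raise ValueError("minw must not exceed maxw")
    return [
        "meme", fasta, f"-{alphabet}",
        "-oc", outdir,            # -oc replaces an existing output directory
        "-nostatus",
        "-mod", mod,
        "-nmotifs", str(nmotifs),
        "-minw", str(minw),
        "-maxw", str(maxw),
    ]

cmd = build_meme_command("test.fa", "result")
print(shlex.join(cmd))

# To actually run it (requires meme on PATH), one could use:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Calling the helper in a loop over several FASTA files, each with its own output directory, is one way to get the batch processing that web-only tools do not offer.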

2.2.3 Full option reference

Usage: meme <dataset> [optional arguments]
<dataset> file containing sequences in FASTA format
[-h] print this message
[-o <output dir>] name of directory for output files; will not replace existing directory
[-oc <output dir>] name of directory for output files; will replace existing directory
[-text] output in text format (default is HTML)
[-objfun classic|de|se|cd|ce] objective function (default: classic)
[-test mhg|mbn|mrs] statistical test type (default: mhg)
[-use_llr] use LLR in search for starts in Classic mode
[-neg <negdataset>] file containing control sequences
[-shuf <kmer>] preserve frequencies of k-mers of size <kmer> when shuffling (default: 2)
[-hsfrac <hsfrac>] fraction of primary sequences in holdout set (default: 0.5)
[-cefrac <cefrac>] fraction sequence length for CE region (default: 0.25)
[-searchsize <ssize>] maximum portion of primary dataset to use for motif search (in characters)
[-maxsize <maxsize>] maximum dataset size in characters
[-norand] do not randomize the order of the input sequences with -searchsize
[-csites <csites>] maximum number of sites for EM in Classic mode
[-seed <seed>] random seed for shuffling and sampling
[-dna] sequences use DNA alphabet
[-rna] sequences use RNA alphabet
[-protein] sequences use protein alphabet
[-alph <alph file>] sequences use custom alphabet
[-revcomp] allow sites on + or - DNA strands
[-pal] force palindromes (requires -dna)
[-mod oops|zoops|anr] distribution of motifs
[-nmotifs <nmotifs>] maximum number of motifs to find
[-evt <ev>] stop if motif E-value greater than <evt>
[-time <t>] quit before <t> CPU seconds consumed
[-nsites <sites>] number of sites for each motif
[-minsites <minsites>] minimum number of sites for each motif
[-maxsites <maxsites>] maximum number of sites for each motif
[-wnsites <wnsites>] weight on expected number of sites
[-w <w>] motif width
[-minw <minw>] minimum motif width
[-maxw <maxw>] maximum motif width
[-allw] test starts of all widths from minw to maxw
[-nomatrim] do not adjust motif width using multiple alignment
[-wg <wg>] gap opening cost for multiple alignments
[-ws <ws>] gap extension cost for multiple alignments
[-noendgaps] do not count end gaps in multiple alignments
[-bfile <bfile>] name of background Markov model file
[-markov_order <order>] (maximum) order of Markov model to use or create
[-psp <pspfile>] name of positional priors file
[-maxiter <maxiter>] maximum EM iterations to run
[-distance <distance>] EM convergence criterion
[-prior dirichlet|dmix|mega|megap|addone] type of prior to use
[-b <b>] strength of the prior
[-plib <plib>] name of Dirichlet prior file
[-spfuzz <spfuzz>] fuzziness of sequence to theta mapping
[-spmap uni|pam] starting point seq to theta mapping type
[-cons <cons>] consensus sequence to start EM from
[-brief <n>] omit sites and sequence tables in output if more than <n> primary sequences
[-nostatus] do not print progress reports to terminal
[-p <np>] use parallel version with <np> processors
[-sf <sf>] print <sf> as name of sequence file
[-V] verbose mode
[-version] display the version number and exit

2.2.4 Output files and how to read them

meme.html - interactive, human-readable results in HTML format
meme.txt - plain-text results, compatible with earlier MEME versions
meme.xml - results in XML format, designed for machine processing
logoN.png, logoN.eps - motif logo files in PNG and EPS format

image.png

Note: the size of each amino acid letter reflects how frequently that amino acid occurs at that position.
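In a standard sequence logo, the height of a letter is its frequency multiplied by the information content of the column, measured in bits. The sketch below computes these heights for a single made-up protein column. It omits the small-sample correction that logo programs often apply, so the numbers will not necessarily match a MEME logo exactly; the function name `letter_heights` is hypothetical.

```python
import math

def letter_heights(column_freqs, alphabet_size=20):
    """Letter heights in bits for one logo column (no small-sample correction)."""
    entropy = -sum(p * math.log2(p) for p in column_freqs.values() if p > 0)
    information = math.log2(alphabet_size) - entropy  # column height in bits
    return {letter: p * information for letter, p in column_freqs.items()}

# Made-up frequencies for one position of a protein motif.
heights = letter_heights({"L": 0.5, "I": 0.25, "V": 0.25})
for letter, h in sorted(heights.items(), key=lambda kv: -kv[1]):
    print(letter, round(h, 3))
```

Within one column the letters are stacked in proportion to their frequencies, while the total stack height shows how conserved the position is overall.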

2.3 Notes

a) MEME does not support gaps within a motif.
b) The parameters used for motif discovery on Linux are the same as those of the web version of MEME.

2.4 Citation

Timothy L. Bailey and Charles Elkan, "Fitting a mixture model by expectation maximization to discover motifs in biopolymers", Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994.

