What are eigengenes and gene modules?

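While learning WGCNA recently, I ran into a few terms I did not fully understand. I searched online and found an answer, but the page is not reachable over a normal internet connection, so I have copied it here for anyone who needs it.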

First answer

I am not sure in what context you are referring to the terms 'eigengene' and 'gene module', but my best guess is that you mean them in the context of WGCNA (Weighted Gene Co-expression Network Analysis).

If you want to run a WGCNA analysis on a gene expression dataset, the general principle is as follows. First, you build a correlation network from the genes based on their co-expression: each gene is a node, and you put an edge between two genes if their co-expression strength passes a set threshold. (WGCNA itself keeps every edge and weights it by a power of the correlation, which is where the "weighted" in its name comes from.) Sometimes people build a Topological Overlap Matrix (TOM)[1] on top of the correlation network, but you do not need to worry about that at the moment. Once you have a network, you run hierarchical clustering[2] on the most connected genes. This is an unsupervised learning method in which a tree is built from the bottom up by repeatedly joining the two nearest genes (or clusters), according to a distance measure that you choose. When the tree is complete, you have a number of clusters in which the genes are tightly connected.
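The network-building step can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the WGCNA package's implementation: the function name `coexpression_adjacency`, the toy data, and the exponent `beta=6` are all made up for the example. It weights every gene pair by |correlation|^beta (the soft-thresholding idea) instead of applying a hard cutoff.

```python
import numpy as np

def coexpression_adjacency(expr, beta=6):
    """expr: samples x genes array. Returns a genes x genes weighted adjacency."""
    corr = np.corrcoef(expr, rowvar=False)  # gene-gene Pearson correlation
    adj = np.abs(corr) ** beta              # soft threshold: weight = |cor|^beta
    np.fill_diagonal(adj, 0.0)              # ignore self-connections
    return adj

# Toy data (hypothetical): 20 samples, 4 genes; genes 0 and 1 share a signal.
rng = np.random.default_rng(0)
signal = rng.normal(size=20)
expr = np.column_stack([
    signal + 0.1 * rng.normal(size=20),
    signal + 0.1 * rng.normal(size=20),
    rng.normal(size=20),
    rng.normal(size=20),
])

adj = coexpression_adjacency(expr)
connectivity = adj.sum(axis=0)  # total edge weight attached to each gene
```

Raising the correlation to a power pushes weak correlations towards zero while leaving strong ones close to one, so the two co-expressed genes end up with a much heavier edge than any pair involving the unrelated genes.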

After building the tree, you cut it at a certain height (distance); why and how to do that is explained very well in reference [2]. Cutting the tree yields a number of modules in which the genes are highly connected and which may provide biological insight. These modules are called "gene modules".
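As a rough sketch of clustering and tree cutting, the following Python example uses SciPy's `linkage` and `fcluster`. The choices here are assumptions made for illustration: the dissimilarity is 1 - |correlation|, the linkage is average linkage, and the tree is cut at a fixed height of 0.5. The WGCNA tutorial works from a TOM-based dissimilarity and a dynamic tree cut instead, so treat this as the general idea only. The six genes are synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Toy data (hypothetical): 30 samples, 6 genes.
# Genes 0-2 follow a sine pattern, genes 3-5 follow a cosine pattern.
t = np.linspace(0, 2 * np.pi, 30, endpoint=False)
rng = np.random.default_rng(1)
expr = np.column_stack(
    [np.sin(t) + 0.1 * rng.normal(size=30) for _ in range(3)]
    + [np.cos(t) + 0.1 * rng.normal(size=30) for _ in range(3)]
)

corr = np.corrcoef(expr, rowvar=False)
dist = 1.0 - np.abs(corr)    # strongly co-expressed genes are close together
np.fill_diagonal(dist, 0.0)  # a gene is at distance 0 from itself

# Build the tree bottom-up, then cut it at height 0.5 to obtain modules.
tree = linkage(squareform(dist, checks=False), method="average")
modules = fcluster(tree, t=0.5, criterion="distance")
```

Here `modules` holds one integer label per gene. With this toy data the first three genes should share one label and the last three another, which gives two gene modules.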

When you want to compare one gene module against another, it can be advantageous to work with a single representative of each module rather than with all of its genes. To get one, you run a Principal Component Analysis[3] on the module's expression data, which reduces the data meaningfully, and take the first principal component as a summary of the module. In this context, that first principal component is called the "eigengene".

You can find all of the necessary terminology regarding WGCNA here[4]. A brilliant tutorial covering every step of a WGCNA analysis can be found here[5]. It is written by the authors of the WGCNA R package.
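A minimal sketch of computing an eigengene for one module, using NumPy's singular value decomposition, might look like this. The function name `module_eigengene` and the toy data are hypothetical. Standardizing each gene and aligning the sign with the average expression are choices made for this example.

```python
import numpy as np

def module_eigengene(expr_module):
    """expr_module: samples x genes array for one module.
    Returns (eigengene with one value per sample, fraction of variance explained)."""
    # Standardize each gene so that no single gene dominates the component.
    z = (expr_module - expr_module.mean(axis=0)) / expr_module.std(axis=0, ddof=1)
    # The first left singular vector gives the per-sample scores of the first PC.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # The sign of a principal component is arbitrary; align it with mean expression.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    var_explained = s[0] ** 2 / np.sum(s ** 2)
    return eigengene, var_explained

# Toy module (hypothetical): 5 genes that follow one pattern at different scales.
t = np.linspace(0, 2 * np.pi, 30, endpoint=False)
signal = np.sin(t)
rng = np.random.default_rng(0)
expr_module = np.column_stack(
    [scale * signal + 0.1 * rng.normal(size=30) for scale in (1.0, 2.0, 0.5, 1.5, 3.0)]
)

eigengene, var_explained = module_eigengene(expr_module)
```

The eigengene has one value per sample, so two modules can be compared by correlating their eigengenes instead of comparing every pair of genes. `var_explained` indicates how well that single profile summarizes the module.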

Second answer

Genomic data such as gene expression data and variant data have very high dimensionality, i.e. there are too many variables, and few data points. When you have a gene expression dataset, you may be interested in identifying groups of genes which show similar expression patterns.

One way to do this is WGCNA, or weighted gene co-expression network analysis. In simple terms, you are trying to identify genes that show similar expression patterns across samples or conditions. These gene groups are called modules. WGCNA summarizes each module with a form of principal component analysis (PCA): the module is represented by a single expression profile, the module 'eigengene', which is derived from the PCA. None of the actual genes in the module needs to have exactly this expression profile.

Since each eigengene represents a module, the distance of a gene from the eigengene, and therefore from the centre of the module, can be calculated. This tells us which module each gene lies in.
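One way to make "distance from the eigengene" concrete is to correlate each gene with each eigengene. The sketch below assumes Pearson correlation as the measure of closeness (WGCNA refers to this kind of quantity as module membership). The function name `module_membership`, the stand-in eigengenes, and the toy genes are hypothetical.

```python
import numpy as np

def module_membership(expr, eigengenes):
    """expr: samples x genes; eigengenes: samples x modules.
    Returns a genes x modules matrix of Pearson correlations."""
    ze = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    zm = (eigengenes - eigengenes.mean(axis=0)) / eigengenes.std(axis=0)
    return ze.T @ zm / expr.shape[0]

# Toy data (hypothetical): two module eigengenes and three genes.
t = np.linspace(0, 2 * np.pi, 30, endpoint=False)
eigengenes = np.column_stack([np.sin(t), np.cos(t)])
rng = np.random.default_rng(2)
expr = np.column_stack([
    np.sin(t) + 0.1 * rng.normal(size=30),        # follows module 0
    2.0 * np.sin(t) + 0.1 * rng.normal(size=30),  # follows module 0
    np.cos(t) + 0.1 * rng.normal(size=30),        # follows module 1
])

membership = module_membership(expr, eigengenes)
# Assign each gene to the module whose eigengene it correlates with most strongly.
closest_module = np.argmax(np.abs(membership), axis=1)
```

A correlation near 1 (or near -1 in an unsigned analysis) means the gene sits close to the centre of that module, while a value near 0 means it is far from it.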
