Despite this popularity and widespread use, many aspects of the Gene Ontology remain poorly understood, at times even by experts. For instance, unbeknownst to most users, routine procedures such as GO term enrichment analyses remain subject to biases and simplifying assumptions that can lead to spurious conclusions. -- Christophe Dessimoz and Nives Škunca
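I. The concept of an ontology
An ontology describes the entities of a domain of interest and the relationships among them in a structured, computable format ("Ontologies are computational structures that describe the entities and relationships of a domain of interest in a structured computable format, which allows for their use in multiple applications"). Its core is therefore a set of entities (classes) and the relationships that connect them. For the Gene Ontology, the core is accordingly the key entities of gene research and the relationships among them.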
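II. Elements of an ontology
1. Classes
A class is the basic unit of an ontology. It represents a category of things in a particular domain and is usually associated with an identifier that is unique within the ontology's namespace, such as GO:0006423. To keep identifiers stable, they carry no reference to the class name or definition ("they do not contain a reference to the class name or definition"). I take this to mean that a GO ID has no meaning of its own: it is just a combination of letters and digits, and it acquires a definite meaning only once it is associated with metadata. The ID can therefore stay the same indefinitely, and only the metadata needs to change.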
2. Metadata
Metadata is explanatory text, in essence the definition attached to a GO ID. The definition should be unambiguous and informative enough to serve its purpose ("sufficiently distinguishing different classes in an ontology so that a user can determine which is the best to use for annotation"). Each GO ID can correspond to several pieces of metadata.
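The separation between a stable identifier and its changeable metadata can be sketched in Python as follows. The `OntologyClass` type, its fields, and the `rename` helper are hypothetical names chosen for this illustration, not part of any GO library, and the name and definition strings are placeholders rather than text from a GO release.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    """Hypothetical container: an opaque ID plus the metadata attached to it."""
    identifier: str   # opaque, e.g. "GO:0006423"; never encodes the name
    name: str         # human-readable label (metadata)
    definition: str   # free-text definition (metadata)
    synonyms: list[str] = field(default_factory=list)  # further metadata


def rename(cls: OntologyClass, new_name: str) -> OntologyClass:
    """Revise the label, keeping the old one as a synonym; the ID is untouched."""
    cls.synonyms.append(cls.name)
    cls.name = new_name
    return cls


# Placeholder metadata, not taken from a GO release.
term = OntologyClass("GO:0006423", "placeholder name", "placeholder definition")
rename(term, "revised placeholder name")
print(term.identifier, term.name, term.synonyms)
```

Anything that stored only the identifier keeps working after the rename, which is the point of keeping the ID free of meaning.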
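3. Relations
Classes are usually arranged in a hierarchy. It resembles a tree in the data-structure sense but is not a simple tree: it is a directed acyclic graph, so it must not contain cycles. Commonly used relationship types include: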
| relationship type | informal meaning | examples |
|---|---|---|
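| part_of | parthood: one entity is part of another | a brain is part_of a body |
| derives_from | derivation between distinct entities, where the later entity briefly inherits from the earlier ones | a zygote derives_from a sperm and an ovum |
| has_participant | links a process to an entity that takes part in it | an apoptotic process has_participant a cell |
| has_function | links a material entity to its function | an enzyme has_function to catalyse a specific reaction type |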
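To make the graph structure concrete, here is a minimal sketch that stores typed edges between classes, checks that the graph is acyclic, and collects every ancestor of a class. The edges are toy data (only `brain part_of body` comes from the table above, and `is_a` is added for illustration); `EDGES`, `build_graph`, `is_acyclic` and `ancestors` are hypothetical names.

```python
from collections import defaultdict

# Toy edges: (child, relation, parent). "brain" has two parents,
# which a tree would not allow but a directed acyclic graph does.
EDGES = [
    ("brain", "part_of", "body"),
    ("brain", "is_a", "organ"),
    ("hippocampus", "part_of", "brain"),
    ("organ", "part_of", "body"),
]


def build_graph(edges):
    """Map each child to its list of (relation, parent) pairs."""
    graph = defaultdict(list)
    for child, relation, parent in edges:
        graph[child].append((relation, parent))
    return graph


def is_acyclic(graph):
    """Depth-first search; meeting a node that is still being visited means a cycle."""
    visiting, done = 1, 2
    state = {}

    def visit(node):
        if state.get(node) == done:
            return True
        if state.get(node) == visiting:
            return False
        state[node] = visiting
        for _, parent in graph.get(node, []):
            if not visit(parent):
                return False
        state[node] = done
        return True

    return all(visit(node) for node in list(graph))


def ancestors(graph, node):
    """Every class reachable by following edges upward from node."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for _, parent in graph.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


graph = build_graph(EDGES)
print(is_acyclic(graph), sorted(ancestors(graph, "hippocampus")))
```

The ancestor walk here ignores the relation type; whether an annotation should propagate across a given relation is a separate decision that this sketch does not model.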
4. Formats
GO is currently stored in the human-readable Open Biomedical Ontologies (OBO) format. A more recently developed alternative is the Semantic Web standard Web Ontology Language (OWL).
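As a rough illustration of why OBO is described as human-readable, the sketch below parses a single `[Term]` stanza into a dictionary. The stanza is a hand-written toy with a made-up `EX:` prefix, not copied from a GO release, and `parse_stanza` is a hypothetical helper that handles only simple `tag: value` lines. Real files are better read with a dedicated parser.

```python
# A hand-written toy stanza; the EX: identifiers do not exist in any ontology.
STANZA = """\
[Term]
id: EX:0000002
name: example child process
namespace: example_namespace
def: "A toy definition used only for this sketch." [EX:author]
is_a: EX:0000001 ! example parent process
relationship: part_of EX:0000003 ! example whole
"""


def parse_stanza(text):
    """Collect tag: value pairs; repeated tags accumulate in a list."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(": ")
        value = value.split(" ! ")[0]  # drop the trailing "! comment"
        record.setdefault(tag, []).append(value)
    return record


term = parse_stanza(STANZA)
print(term["id"], term["is_a"], term["relationship"])
```

Each value is kept as a list because tags such as `is_a` may appear more than once in a stanza.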
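5. Axioms
Axioms are the logical meaning implied by the ontology's statements. For example, saying that carboxylic acid is a carbon oxoacid implies that every carboxylic acid is a carbon oxoacid. The logical constructs of the OWL language (copied directly from the book):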
| language component | informal meaning | examples |
|---|---|---|
| quantification: universal (only) or existential (some) | universal quantification: for all relationships of that type, the target has to belong to the specified class; existential quantification: at least one member of the target class must participate in a relationship of that type | molecule has_part some atom |
| cardinality:exact, minimum or maximum | It is possible to specify the number of relationships with a given type and target that a class must participate in, or a minimum or maximum number thereof | human has_part exactly 2 leg |
| logical connectives: intersection or union | It is possible to build complex expressions by joining together parts using the standard logical connectives and, or. | vitamin B equivalentTo |
| Negation(not) | in addition to building complex expressions using the logical connectives, it is possible to compose negations | tailless equivalentTo not |
| Disjointness of classes | It is possible to specify that classes should not share any members | organic disjointFrom inorganic |
| equivalence of classes | It is possible to specify that two classes, or class expressions, are logically equivalent, and that they must by definition thus share all their members | melanoma equivalentTo skin cancer |
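A few of these constructs can be mimicked with plain Python collections to show what they assert. This only illustrates the semantics on toy data and is not an OWL reasoner; the instance data and the helper names (`has_some`, `has_exactly`, `disjoint`, `subclass_of`) are assumptions made for this sketch.

```python
# Toy instance data: each individual maps to the classes of the parts it has.
PARTS = {
    "water_molecule": ["atom", "atom", "atom"],
    "alice": ["leg", "leg", "arm", "arm"],
}

# Toy class membership, listing a few individuals per class.
MEMBERS = {
    "organic": {"glucose", "ethanol"},
    "inorganic": {"sodium chloride"},
    "carboxylic acid": {"acetic acid"},
    "carbon oxoacid": {"acetic acid", "carbonic acid"},
}


def has_some(individual, part_class):
    """Existential (some): at least one has_part relationship targets part_class."""
    return PARTS[individual].count(part_class) >= 1


def has_exactly(individual, n, part_class):
    """Exact cardinality: exactly n has_part relationships target part_class."""
    return PARTS[individual].count(part_class) == n


def disjoint(class_a, class_b):
    """Disjointness: the two classes share no members."""
    return MEMBERS[class_a].isdisjoint(MEMBERS[class_b])


def subclass_of(sub, sup):
    """Subclass axiom on toy data: every member of sub is also a member of sup."""
    return MEMBERS[sub] <= MEMBERS[sup]


print(has_some("water_molecule", "atom"))                # molecule has_part some atom
print(has_exactly("alice", 2, "leg"))                    # human has_part exactly 2 leg
print(disjoint("organic", "inorganic"))                  # organic disjointFrom inorganic
print(subclass_of("carboxylic acid", "carbon oxoacid"))  # the carboxylic acid example
```

Note the difference in direction: these functions test whether given data satisfies an axiom, whereas in an ontology the axiom is asserted and a reasoner derives consequences from it.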
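III. Limitations
Ontologies are built on logic. They express True or False well, but they cannot express fuzzy concepts or concepts that hold only under certain conditions. They also struggle to represent dynamic information.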
References
1. geneontology.org/docs/ontology-documentation/
2. The Gene Ontology Handbook