1. Overall storage structure of the GO database
The GO database is distributed in four forms: termdb, assocdb, seqdb, and the full GO database. They differ as follows:
termdb (updated daily): a database containing just the information on the GO terms and relationships. (Contents: ontologies, definitions and mappings to other dbs)
assocdb (updated weekly): a database containing both the GO vocabulary and associations between GO terms and gene products. This database subsumes termdb. (Contents: all manual gene product annotations; electronic annotations (IEA) from all databases other than UniProtKB)
seqdb (updated weekly): a database containing GO terms, gene products and the sequences associated with these gene products. This db subsumes the two above. (Contents: everything in the two databases above, plus protein sequences for most of the gene products)
full GO database (updated monthly): termdb (above), plus manual and electronically generated (IEA) annotations.
2. The table designs of the different databases are nested
Essentially, the data in seqdb includes everything in assocdb, and the data in assocdb in turn includes everything in termdb, that is:
seqdb>assocdb>termdb
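The containment above can be sketched with plain Python sets. This is only an illustration: the category labels are paraphrased from the descriptions in section 1, and the names `TERMDB`, `ASSOCDB`, `SEQDB` and `is_nested` are made up for this example, not part of any GO tooling.

```python
# Content categories paraphrased from the termdb/assocdb/seqdb descriptions.
# The labels are illustrative only, not real table or file names.
TERMDB = frozenset({"ontologies", "definitions", "mappings to other dbs"})
ASSOCDB = TERMDB | {
    "manual gene product annotations",
    "IEA annotations (non-UniProtKB)",
}
SEQDB = ASSOCDB | {"protein sequences"}


def is_nested(*levels):
    """Return True if every set is a strict superset of the one after it."""
    return all(outer > inner for outer, inner in zip(levels, levels[1:]))


print(is_nested(SEQDB, ASSOCDB, TERMDB))  # prints True
```

Each larger download only adds categories on top of the smaller one, which is why a query that works against termdb should in principle also work against assocdb or seqdb.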
Across the whole GO database there are 14 main tables.
Drawn as a mind map, they look like this:

3. GO provides a MySQL version of the GO database for local download
The MySQL database is organized into 8 schemas:
- go_associations
- go_audit
- go_general
- go_graph
- go_homology
- go_meta
- go_optimisations
- go_sequence
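The official site documents in detail how the tables under each schema are designed (including the definitions of foreign keys, primary keys, and so on): http://geneontology.org/page/go-mysql-database-schema
- Take the go_associations.association table under the go_associations schema as an example: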
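To give a feel for how an association table links terms to gene products through foreign keys, here is a minimal sketch using Python's built-in `sqlite3` as a stand-in for MySQL. The table and column names (`term.acc`, `association.term_id`, `association.gene_product_id`, `gene_product.symbol`) are assumptions based on a simplified reading of the GO schema and should be checked against the official schema page; the rows are toy data, and the helper names `build_mock_go_db` and `gene_products_for_term` are hypothetical.

```python
import sqlite3


def build_mock_go_db():
    """Create an in-memory toy database with three simplified, assumed tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE term (
            id   INTEGER PRIMARY KEY,
            acc  TEXT NOT NULL UNIQUE,   -- accession string (toy values here)
            name TEXT NOT NULL
        );
        CREATE TABLE gene_product (
            id     INTEGER PRIMARY KEY,
            symbol TEXT NOT NULL
        );
        -- association joins a term to a gene product via two foreign keys
        CREATE TABLE association (
            id              INTEGER PRIMARY KEY,
            term_id         INTEGER NOT NULL REFERENCES term(id),
            gene_product_id INTEGER NOT NULL REFERENCES gene_product(id)
        );
        INSERT INTO term VALUES (1, 'GO:TOY0001', 'toy term A');
        INSERT INTO term VALUES (2, 'GO:TOY0002', 'toy term B');
        INSERT INTO gene_product VALUES (1, 'geneX');
        INSERT INTO gene_product VALUES (2, 'geneY');
        INSERT INTO association VALUES (1, 1, 1);
        INSERT INTO association VALUES (2, 1, 2);
        INSERT INTO association VALUES (3, 2, 2);
        """
    )
    return conn


def gene_products_for_term(conn, acc):
    """Return the sorted symbols of gene products annotated to the given term."""
    rows = conn.execute(
        """
        SELECT gp.symbol
        FROM association AS a
        JOIN term AS t          ON t.id = a.term_id
        JOIN gene_product AS gp ON gp.id = a.gene_product_id
        WHERE t.acc = ?
        ORDER BY gp.symbol
        """,
        (acc,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_mock_go_db()
print(gene_products_for_term(conn, "GO:TOY0001"))
```

Against a real local MySQL copy the same join pattern would apply, but the actual tables carry more columns than shown here, so treat this as a sketch of the foreign-key structure rather than the real table definition.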

The GO website also provides an E-R diagram:

So the overall design of the GO database is fairly complex. If you need to develop a database of this kind yourself, it can serve as a reference.