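Some 20 years after the Human Genome Project was launched, many versions of the human genome assembly have been released:
2013: GRCh38/hg38 (latest)
2009: GRCh37/hg19 (commonly used)
2006: NCBI36/hg18 (older)
2004: NCBI35/hg17 (older)
To map chromosomal positions from one assembly version to another, UCSC provides the liftOver tool. Its official description is: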
This tool converts genome coordinates and genome annotation files between assemblies.
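Online version
The tool has an online version, Lift Genome Annotations. On the page, select the species (Original Genome:), the source assembly (Original Assembly:), the target species (New Genome:), and the target assembly (New Assembly:), then paste or upload a BED file. The results report how many records were converted successfully and how many were not.
Linux version
The Linux version is very simple to use. The examples here are hg38>hg19 and hg19>hg38.
1. Environment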
Linux Ubuntu
2. Download the tool
I downloaded the linux.x86_64 build; for other builds, see The UCSC Genome Browser and Blat software.
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/liftOver
3. Download the coordinate conversion (chain) files
hg38ToHg19 chain file
hg19ToHg38 chain file
These files do not need to be decompressed.
For chain files for other assemblies, see Sequence and Annotation Downloads and download the corresponding liftOver file.
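The chain files can also be fetched from the command line. The sketch below only builds the download URLs; it assumes that chain files sit under goldenPath/<from>/liftOver/ on the same host as the liftOver binary and are named <from>To<To>.over.chain.gz, with the first letter of the target assembly capitalised. The helper name chain_url is made up for illustration, so check the URL against the downloads page before relying on it.

```shell
# Build the download URL for a liftOver chain file.
# Assumption: files live under goldenPath/<from>/liftOver/ and are named
# <from>To<To>.over.chain.gz (first letter of the target capitalised).
chain_url() {
  from=$1
  to=$2
  first=$(printf '%s' "$to" | cut -c1 | tr '[:lower:]' '[:upper:]')
  rest=$(printf '%s' "$to" | cut -c2-)
  printf 'http://hgdownload.cse.ucsc.edu/goldenPath/%s/liftOver/%sTo%s%s.over.chain.gz\n' \
    "$from" "$from" "$first" "$rest"
}

# Print the two URLs used in this post
chain_url hg38 hg19
chain_url hg19 hg38

# To actually download one of them:
# wget "$(chain_url hg38 hg19)"
```

The download itself is left as a comment so that the block has no side effects when pasted into a shell.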
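4. Input file
By default liftOver expects a BED file. Only the first three columns (chr, start, end) are required, and there is no header line.
Note: start must not equal end, because BED intervals are half-open. For single-base sites, add 1 to every end.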
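The end+1 adjustment can be applied with a one-line awk filter. This is a minimal sketch that follows the suggestion above literally: any row whose start equals its end gets end = start + 1, and all rows are written tab-separated. The function name fix_single_base is made up for illustration. If your positions are 1-based, subtracting 1 from start instead is the more conventional BED representation, so decide which one matches your data.

```shell
# Widen zero-length records (start == end) to one base by setting end = start + 1.
# Every row is re-emitted tab-separated; extra columns are kept as-is.
fix_single_base() {
  awk 'BEGIN { OFS = "\t" }
       {
         if ($2 == $3) $3 = $2 + 1   # single-site row: extend end by one
         $1 = $1                     # force the record to be rebuilt with OFS
         print
       }' "$@"
}

# Example: the first row is widened, the second is left unchanged
printf 'chr1 3900238 3900238 C1orf174\nchr1 25272497 25272498 RHD\n' | fix_single_base
```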
yangyang@DESKTOP-SGNIV47:/mnt/d/Work/liftover$ head hg38test.bed
chr1 3900238 3900239 C1orf174
chr1 25272497 25272498 RHD
chr1 25420837 25420838 RHCE
chr1 26023102 26023103 EXTL1
chr1 45004276 45004277 HECTD3
chr1 46188924 46188925 POMGNT1
chr1 53772442 53772443 NDC1
chr1 69834837 69834838 LRRC7
chr1 88786134 88786135 PKN2
chr1 110519119 110519120 KCNA10
yangyang@DESKTOP-SGNIV47:/mnt/d/Work/liftover$ head hg19test.bed
chr1 3816802 3816803 C1orf174
chr1 25598988 25598989 RHD
chr1 25747328 25747329 RHCE
chr1 26349593 26349594 EXTL1
chr1 45469948 45469949 HECTD3
chr1 46654596 46654597 POMGNT1
chr1 54238115 54238116 NDC1
chr1 70300520 70300521 LRRC7
chr1 89251817 89251818 PKN2
chr1 111061741 111061742 KCNA10
5. Coordinate conversion
Two simple commands are all it takes:
1. Make liftOver executable.
2. Run it. The arguments are inputfile, over.chain.gz, outputfile, and unmapfile (the lines that could not be mapped are written to unmapfile).
$ chmod +x ./filePath/utility_name
$ ./filePath/utility_name inputfile over.chain.gz outputfile unmapfile
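Since the command-line tool prints no summary of its own, a small wrapper can report how many records were mapped and how many were not, much like the online version does. The sketch below assumes liftOver sits in the current directory and that the unmapped file may contain lines starting with '#' that should not be counted as records. The function names count_records and lift_bed and the output naming scheme are made up for illustration.

```shell
# Count data records in a BED-like file, skipping blank lines and '#' comment lines.
count_records() {
  awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# Hypothetical wrapper: lift_bed INPUT CHAIN PREFIX
# Writes PREFIX.bed and PREFIX.unmap.bed, then prints a short summary.
lift_bed() {
  input=$1
  chain=$2
  prefix=$3
  ./liftOver "$input" "$chain" "$prefix.bed" "$prefix.unmap.bed" || return 1
  echo "input:    $(count_records "$input")"
  echo "mapped:   $(count_records "$prefix.bed")"
  echo "unmapped: $(count_records "$prefix.unmap.bed")"
}

# Usage:
# lift_bed hg38test.bed hg38ToHg19.over.chain.gz hg38Tohg19
```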
Example:
hg38>hg19
yangyang@DESKTOP-SGNIV47:/mnt/d/Work/liftover$ chmod +x liftOver
yangyang@DESKTOP-SGNIV47:/mnt/d/Work/liftover$ ./liftOver hg38test.bed hg38ToHg19.over.chain.gz hg38Tohg19.bed hg38Tohg19Unmap.bed
Reading liftover chains
Mapping coordinates
yangyang@DESKTOP-SGNIV47:/mnt/d/Work/liftover$ head hg38Tohg19.bed
chr1 3816802 3816803 C1orf174
chr1 25598988 25598989 RHD
chr1 25747328 25747329 RHCE
chr1 26349593 26349594 EXTL1
chr1 45469948 45469949 HECTD3
chr1 46654596 46654597 POMGNT1
chr1 54238115 54238116 NDC1
chr1 70300520 70300521 LRRC7
chr1 89251817 89251818 PKN2
chr1 111061741 111061742 KCNA10
hg19>hg38
yangyang@DESKTOP-SGNIV47:/mnt/d/Work/liftover$ chmod +x liftOver
yangyang@DESKTOP-SGNIV47:/mnt/d/Work/liftover$ ./liftOver hg19test.bed hg19ToHg38.over.chain.gz hg19Tohg38.bed hg19Tohg38Unmap.bed
Reading liftover chains
Mapping coordinates
yangyang@DESKTOP-SGNIV47:/mnt/d/Work/liftover$ head hg19Tohg38.bed
chr1 3900238 3900239 C1orf174
chr1 25272497 25272498 RHD
chr1 25420837 25420838 RHCE
chr1 26023102 26023103 EXTL1
chr1 45004276 45004277 HECTD3
chr1 46188924 46188925 POMGNT1
chr1 53772442 53772443 NDC1
chr1 69834837 69834838 LRRC7
chr1 88786134 88786135 PKN2
chr1 110519119 110519120 KCNA10
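In both examples the first ten lifted lines match the corresponding test file for the other assembly. To check this over whole files rather than by eye, the two files can be compared on their first three columns. This is a rough sketch: the helper names bed_key and same_bed are made up for illustration, and the comparison ignores column 4, whitespace style, and row order.

```shell
# Print the first three BED columns, tab-separated and sorted, so that
# whitespace and row-order differences do not affect the comparison.
bed_key() {
  awk 'BEGIN { OFS = "\t" } !/^#/ && NF >= 3 { print $1, $2, $3 }' "$1" | sort
}

# same_bed A B: succeeds if both files contain the same set of intervals.
same_bed() {
  [ "$(bed_key "$1")" = "$(bed_key "$2")" ]
}

# Compare the lifted hg38>hg19 output with the hg19 test file, if both exist
if [ -f hg38Tohg19.bed ] && [ -f hg19test.bed ]; then
  if same_bed hg38Tohg19.bed hg19test.bed; then
    echo "identical intervals"
  else
    echo "intervals differ"
  fi
fi
```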