首先在計(jì)算Froh所需要的文件是利用detectRUNS包計(jì)算得到的ROH的長度分布數(shù)據(jù)
library(detectRUNS)
genotypeFilePath <- ("immigrant_out2.ped")
mapFilePath <- ("immigrant_out2.map")
slidingRuns <- slidingRUNS.run(genotypeFile = "immigrant_out2.ped",mapFile = "immigrant_out2.map", windowSize = 50, threshold = 0.05, minSNP = 50, ROHet = FALSE, maxOppWindow = 3, maxMissWindow = 5, maxGap = 10^5, minLengthBps = 500000, minDensity = 1/50)
summaryList <- summaryRuns( runs = slidingRuns, mapFile = mapFilePath, genotypeFile = genotypeFilePath, Class = 6, snpInRuns = TRUE)
write.table(slidingRuns,file = "./immigrant_roh.csv",sep = '\t', row.names = F, quote = F)
write.table(summaryList$result_Froh_genome_wide, file = "./immigrant_Froh.csv", sep = '\t', row.names = F, quote = F)
然后我想篩選出那些長度較長的ROH片段,單獨(dú)計(jì)算Froh
由于個(gè)體數(shù)較多,所以想利用python腳本生成命令集,進(jìn)行批量處理
### 篩選符合長度的ROH
awk '$7 > 1000000 && $7 < 1500000' offspring_roh.csv > offspring_roh1.txt
### 利用Python腳本生成命令集
#!/usr/bin/env python
inputfile = "off_ind.txt"
outputfile = "mean.sh"
output = open(outputfile,"w")
for line in open(inputfile):
output.write('grep -i ' + '"' + '^' + line.rstrip() + '" ' + 'offspring_roh1.txt ' + '> ' + line.rstrip() + '.txt' + '\n' + 'awk ' + '\'' + 'BEGIN{sum = 0} {sum += $7} END {print ' + '"' + line.rstrip() + ' "' + 'sum/2212050392}' + '\' ' + line.rstrip() + '.txt' + ' >> ' + 'mean.txt' + '\n')
output.close()
### 生成的命令集合如下
grep -i "^ERR1990133" offspring_roh1.txt > ERR1990133.txt
awk 'BEGIN{sum = 0} {sum += $7} END {print "ERR1990133 "sum/2212050392}' ERR1990133.txt > mean.txt
grep -i "^ERR1990134" offspring_roh1.txt > ERR1990134.txt
awk 'BEGIN{sum = 0} {sum += $7} END {print "ERR1990134 "sum/2212050392}' ERR1990134.txt >> mean.txt
grep -i "^ERR1990136" offspring_roh1.txt > ERR1990136.txt
awk 'BEGIN{sum = 0} {sum += $7} END {print "ERR1990136 "sum/2212050392}' ERR1990136.txt >> mean.txt
grep -i "^ERR1990139" offspring_roh1.txt > ERR1990139.txt
awk 'BEGIN{sum = 0} {sum += $7} END {print "ERR1990139 "sum/2212050392}' ERR1990139.txt >> mean.txt
grep -i "^ERR1990140" offspring_roh1.txt > ERR1990140.txt
awk 'BEGIN{sum = 0} {sum += $7} END {print "ERR1990140 "sum/2212050392}' ERR1990140.txt >> mean.txt
grep -i "^ERR1990144" offspring_roh1.txt > ERR1990144.txt
awk 'BEGIN{sum = 0} {sum += $7} END {print "ERR1990144 "sum/2212050392}' ERR1990144.txt >> mean.txt
### 運(yùn)行sh
sh mean.sh
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡(jiǎn)書系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。