對(duì)于更新版的Affymetrix芯片,如Affymetrix Human Gene 1.1 ST Array,affy就顯得無(wú)能為力了。這時(shí)候oligo包就粉末登場(chǎng)了。 下面以GSE48452為例
核心函數(shù):
- read.celfiles()
- oligo::rma()
1. 數(shù)據(jù)下載及讀取
rm(list = ls())
library(GEOquery)
library(oligo)
library(affy)
gse="GSE43332"
#rawdata <- getGEOSuppFiles(gse) #下載原始數(shù)據(jù)
#setwd(gse)
gset <- getGEO(gse,getGPL = F,destdir = ".")
untar("GSE43332_RAW.tar",exdir = ".")
celfiles <- list.files(pattern = "*CEL.gz$") #批量查找并列出后綴為.gz的文件
#sapply(celfiles, gunzip) #批量解壓
data.raw <- read.celfiles(celfiles)
## Reading in : GSM1060627_1A_ML_Cindy_9-8-10_2.CEL.gz
## Reading in : GSM1060628_1B_ML_Cindy_9-8-10.CEL.gz
## Reading in : GSM1060629_2A_N_Cindy_9-8-10.CEL.gz
## Reading in : GSM1060630_2B_N_Cindy_9-8-10.CEL.gz
## Reading in : GSM1060631_3A_NRa_Cindy_9-8-10.CEL.gz
## Reading in : GSM1060632_3B_NRa_Cindy_9-8-10.CEL.gz
## Reading in : GSM1060633_5A_DU145_Cindy_9-8-10.CEL.gz
## Reading in : GSM1060634_5B_DU145_Cindy_9-8-10.CEL.gz
## Reading in : GSM1060635_6A_DU145Ra_Cindy_09-08-10.CEL.gz
## Reading in : GSM1060636_6B_DU145Ra_Cindy_9-8-10.CEL.gz
## Reading in : GSM1060637_1Clone1-P09_Cindy_8-30-11_HuGene-1_0-st-v1_.CEL.gz
## Reading in : GSM1060638_2Clone1-P10_Cindy_8-30-11_HuGene-1_0-st-v1_.CEL.gz
## Reading in : GSM1060639_3Clone3-P05_Cindy_8-30-11_HuGene-1_0-st-v1_.CEL.gz
## Reading in : GSM1060640_4Clone3-P06_Cindy_8-30-11_HuGene-1_0-st-v1_.CEL.gz
sampleNames(data.raw) <- sapply(strsplit(sampleNames(data.raw),"_",fixed=T), "[",1)
#group
pd <- pData(gset[[1]])
pd <- pd %>%
select(geo_accession,'bone metastatic potential:ch1')
colnames(pd) <- c("id","type")
pd$type <- case_when(pd$type=="non-bone metastatic"~"nbm",
T~"bm")
pd$type <- factor(pd$type,levels = c("nbm","bm"))
pd <- pd[order(pd$type),]
group_list <- pd$type
names(group_list) <- pd$id
2.數(shù)據(jù)預(yù)處理
oligo讀取的CEL文件質(zhì)量控制時(shí),也需要數(shù)據(jù)擬合fitProbeLevelModel,類(lèi)似affyPLM的功能。
fit1 <- fitProbeLevelModel(data.raw)
image(fit1,type="weights",which=1,main="weights")

image(fit1,type="residuals",which=1,main="Residuals")

image(fit1,type="sign.residuals",which=1,main="Residuals.sign")

data.eset <- oligo::rma(data.raw)
## Background correcting
## Normalizing
## Calculating Expression
# featureData(data.eset) <- getNetAffx(gene.eset, "probeset") #有時(shí)需自己把ID賦值給表達(dá)矩陣
data.exprs <- exprs(data.eset)
library(RColorBrewer)
colors <- brewer.pal(12,"Set2")
boxplot(data.exprs,col=colors,main="afterRMA")

有一點(diǎn)很疑惑,為什么oligoPLM對(duì)象不能計(jì)算RLE、NUSE和RNA降解情況?
其余的與affy處理類(lèi)似。
參考鏈接: