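Quantile normalization maps several different distributions onto one common distribution. You can either specify a reference distribution and map the other distributions onto it, or compute a reference distribution from the data sets being processed.

How it works:

Take the raw array below, in which each column represents one distribution; the goal is to transform all columns to the same distribution.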

A    5    4    3
B    2    1    4
C    3    4    6
D    4    2    8

Sort each column of the raw array in ascending order.

For each element of the raw array, record its rank within its own column after that ascending sort.

A    iv    iii   i
B    i     i     ii
C    ii    iii   iii
D    iii   ii    iv
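The rank table above can be reproduced with a few lines of base R. This is only a minimal sketch: the variable names `raw` and `ranks` are made up for illustration, and `ties.method = "min"` is an assumption chosen so that the two tied 4s in the second column both show up as rank 3 (iii), as in the table.

```r
# Raw array from the example; matrix() fills column by column
raw <- matrix(c(5, 2, 3, 4,    # first column
                4, 1, 4, 2,    # second column
                3, 4, 6, 8),   # third column
              ncol = 3,
              dimnames = list(c("A", "B", "C", "D"), NULL))

# Rank every value within its own column; ties share the lowest rank
ranks <- apply(raw, 2, rank, ties.method = "min")
ranks
```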

After sorting each column of the raw array in ascending order, compute the arithmetic mean of each row

A (2 + 1 + 3)/3 = 2.00 = rank i
B (3 + 2 + 4)/3 = 3.00 = rank ii
C (4 + 4 + 6)/3 = 4.67 = rank iii
D (5 + 4 + 8)/3 = 5.67 = rank iv
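The same row means can be checked in base R. The sketch below is illustrative only; `raw`, `sorted` and `rank_means` are names introduced here, not part of any package.

```r
# Raw array from the example; matrix() fills column by column
raw <- matrix(c(5, 2, 3, 4,
                4, 1, 4, 2,
                3, 4, 6, 8),
              ncol = 3)

# Sort each column independently, then average across each row:
# row i of `sorted` holds the i-th smallest value of every column
sorted <- apply(raw, 2, sort)
rank_means <- rowMeans(sorted)
round(rank_means, 2)
```

The i-th element of `rank_means` is the value that will later replace every element of rank i.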

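Using the rank-to-mean correspondence just computed, replace every rank in the rank table with its mean. Values that are tied within a column (the two 4s in the second column) receive the average of the means for the ranks they span: (4.67 + 5.67)/2 = 5.17.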

A    5.67    5.17    2.00
B    2.00    2.00    3.00
C    3.00    5.17    4.67
D    4.67    3.00    5.67

Check the summary statistics of each column

Min.   :2.000   Min.   :2.000   Min.   :2.000  
1st Qu.:2.750   1st Qu.:2.750   1st Qu.:2.750  
Median :3.833   Median :4.083   Median :3.833  
Mean   :3.833   Mean   :3.833   Mean   :3.833  
3rd Qu.:4.917   3rd Qu.:5.167   3rd Qu.:4.917  
Max.   :5.667   Max.   :5.167   Max.   :5.667
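The whole walk-through, including the averaging of tied values, might be sketched in base R as follows. This is a sketch under stated assumptions rather than a reference implementation: `normalise_column` is a hypothetical helper written for this example, and ties are resolved by averaging the reference values they span, which is what produces 5.17 in the second column.

```r
# Raw array from the example; matrix() fills column by column
raw <- matrix(c(5, 2, 3, 4,
                4, 1, 4, 2,
                3, 4, 6, 8),
              ncol = 3,
              dimnames = list(c("A", "B", "C", "D"), NULL))

# Reference distribution: row means of the independently sorted columns
ref <- rowMeans(apply(raw, 2, sort))

# Hypothetical helper: give the i-th smallest value the i-th reference value,
# then average the assigned values within each group of tied inputs
normalise_column <- function(x, ref) {
  out <- numeric(length(x))
  out[order(x)] <- ref
  ave(out, x)
}

normalised <- apply(raw, 2, normalise_column, ref = ref)
rownames(normalised) <- rownames(raw)

round(normalised, 2)   # should match the replaced table above
summary(normalised)    # per-column statistics, as in the listing above
```

Columns without ties end up with exactly the same set of values, which is why the first and third columns share every statistic while the second differs slightly in its median, third quartile and maximum.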

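Quantile normalization is commonly applied to microarray and sequencing data, but it is a blunt approach (it forces every sample onto exactly the same distribution), and the results are often not very good. For microarray preprocessing, you can use the ready-made quantile normalization function in the R package preprocessCore.

# Install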

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("preprocessCore")

# Usage

library("preprocessCore")
# sample() draws at random, so the values below will differ from run to run
data <- data.frame(a = sample(c(1:20),5),
                b =  sample(c(1:20),5),
                c = sample(c(1:20),5)
)
data
   a  b  c
1  7 10 11
2 20 11 16
3  6  8  7
4  8 13  9
5 17  3 14
normalize.quantiles(x=as.matrix(data))
          [,1]      [,2]      [,3]
[1,]  8.000000  9.666667  9.666667
[2,] 16.333333 14.000000 16.333333
[3,]  5.333333  8.000000  5.333333
[4,]  9.666667 16.333333  8.000000
[5,] 14.000000  5.333333 14.000000

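The reference distribution used by normalize.quantiles() is the arithmetic mean of the data, that is, the row means of the column-sorted matrix. If you need to specify the reference distribution yourself, you can refer to the from-scratch code in DAVE TANG'S BLOG: Quantile normalisation in R.

Below is that code with a small modification of mine: an added parameter for specifying the reference distribution.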

df <- data.frame(one=c(5,2,3,4),
                 two=c(4,1,4,4),
                 three=c(3,4,6,8)
)
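quantile_normalisation <- function(df, ref_distribute){
  # Rank of each value within its column; tied values all get the lowest rank
  df_rank <- apply(df,2,rank,ties.method="min")
  # Each column sorted independently in ascending order
  df_sorted <- data.frame(apply(df, 2, sort))
  
  # Reference distribution: row means of the sorted columns by default,
  # otherwise the user-supplied vector (one value per row, in ascending order)
  if(missing(ref_distribute)){
    df_mean <- apply(df_sorted, 1, mean)
  }else{
    df_mean <- ref_distribute
  }
  
  # Look up the reference value that corresponds to each rank
  index_to_mean <- function(my_index, my_mean){
    return(my_mean[my_index])
  }
  
  # Replace every rank with its reference value, column by column
  df_final <- apply(df_rank, 2, index_to_mean, my_mean=df_mean)
  rownames(df_final) <- rownames(df)
  return(df_final)
}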
quantile_normalisation(df)
       one      two    three
1 5.666667 3.666667 2.000000
2 2.000000 2.000000 3.666667
3 3.666667 3.666667 4.666667
4 4.666667 3.666667 5.666667

test = c(1,2,3,4)
quantile_normalisation(df, ref_distribute=test)
  one two three
1   4   2     1
2   1   1     2
3   2   2     3
4   3   2     4
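For comparison, preprocessCore itself also provides normalize.quantiles.use.target(), which normalizes the columns of a matrix against a target vector supplied by the caller. The sketch below assumes the package is installed (the call is skipped otherwise), reuses the values printed for `data` earlier, and uses a made-up target of 1 to 5; how the function treats tied values is not covered here, so the example data contain no ties.

```r
# Only run when preprocessCore is available
if (requireNamespace("preprocessCore", quietly = TRUE)) {
  # Numeric matrix with one sample per column (values taken from `data` above)
  x <- cbind(a = c(7, 20, 6, 8, 17),
             b = c(10, 11, 8, 13, 3),
             c = c(11, 16, 7, 9, 14))

  # Illustrative reference distribution, one value per row
  target <- c(1, 2, 3, 4, 5)

  # Map every column onto the target distribution
  res <- preprocessCore::normalize.quantiles.use.target(x, target = target)
  print(res)
}
```

With a target of the same length as the number of rows and no ties, each column of the result should contain the target values rearranged according to that column's ranks.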

References:
Quantile normalization
DAVE TANG'S BLOG: Quantile normalisation in R

Gene chip data analysis 2: Quantile normalization
