數(shù)據(jù)分析:方差分析的R實(shí)現(xiàn)

介紹

方差分析的R語言實(shí)現(xiàn)包括以下部分:

  • 數(shù)據(jù)導(dǎo)入

  • 數(shù)據(jù)清洗

  • ANOVA計(jì)算

  • 結(jié)果解析

  • ANOVA評估

參考教程Analysis_of_Variance

knitr::opts_chunk$set(message = FALSE, warning = FALSE)

options(stringsAsFactors = F)
options(future.globals.maxSize = 1000 * 1024^2)

step1: 安裝R包

install.packages(c("ggplot2", "ggpubr", "tidyverse"))

step2: 載R包

library(tidyverse) # 數(shù)據(jù)預(yù)處理R包
library(readxl) # 讀取xlsx數(shù)據(jù)R包
library(ggpubr) # 畫圖R包

step3: 導(dǎo)入數(shù)據(jù)

  • 隨機(jī)生成數(shù)據(jù)
data <- data.frame(D = c(rep("A", 4), rep("B", 4), rep("C", 4), rep("D", 4), rep("E", 4), rep("F", 4)),
                   RR = c(80,83,83,85,75,75,79,79,74,73,76,77,67,72,74,74,62,62,67,69,60,61,64,66))
  • 存儲數(shù)據(jù)
write.table(data, file = "data.txt", sep = "\t", quote = F, row.names = F)
xlsx::write.xlsx(data, file = "data.xlsx", row.names = F)
  • txt數(shù)據(jù)格式
data <- read.table("data.txt", header = T)
  • xlsx數(shù)據(jù)格式
data <- read_xlsx("data.xlsx", sheet = 1)

step4: 數(shù)據(jù)清洗

  • 篩選數(shù)據(jù):丟棄A組數(shù)據(jù)
data_drop <- data %>%
  dplyr::filter(D != "A")#%>%
  #dplyr::mutate(Test = "test")

head(data_drop)
  • 數(shù)據(jù)平均值和其他指標(biāo)
data %>%
  group_by(D) %>%
  summarise(N=n(),
            Means=mean(RR),
            SS=sum((RR - Means)^2),
            SD=sd(RR),
            SEM=SD/N^.5)
  • 展示數(shù)據(jù): boxplot
ggboxplot(data_drop, 
          x = "D", 
          y = "RR", 
          color = "D",
          ylab = "RR", xlab = "D")

step5: 單因素方差分析

  • one-way ANOVAs: 使用aov函數(shù)運(yùn)行單因素方差分析 (公式是:Y是檢驗(yàn)變量,X是分組變量);

  • 再使用summary函數(shù)獲取單因素方差分析的結(jié)果。

# Y=RR; X=D
one.way <- aov(RR ~ D, data = data_drop)

summary(one.way)

結(jié)果解析:

  • Residuals是模型的殘差,可以理解為截距;

  • Df列顯示了自變量的自由度(變量中的水平數(shù)減1)和殘差的自由度(觀察總數(shù)減1和自變量中的水平數(shù)減1);

  • Sum Sq列顯示平方和(即組均值與總體均值之間的總變化)。;

  • Mean Sq列是平方和的平均值,通過將平方和除以每個(gè)參數(shù)的自由度來計(jì)算;

  • F value列是F檢驗(yàn)的檢驗(yàn)統(tǒng)計(jì)量。這是每個(gè)自變量的均方除以殘差的均方。F值越大,自變量引起的變化越有可能是真實(shí)的,而不是偶然的;

  • Pr(>F)列是F統(tǒng)計(jì)量的p值。這表明,如果組均值之間沒有差異的原假設(shè)成立,那么從檢驗(yàn)中計(jì)算出的F值發(fā)生的概率大小。

  • 另一種方法:t-test僅僅適合2組比較,因此需要篩選
data_ttest <- data_drop %>%
  dplyr::filter(D %in% c("B", "C")) #%>%
  #dplyr::filter(RR != 77)

# data_test_filter <- filter(data_drop, D %in% c("B"))
# data_test_filter2 <- filter(data_test_filter, RR != 77)

t.test(RR ~ D, data = data_ttest)

step6: 后置檢驗(yàn)

  • ANOVA結(jié)果僅僅揭示多個(gè)組間的差異結(jié)果,具體到哪兩個(gè)組內(nèi)部差異還需要做后置檢驗(yàn)

  • 后置檢驗(yàn)通常采用TukeyHD函數(shù)

TukeyHSD(one.way)
  • 該結(jié)果給出每個(gè)兩組之間的結(jié)果;

  • diff: 兩組的均值之差;

  • Lwr, upr: 95%置信區(qū)間的下限和上限(默認(rèn)值) ;

  • P adj: 多次比較調(diào)整后的P值。

step7: 檢查殘差分布是否符合正態(tài)分布

  • ANOVA比較的是均值,需要每個(gè)分組的殘差服從正態(tài)部分
plot(one.way, 2)
  • 采用Shapiro-Wilk對殘差進(jìn)行檢驗(yàn)
shapiro.test(x = residuals(object = one.way))

結(jié)果顯示:殘差不顯著也即是表明殘差服從正態(tài)分布,可以采用ANOVA分析方法判斷RR在D分組的分布水平。

step8: 方差齊性檢驗(yàn)

library(car)

leveneTest(RR ~ D, data = data_drop, center = mean) 
bartlett.test(RR ~ D, data = data_drop) 

系統(tǒng)信息

devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       macOS Monterey 12.2.1
 system   x86_64, darwin20
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Asia/Shanghai
 date     2024-03-12
 rstudio  2023.09.0+463 Desert Sunflower (desktop)
 pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 abind         1.4-5   2016-07-21 [1] CRAN (R 4.3.0)
 backports     1.4.1   2021-12-13 [1] CRAN (R 4.3.0)
 broom         1.0.5   2023-06-09 [1] CRAN (R 4.3.0)
 bslib         0.6.1   2023-11-28 [1] CRAN (R 4.3.0)
 cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.0)
 car           3.1-2   2023-03-30 [1] CRAN (R 4.3.0)
 carData       3.0-5   2022-01-06 [1] CRAN (R 4.3.0)
 cellranger    1.1.0   2016-07-27 [1] CRAN (R 4.3.0)
 cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.0)
 colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.3.0)
 devtools      2.4.5   2022-10-11 [1] CRAN (R 4.3.0)
 digest        0.6.34  2024-01-11 [1] CRAN (R 4.3.0)
 dplyr       * 1.1.4   2023-11-17 [1] CRAN (R 4.3.0)
 ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.3.0)
 evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.0)
 fansi         1.0.6   2023-12-08 [1] CRAN (R 4.3.0)
 farver        2.1.1   2022-07-06 [1] CRAN (R 4.3.0)
 fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
 forcats     * 1.0.0   2023-01-29 [1] CRAN (R 4.3.0)
 fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.0)
 generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
 ggplot2     * 3.4.4   2023-10-12 [1] CRAN (R 4.3.0)
 ggpubr      * 0.6.0   2023-02-10 [1] CRAN (R 4.3.0)
 ggsignif      0.6.4   2022-10-13 [1] CRAN (R 4.3.0)
 glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.0)
 gtable        0.3.4   2023-08-21 [1] CRAN (R 4.3.0)
 hms           1.1.3   2023-03-21 [1] CRAN (R 4.3.0)
 htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.0)
 htmlwidgets   1.6.4   2023-12-06 [1] CRAN (R 4.3.0)
 httpuv        1.6.14  2024-01-26 [1] CRAN (R 4.3.2)
 jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.3.0)
 jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.3.0)
 knitr         1.45    2023-10-30 [1] CRAN (R 4.3.0)
 labeling      0.4.3   2023-08-29 [1] CRAN (R 4.3.0)
 later         1.3.2   2023-12-06 [1] CRAN (R 4.3.0)
 lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.0)
 lubridate   * 1.9.3   2023-09-27 [1] CRAN (R 4.3.0)
 magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
 memoise       2.0.1   2021-11-26 [1] CRAN (R 4.3.0)
 mime          0.12    2021-09-28 [1] CRAN (R 4.3.0)
 miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.3.0)
 munsell       0.5.0   2018-06-12 [1] CRAN (R 4.3.0)
 pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
 pkgbuild      1.4.3   2023-12-10 [1] CRAN (R 4.3.0)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
 pkgload       1.3.4   2024-01-16 [1] CRAN (R 4.3.0)
 profvis       0.3.8   2023-05-02 [1] CRAN (R 4.3.0)
 promises      1.2.1   2023-08-10 [1] CRAN (R 4.3.0)
 purrr       * 1.0.2   2023-08-10 [1] CRAN (R 4.3.0)
 R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
 Rcpp          1.0.12  2024-01-09 [1] CRAN (R 4.3.0)
 readr       * 2.1.5   2024-01-10 [1] CRAN (R 4.3.0)
 readxl      * 1.4.3   2023-07-06 [1] CRAN (R 4.3.0)
 remotes       2.4.2.1 2023-07-18 [1] CRAN (R 4.3.0)
 rJava         1.0-6   2021-12-10 [1] CRAN (R 4.3.0)
 rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.0)
 rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.0)
 rstatix       0.7.2   2023-02-01 [1] CRAN (R 4.3.0)
 rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.0)
 sass          0.4.8   2023-12-06 [1] CRAN (R 4.3.0)
 scales        1.3.0   2023-11-28 [1] CRAN (R 4.3.0)
 sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
 shiny         1.8.0   2023-11-17 [1] CRAN (R 4.3.0)
 stringi       1.8.3   2023-12-11 [1] CRAN (R 4.3.0)
 stringr     * 1.5.1   2023-11-14 [1] CRAN (R 4.3.0)
 tibble      * 3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
 tidyr       * 1.3.1   2024-01-24 [1] CRAN (R 4.3.2)
 tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
 tidyverse   * 2.0.0   2023-02-22 [1] CRAN (R 4.3.0)
 timechange    0.3.0   2024-01-18 [1] CRAN (R 4.3.0)
 tzdb          0.4.0   2023-05-12 [1] CRAN (R 4.3.0)
 urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.3.0)
 usethis       2.2.2   2023-07-06 [1] CRAN (R 4.3.0)
 utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.0)
 vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.0)
 withr         3.0.0   2024-01-16 [1] CRAN (R 4.3.0)
 xfun          0.41    2023-11-01 [1] CRAN (R 4.3.0)
 xlsx          0.6.5   2020-11-10 [1] CRAN (R 4.3.0)
 xlsxjars      0.6.1   2014-08-22 [1] CRAN (R 4.3.0)
 xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.0)
 yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.0)

 [1] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

參考材料

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請結(jié)合常識與多方信息審慎甄別。
平臺聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡書系信息發(fā)布平臺,僅提供信息存儲服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容