R ggplot2 stat_summary

stat_summary is flexible enough that, used well, it can save a lot of extra coding.

Once the main ggplot() call has defined the aesthetic mapping, stat_summary can be added directly to draw the plot.

ggplot(., aes(x = weight, y = species.coverage, fill = weight)) +
  # geom_boxplot(outlier.size = 1) +
  # Draw bars; the heights are the computed means
  stat_summary(fun = "mean", size = 2, geom = "bar", position = position_dodge(0.75)) +
  # Add the confidence interval for each column's values, computed with "mean_cl_boot",
  # which does not assume the values are normally distributed
  stat_summary(fun.data = "mean_cl_boot", geom = "errorbar", width = .15, position = position_dodge(0.75))
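The snippet above assumes a data frame piped in as `.`. A minimal self-contained sketch of the same pattern follows, using the built-in mtcars data as a stand-in (the dataset and column choices are assumptions for illustration, and mean_cl_boot needs the Hmisc package to be installed):

```r
library(ggplot2)

# Stand-in data (assumption): mtcars, with cyl converted to a factor
# so that it can serve as a discrete x and fill variable.
cars <- mtcars
cars$cyl <- factor(cars$cyl)

p <- ggplot(cars, aes(x = cyl, y = mpg, fill = cyl)) +
  # Bars whose heights are the per-group means of mpg
  stat_summary(fun = mean, geom = "bar") +
  # Bootstrap confidence limits of the mean (requires Hmisc)
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.15)

print(p)
```

Each stat_summary call becomes its own layer, so the bars and the error bars are summarised independently from the same raw data.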

These functions come from the Hmisc package (ggplot2's mean_cl_boot is a wrapper around Hmisc's smean.cl.boot):
smean.cl.normal computes 3 summary variables: the sample mean and lower and upper Gaussian confidence limits based on the t-distribution.
smean.sd computes the mean and standard deviation.
smean.sdl computes the mean plus or minus a constant times the standard deviation.
smean.cl.boot is a very fast implementation of the basic nonparametric bootstrap for obtaining confidence limits for the population mean without assuming normality.
smedian.hilow computes the sample median and a selected pair of outer quantiles having equal tail areas.
These functions all delete NAs automatically.
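
To make the bootstrap idea concrete, here is a rough base-R sketch of a percentile bootstrap for the mean. It illustrates the general technique only and is not Hmisc's actual implementation; the function name boot_mean_ci and its defaults are hypothetical.

```r
# Percentile bootstrap confidence interval for the mean (illustrative sketch).
# boot_mean_ci is a hypothetical helper, not part of Hmisc or ggplot2.
boot_mean_ci <- function(x, conf.int = 0.95, B = 1000) {
  x <- x[!is.na(x)]                      # drop NAs before resampling
  n <- length(x)
  # Resample with replacement B times and record the mean of each resample
  boot_means <- replicate(B, mean(x[sample.int(n, n, replace = TRUE)]))
  alpha <- 1 - conf.int
  limits <- quantile(boot_means, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
  # y / ymin / ymax is the layout that stat_summary(fun.data = ...) expects
  data.frame(y = mean(x), ymin = limits[1], ymax = limits[2])
}

set.seed(1)
boot_mean_ci(c(rexp(200, rate = 0.2), NA))
```

Because it returns y, ymin, and ymax, a function of this shape could itself be passed as fun.data to stat_summary.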

In this way, the bar chart, the bootstrap, and the confidence interval are all produced in a single step, which is much simpler than computing these values yourself first.

Without stat_summary, you have to compute the CI yourself with group_by and summarise and then plot the result, which takes more code:

library(dplyr)
library(tidyr)
library(ggplot2)

# Simulate two groups and reshape to long format
df <- data.frame(A = rnorm(2000, mean = 15, sd = 18),
                 B = rnorm(2000, mean = 25, sd = 17)) %>%
  pivot_longer(cols = c(A, B), names_to = "group", values_to = "time") %>%
  # rnorm(1, 15, 7) draws a single offset that is shared by every row with time < 2
  mutate(time = ifelse(time < 2, abs(time) + rnorm(1, 15, 7), time))

# Per-group mean and normal-approximation 95% CI (mean +/- 1.96 * standard error)
my_cis <- df %>%
  group_by(group) %>%
  summarize(M = mean(time),
            lwr = M - sd(time) / sqrt(length(time)) * 1.96,
            upr = M + sd(time) / sqrt(length(time)) * 1.96)

# Raw points, plus error bars and means taken from my_cis
df %>%
  ggplot(aes(x = group)) +
  geom_jitter(aes(y = time), width = .1, alpha = .2, color = "pink") +
  geom_errorbar(aes(ymin = lwr, ymax = upr), data = my_cis, width = .13, color = "gray25") +
  geom_point(aes(y = M), data = my_cis, shape = 18, size = 2)
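
For comparison, the same normal-approximation interval can be sketched with stat_summary alone by passing a small custom function as fun.data. The helper name mean_ci_z is hypothetical, and the simulated data below is a simplified stand-in for df.

```r
library(ggplot2)

# Hypothetical helper: mean with a 1.96 * standard-error interval,
# returned in the y / ymin / ymax layout that fun.data expects.
mean_ci_z <- function(x) {
  m  <- mean(x)
  se <- sd(x) / sqrt(length(x))
  data.frame(y = m, ymin = m - 1.96 * se, ymax = m + 1.96 * se)
}

# Simplified stand-in for the simulated data (assumption)
set.seed(42)
sim <- data.frame(
  group = rep(c("A", "B"), each = 2000),
  time  = c(rnorm(2000, mean = 15, sd = 18), rnorm(2000, mean = 25, sd = 17))
)

p2 <- ggplot(sim, aes(x = group, y = time)) +
  geom_jitter(width = 0.1, alpha = 0.2, color = "pink") +
  stat_summary(fun.data = mean_ci_z, geom = "errorbar", width = 0.13, color = "gray25") +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 2)

print(p2)
```

No intermediate summary table is needed here, because stat_summary applies mean_ci_z to the y values of each group when the plot is built.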

Of course, you can also retrieve these CI values from a stat_summary layer by calling ggplot_build(g), which exposes the data that stat_summary computed.

First, store the ggplot call in an object:

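g <- ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_jitter(width = 0.5) +
  # fun replaces the deprecated fun.y argument (ggplot2 >= 3.3.0)
  stat_summary(fun = mean, geom = "point", color = "red") +
  # fun.args must be a list of arguments passed on to mean_cl_boot
  stat_summary(fun.data = mean_cl_boot, fun.args = list(conf.int = 0.9999), geom = "errorbar", width = 0.4)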

Then use

ggplot_build(g)$data[[3]]

to get the values computed by mean_cl_boot (the third layer of the plot):

  x group     y     ymin     ymax PANEL xmin xmax colour size linetype width alpha
1 1     1 1.462 1.386000 1.543501     1  0.8  1.2  black  0.5        1   0.4    NA
2 2     2 4.260 4.024899 4.462202     1  1.8  2.2  black  0.5        1   0.4    NA
3 3     3 5.552 5.337199 5.798202     1  2.8  3.2  black  0.5        1   0.4    NA
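
As a possible shortcut, ggplot2 also exports layer_data(), which returns the computed data for a single layer. The sketch below rebuilds a similar plot and finds the error-bar layer by its geom class instead of hard-coding the index. The lookup through g2$layers is an assumption about the plot object's structure and may behave differently across ggplot2 versions; mean_cl_boot requires Hmisc.

```r
library(ggplot2)

g2 <- ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_jitter(width = 0.5) +
  stat_summary(fun = mean, geom = "point", color = "red") +
  stat_summary(fun.data = mean_cl_boot, fun.args = list(conf.int = 0.95),
               geom = "errorbar", width = 0.4)

# Locate the layer drawn with geom_errorbar (assumes layers are ggproto objects)
is_errorbar <- vapply(g2$layers,
                      function(l) inherits(l$geom, "GeomErrorbar"),
                      logical(1))
idx <- which(is_errorbar)[1]

# One row per Species with the bootstrap limits (ymin, ymax)
ci <- layer_data(g2, idx)[, c("x", "ymin", "ymax")]
ci
```

Since the limits come from a bootstrap, the exact numbers change from run to run unless a seed is set with set.seed() before the plot is built.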

References:
r - Getting the values computed by stat_summary with mean_cl_boot - Stack Overflow (Chinese mirror)
r - What do ggplot's stat_summary errorbars mean? - Cross Validated (stackexchange.com)
smean.sd: Compute Summary Statistics on a Vector in Hmisc: Harrell Miscellaneous (rdrr.io)

Adding annotations such as the mean, median, and sample size to bar charts / box plots with a custom function

Custom function

https://www.appsilon.com/post/ggplot2-boxplots

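# Assumption: df is the mtcars data with cyl as a factor, as in the linked Appsilon post
df <- mtcars
df$cyl <- as.factor(df$cyl)

# Build a text label (count, mean, median) positioned just below upper_limit
get_box_stats <- function(y, upper_limit = max(df$mpg) * 1.15) {
  return(data.frame(
    y = 0.95 * upper_limit,
    label = paste(
      "Count =", length(y), "\n",
      "Mean =", round(mean(y), 2), "\n",
      "Median =", round(median(y), 2), "\n"
    )
  ))
}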

Then apply this function in stat_summary:

ggplot(df, aes(x = cyl, y = mpg, fill = cyl)) +
  geom_boxplot() +
  scale_fill_manual(values = c("#0099f8", "#e74c3c", "#2ecc71")) +
  stat_summary(fun.data = get_box_stats, geom = "text", hjust = 0.5, vjust = 0.9) +
  theme_classic()
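
Since the default upper_limit reads a global df, one alternative is to pass the limit explicitly through fun.args, which stat_summary forwards to the summary function. This is only a sketch; cars_df and label_stats are hypothetical names introduced for illustration.

```r
library(ggplot2)

# Stand-in data (assumption): mtcars with cyl as a factor
cars_df <- mtcars
cars_df$cyl <- factor(cars_df$cyl)

# Hypothetical variant of the helper with no hidden dependency on a global df
label_stats <- function(y, upper_limit) {
  data.frame(
    y = 0.95 * upper_limit,
    label = paste0("Count = ", length(y), "\n",
                   "Mean = ", round(mean(y), 2), "\n",
                   "Median = ", round(median(y), 2))
  )
}

p3 <- ggplot(cars_df, aes(x = cyl, y = mpg, fill = cyl)) +
  geom_boxplot() +
  stat_summary(fun.data = label_stats,
               fun.args = list(upper_limit = max(cars_df$mpg) * 1.15),
               geom = "text", hjust = 0.5, vjust = 0.9) +
  theme_classic()

print(p3)
```

Passing the limit as an argument keeps the helper reusable with other data frames and other y variables.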