Study notes on the ggplot2 package documentation (1)
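- ggplot2 takes its plotting data from a data frame
- ggplot2 builds a plot by adding layers one at a time, with the layers joined by +
- ggplot2 does not care whether the resulting plot means anything, so incorrect code can produce some truly bizarre plots; with enough imagination, though, ggplot2 can draw all sorts of images
- Drawing plots that are both good-looking and meaningful takes more than imagination and coding skill; it also takes a command of colour and composition, and it is fairly safe to say I am hopeless at that
- Almost every basic geom function in ggplot2 has the two parameters stat and position: stat determines how the data is transformed, and position determines how the graphical elements are placed. Changing these two parameters is basically where the bizarre plots begin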
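The layering idea and the role of position can be sketched as follows. This is only a minimal illustration: it assumes the mpg data set that ships with ggplot2, which the notes themselves do not mention, and the object names are made up for the example.

```r
library(ggplot2)

# Two layers joined with `+`: points first, then a fitted straight line.
p_layers <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Same geom and stat, different position: stacked versus side-by-side bars.
p_stack <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "stack")
p_dodge <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "dodge")

print(p_layers)
print(p_stack)
print(p_dodge)
```

Only the position argument differs between the last two plots, which is why the same counts end up arranged differently.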
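Basic plot types in ggplot2
In ggplot2 the x-axis and y-axis data are set through aes(), so the basic plot types can be grouped by the data types of x and y as follows:
- Continuous x, continuous y
- Continuous x, no y
- Discrete x, continuous y
- Discrete x, no y
- No x, no y
Continuous x, continuous y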
geom_point(stat = "identity", position = "identity")
- Scatter plot
geom_jitter(stat = "identity", position = "jitter")
position_jitter(width = NULL, height = NULL)
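- Adds a small random offset to each point of a scatter plot. The return value of position_jitter() can be passed as the position argument. Suited to small data sets in which many repeated values make the points look discretised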
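A short sketch of passing position_jitter() as the position argument. The mpg data set and the offsets of 0.3 are assumptions chosen for illustration; seed is there only to make the random offsets reproducible.

```r
library(ggplot2)

# cty and hwy in mpg are integers, so many points overlap exactly.
p_jitter <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point(position = position_jitter(width = 0.3, height = 0.3, seed = 1))

print(p_jitter)
```

Writing geom_jitter(width = 0.3, height = 0.3) should give the same kind of plot, since geom_jitter is geom_point with the jitter position preset.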
geom_count(stat = "sum", position = "identity")
stat_sum(geom = "point", position = "identity")
- Counts the number of observations at each location. Suited to discrete data; it also runs on continuous data, but is of little use there
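As a rough check of what the counting does, the sketch below (again assuming the mpg data set) builds a geom_count plot and reads back the per-location counts with layer_data().

```r
library(ggplot2)

p_count <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_count()

# layer_data() returns the data after the stat has run;
# n holds the number of observations at each (x, y) location.
counts <- layer_data(p_count)
total_points <- sum(counts$n)
```

Here total_points should equal nrow(mpg), because every row is counted at exactly one location.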
geom_bin2d(stat = "bin2d", position = "identity")
stat_bin_2d(geom = "tile", position = "identity")
geom_hex(stat = "binhex", position = "identity")
stat_bin_hex(geom = "hex", position = "identity")
- Fills each region with a colour based on the number of points that fall inside it, which suits very large data sets. geom_bin2d uses rectangular regions, geom_hex hexagonal ones
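A minimal sketch with a larger data set. It assumes the diamonds data shipped with ggplot2, and bins = 30 is an arbitrary choice. The hexagonal variant is left commented out because, as far as I know, it needs the separate hexbin package to be installed.

```r
library(ggplot2)

# diamonds has tens of thousands of rows, too many for a plain scatter plot.
p_bin <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_bin2d(bins = 30)

# Hexagonal bins work the same way (requires the hexbin package):
# p_hex <- ggplot(diamonds, aes(x = carat, y = price)) + geom_hex(bins = 30)

print(p_bin)
```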
geom_line(stat = "identity", position = "identity")
geom_path(stat = "identity", position = "identity")
geom_step(stat = "identity", position = "identity", direction = "hv")
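- geom_line connects the points in order from left to right
- geom_path connects the points in the order in which they appear in the source data
- geom_step connects the points from left to right in a staircase pattern; the direction parameter decides whether each step goes horizontal then vertical (hv) or vertical then horizontal (vh)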
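The difference between the three is easiest to see on data that is not sorted by x. The data frame below is a made-up example, not taken from the documentation.

```r
library(ggplot2)

# Rows are deliberately not sorted by x.
zigzag <- data.frame(x = c(3, 1, 2), y = c(1, 3, 2))

p_line <- ggplot(zigzag, aes(x, y)) + geom_line()                  # sorted by x
p_path <- ggplot(zigzag, aes(x, y)) + geom_path()                  # row order
p_step <- ggplot(zigzag, aes(x, y)) + geom_step(direction = "vh")  # vertical first

# Order in which each geom visits the points.
line_order <- layer_data(p_line)$x
path_order <- layer_data(p_path)$x
```

line_order comes back sorted by x, while path_order keeps the row order of zigzag.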
geom_label(stat = "identity", position = "identity")
geom_text(stat = "identity", position = "identity")
- geom_label adds a label, that is, text with a small background box behind it
- geom_text adds plain text only
geom_col(position = "stack")
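- Similar to a bar chart. The difference is that a bar chart (geom_bar) counts the occurrences of each x, whereas geom_col takes the bar heights from y; with the default position = "stack", rows that share an x are stacked, so the total height is the sum of their y values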
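The counting versus summing distinction can be checked on a tiny, made-up data frame:

```r
library(ggplot2)

sales <- data.frame(g = c("a", "a", "b"), v = c(1, 2, 5))

p_bar <- ggplot(sales, aes(x = g)) + geom_bar()         # counts rows per g
p_col <- ggplot(sales, aes(x = g, y = v)) + geom_col()  # stacks v per g

# Bar heights as computed by ggplot2.
bar_heights <- layer_data(p_bar)$count
col_data    <- layer_data(p_col)
col_heights <- as.numeric(tapply(col_data$ymax, as.numeric(col_data$x), max))
```

For group a, geom_bar sees two rows while geom_col stacks the values 1 and 2.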
geom_tile(stat = "identity", position = "identity", width, height)
geom_rect(stat = "identity", position = "identity", aes(xmin, xmax, ymin, ymax))
geom_raster(stat = "identity", position = "identity", hjust = 0.5, vjust = 0.5)
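- All three draw a rectangle behind each point. They differ as follows:
- geom_tile uses the width and height parameters for the size of the rectangle, with the point at its centre
- geom_rect uses aes(xmin, xmax, ymin, ymax) to fix the positions of the four edges, so the point is not necessarily at the centre
- geom_raster takes the width and height of the rectangle from the horizontal and vertical distance between the two closest points; with hjust and vjust both set to 1, the rectangle sits to the upper right of the point. The function seems to have a bug, though (it may simply be that geom_raster is meant for evenly spaced grids, which the data below is not), as follows: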
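library(ggplot2)

# x1/y1 and x2/y2 differ only in the third value (4 vs 5); both are unevenly spaced.
df <- data.frame(x1 = c(1, 2, 4, 8, 16), y1 = c(1, 2, 4, 8, 16),
                 x2 = c(1, 2, 5, 8, 16), y2 = c(1, 2, 5, 8, 16))
ggplot(df) +
  # hjust = vjust = 1 is expected to put each rectangle to the upper right of its point
  geom_raster(aes(x1, y1), fill = "blue", alpha = 0.3, hjust = 1, vjust = 1) +
  geom_point(aes(x1, y1), colour = "blue", shape = 2) +
  geom_raster(aes(x2, y2), fill = "red", alpha = 0.3, hjust = 1, vjust = 1) +
  geom_point(aes(x2, y2), colour = "red", shape = 1)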
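
For comparison, the relationship between geom_tile and geom_rect can be sketched on a single, made-up rectangle described in both ways: centre plus size, and the four edges. The column names w and h are hypothetical.

```r
library(ggplot2)

# One rectangle described two ways: centre plus size, or the four edges.
box <- data.frame(x = 2, y = 3, w = 2, h = 4,
                  xmin = 1, xmax = 3, ymin = 1, ymax = 5)

p_tile <- ggplot(box) +
  geom_tile(aes(x = x, y = y, width = w, height = h))
p_rect <- ggplot(box) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax))

# Edges that ggplot2 ends up drawing in each case.
edge_cols  <- c("xmin", "xmax", "ymin", "ymax")
tile_edges <- unlist(layer_data(p_tile)[1, edge_cols])
rect_edges <- unlist(layer_data(p_rect)[1, edge_cols])
```

Both layers should resolve to the same four edges, since the tile edges are the centre plus or minus half the width and height.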

geom_crossbar(stat = "identity", position = "identity", aes(ymax, ymin))
geom_errorbar(stat = "identity", position = "identity", aes(ymax, ymin))
geom_linerange(stat = "identity", position = "identity", aes(ymax, ymin))
geom_pointrange(stat = "identity", position = "identity", aes(ymax, ymin))
geom_errorbarh(stat = "identity", position = "identity", aes(xmax, xmin))
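- Various ways of drawing an interval: the first four draw a vertical interval from ymin to ymax, and geom_errorbarh draws a horizontal one from xmin to xmax. The corresponding shapes are shown in the figure below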
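A minimal sketch of the four vertical variants on the same data. The summary table est and its column names are hypothetical, invented for this example.

```r
library(ggplot2)

# Hypothetical summary table: one estimate per group with lower and upper bounds.
est <- data.frame(group = c("a", "b", "c"),
                  mid   = c(2.0, 3.5, 3.0),
                  lower = c(1.5, 2.8, 2.1),
                  upper = c(2.6, 4.1, 3.8))

base <- ggplot(est, aes(x = group, y = mid, ymin = lower, ymax = upper))

p_errorbar   <- base + geom_errorbar(width = 0.2)  # bar with end caps
p_linerange  <- base + geom_linerange()            # plain vertical line
p_pointrange <- base + geom_pointrange()           # line plus a point at y
p_crossbar   <- base + geom_crossbar(width = 0.4)  # box with a line at y

print(p_pointrange)
```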
geom_rug(stat = "identity", position = "identity")
- Adds short tick marks along the edges of the plot that show the marginal distributions of x and y
geom_segment(stat = "identity", position = "identity", aes(xend, yend))
geom_curve(stat = "identity", position = "identity", aes(xend, yend))
geom_spoke(stat = "identity", position = "identity", aes(angle), radius)
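- geom_segment draws a straight line segment between the points (x, y) and (xend, yend), geom_curve draws a curve between the same two points, and geom_spoke specifies the segment in polar form, where angle is the direction and radius is the length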
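The polar form used by geom_spoke can be checked on one made-up spoke. The sketch assumes that angle is given in radians and that both angle and radius are mapped inside aes().

```r
library(ggplot2)

# A single spoke starting at the origin and pointing straight up.
spoke <- data.frame(x = 0, y = 0, angle = pi / 2, radius = 2)

p_spoke <- ggplot(spoke, aes(x, y)) +
  geom_spoke(aes(angle = angle, radius = radius))

# The end point is (x + radius * cos(angle), y + radius * sin(angle)).
end_point <- layer_data(p_spoke)[, c("xend", "yend")]
```

The same segment could be drawn with geom_segment by supplying that end point directly as xend and yend.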
geom_smooth(stat = "smooth", position = "identity")
stat_smooth(geom = "smooth", position = "identity")
- Adds a fitted (smoothed) curve
geom_polygon(stat = "identity", position = "identity")
- Connects the points as geom_path would, then fills the interior with colour
geom_area(stat = "identity", position = "stack")
geom_ribbon(stat = "identity", position = "identity", aes(ymin, ymax))
- geom_ribbon fills the region between ymin and ymax with colour; geom_area is simply a geom_ribbon with ymin = 0 and ymax = y
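That equivalence can be sketched on a small, made-up series. stat = "identity" is passed explicitly to match the signature quoted in these notes, because newer ggplot2 releases may use a different default stat for geom_area.

```r
library(ggplot2)

trend <- data.frame(x = 1:5, y = c(2, 4, 3, 5, 4))

# stat = "identity" follows the signature in the notes (an explicit assumption).
p_area   <- ggplot(trend, aes(x, y)) + geom_area(stat = "identity")
p_ribbon <- ggplot(trend, aes(x)) + geom_ribbon(aes(ymin = 0, ymax = y))

# Lower and upper boundaries that each layer ends up drawing.
area_data   <- layer_data(p_area)
ribbon_data <- layer_data(p_ribbon)
```

Both layers should end up with a lower boundary of 0 and an upper boundary equal to y.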
Continuous x, no y
geom_histogram(stat = "bin", position = "stack")
geom_freqpoly(stat = "bin", position = "identity")
stat_bin(geom = "bar", position = "stack")
- Histogram. geom_freqpoly draws the same binned counts as a line that traces the outline of the histogram
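A brief sketch that overlays the two on the same bins. It assumes the mpg data set, and bins = 10 is an arbitrary choice.

```r
library(ggplot2)

p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(bins = 10, fill = "grey70") +
  geom_freqpoly(bins = 10)

# Counts per bin from the histogram layer (layer 1).
bin_counts <- layer_data(p_hist, 1)$count

print(p_hist)
```

Every observation falls into exactly one bin, so the bin counts should add up to nrow(mpg).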
geom_density(stat = "density", position = "identity")
stat_density(geom = "area", position = "stack")
- Density curve, smoother than a histogram
Discrete x, continuous y
geom_boxplot(stat = "boxplot", position = "dodge2")
stat_boxplot(geom = "boxplot", position = "dodge2")
- Box plot
geom_violin(stat = "ydensity", position = "dodge")
stat_ydensity(geom = "violin", position = "dodge")
- Violin plot. Similar to a box plot, but the central rectangle is replaced by a violin-like shape, which gives a rough view of the density curve of the data
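The two are often layered on top of each other. The sketch below assumes the mpg data set, and the fill and width values are arbitrary.

```r
library(ggplot2)

p_dist <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_violin(fill = "grey90") +
  geom_boxplot(width = 0.15)

# The line inside each box is the group median (layer 2 is the box plot).
box_medians  <- as.numeric(layer_data(p_dist, 2)$middle)
data_medians <- as.numeric(tapply(mpg$hwy, mpg$class, median))

print(p_dist)
```

The violin shows the shape of each distribution, while the narrow box plot on top marks the median and quartiles.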
Discrete x, no y
geom_bar(stat = "count", position = "stack")
stat_count(geom = "bar", position = "stack")
- Bar chart; counts the occurrences of each x
No x, no y
geom_abline(slope, intercept)
geom_hline(yintercept)
geom_vline(xintercept)
geom_blank()
- geom_abline adds a straight line with slope slope and intercept intercept, geom_hline adds the horizontal line y = yintercept, geom_vline adds the vertical line x = xintercept, and geom_blank draws nothing
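A short sketch that adds all three reference lines to a scatter plot. The mpg data set and the particular slope, intercept and position values are assumptions picked only so the lines fall inside the plot.

```r
library(ggplot2)

p_ref <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_abline(slope = -3, intercept = 35, linetype = "dashed") +
  geom_hline(yintercept = 20, colour = "red") +
  geom_vline(xintercept = 4, colour = "blue")

print(p_ref)
```

None of the three line layers uses the x or y mapping; each is positioned entirely by its own parameters.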
