The previous sections mainly covered the three core dplyr functions: select, filter, and mutate. This section introduces functions in dplyr and tidyr that perform more specific tasks:
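- arrange: sort rows by the values of a column
- desc: sort in descending order
- distinct: remove duplicate rows
- rename: change column names
- relocate: change the order of columns
- drop_na: drop rows that contain missing values
- pull: extract a single column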
arrange() sorts rows by the selected columns; the default order is ascending.
arrange(mtcars,mpg) %>% as_tibble()
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 10.4 8 472 205 2.93 5.25 18.0 0 0 3 4
2 10.4 8 460 215 3 5.42 17.8 0 0 3 4
3 13.3 8 350 245 3.73 3.84 15.4 0 0 3 4
Combine arrange with desc to sort the data in descending order:
arrange(mtcars,desc(mpg)) %>% as_tibble()
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 33.9 4 71.1 65 4.22 1.84 19.9 1 1 4 1
2 32.4 4 78.7 66 4.08 2.2 19.5 1 1 4 1
3 30.4 4 75.7 52 4.93 1.62 18.5 1 1 4 2
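The two calls above each sort on a single column. As a minimal sketch (the variable name `sorted` is only illustrative), arrange() also accepts several sort keys, and desc() can be applied to any one of them:

```r
library(dplyr)

# Sort by cyl ascending, then by mpg descending within each cyl value
sorted <- mtcars %>%
  as_tibble() %>%
  arrange(cyl, desc(mpg)) %>%
  select(cyl, mpg)

head(sorted, 3)
```

Later keys only break ties left by earlier ones, so the order in which the columns are listed matters.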
distinct removes duplicate rows from a data frame, while unique is the base R function typically used to remove duplicates from a vector.
df <- tibble(
  x = sample(10, 100, replace = TRUE),
  y = sample(10, 100, replace = TRUE))
df %>% distinct()
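Because the sample above is random, its output differs from run to run. The sketch below uses a small fixed tibble (the name `dup_df` is hypothetical) to show full-row de-duplication, de-duplication on a single column with `.keep_all`, and unique() on a plain vector:

```r
library(dplyr)

dup_df <- tibble(
  x = c(1, 1, 2, 2),
  y = c("a", "a", "b", "c")
)

# Full-row de-duplication: the repeated (1, "a") row collapses into one
all_distinct <- dup_df %>% distinct()

# De-duplicate on x only; .keep_all = TRUE keeps the other columns
# from the first row found for each value of x
by_x <- dup_df %>% distinct(x, .keep_all = TRUE)

# unique() on a vector
unique_x <- unique(dup_df$x)
```

Without `.keep_all = TRUE`, distinct(x) would return only the x column.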
rename changes column names, using the form new_name = old_name:
rename(iris,petal_length=Petal.Length) %>% as_tibble()
Sepal.Length Sepal.Width petal_length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
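As a short sketch building on the call above, rename() accepts several new_name = old_name pairs at once, and dplyr (1.0.0 or later) also provides rename_with() for applying a function to the column names. The variable names `renamed` and `lowered` are only illustrative:

```r
library(dplyr)

# Rename two columns in a single call
renamed <- iris %>%
  as_tibble() %>%
  rename(petal_length = Petal.Length, petal_width = Petal.Width)

# Apply a function to every column name
lowered <- iris %>%
  as_tibble() %>%
  rename_with(tolower)

names(renamed)
names(lowered)
```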
relocate changes the order of columns; by default the selected columns are moved to the front:
iris %>% as_tibble() %>% relocate(Species)
# The following also works, but is more cumbersome
iris %>% as_tibble() %>% select(Species,everything())
Species Sepal.Length Sepal.Width Petal.Length Petal.Width
<fct> <dbl> <dbl> <dbl> <dbl>
1 setosa 5.1 3.5 1.4 0.2
2 setosa 4.9 3 1.4 0.2
3 setosa 4.7 3.2 1.3 0.2
df <- tibble(a = 1, b = 1, c = 1, d = "a", e = "a", f = "a")
a b c d e f
<dbl> <dbl> <dbl> <chr> <chr> <chr>
1 1 1 1 a a a
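df %>% relocate(a, .after = c) # place a right after c
df %>% relocate(f, .before = b) # place f right before b
df %>% relocate(a, .after = last_col()) # move a to the last position
df %>% relocate(ff = f) # move f to the front and rename it to ff
df %>% relocate(where(is.character)) # move all character columns to the front
df %>% relocate(where(is.numeric), .after = last_col()) # move all numeric columns to the end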
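The relocate calls above are listed without their output. The sketch below rebuilds the same one-row tibble under a hypothetical name (`cols_df`) and inspects only the resulting column order, which is the part relocate changes:

```r
library(dplyr)

cols_df <- tibble(a = 1, b = 1, c = 1, d = "a", e = "a", f = "a")

after_c   <- names(cols_df %>% relocate(a, .after = c))
before_b  <- names(cols_df %>% relocate(f, .before = b))
to_end    <- names(cols_df %>% relocate(a, .after = last_col()))
renamed_f <- names(cols_df %>% relocate(ff = f))
chr_first <- names(cols_df %>% relocate(where(is.character)))

after_c    # "b" "c" "a" "d" "e" "f"
before_b   # "a" "f" "b" "c" "d" "e"
to_end     # "b" "c" "d" "e" "f" "a"
renamed_f  # "ff" "a" "b" "c" "d" "e"
chr_first  # "d" "e" "f" "a" "b" "c"
```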
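drop_na drops rows that contain missing values. With no arguments it checks every column; when columns are named, only those columns are checked.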
df <- tibble(x = c(1, 2, NA), y = c("a", NA, "b"))
df
x y
<dbl> <chr>
1 1 a
2 2 NA
3 NA b
df %>% drop_na()
# A tibble: 1 x 2
x y
<dbl> <chr>
1 1 a
df %>% drop_na(x)
x y
<dbl> <chr>
1 1 a
2 2 NA
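For comparison, the sketch below shows how the two drop_na calls relate to more explicit alternatives: filter() with is.na() for a single column, and base R's complete.cases() for all columns. The names `na_df`, `via_drop_na`, `via_filter`, and `via_complete` are only illustrative:

```r
library(dplyr)
library(tidyr)

na_df <- tibble(x = c(1, 2, NA), y = c("a", NA, "b"))

# Check only column x
via_drop_na <- na_df %>% drop_na(x)
via_filter  <- na_df %>% filter(!is.na(x))

# Check every column
all_dropped  <- na_df %>% drop_na()
via_complete <- na_df[complete.cases(na_df), ]
```

In each pair the same rows are kept; drop_na is simply the shorter spelling inside a pipeline.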
pull extracts a single column
pull() is similar to $, but reads more naturally inside a pipeline:
iris %>% as_tibble() %>%
mutate(mean = rowMeans(across(where(is.numeric)))) %>%
pull(mean)
Without pull, the same result can be obtained by applying $ to the dot placeholder (referred to here as "dot filtering"):
iris %>% as_tibble() %>%
mutate(mean = rowMeans(across(where(is.numeric)))) %>%
.$mean
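Beyond a bare column name, pull() also accepts a column position and an optional `name` argument. The following is a minimal sketch of those variants; the variable names and the `model` column created with tibble::rownames_to_column() are illustrative assumptions, not part of the original example:

```r
library(dplyr)

# Pull by name: the result is a plain vector (here a factor), not a tibble
species <- iris %>% as_tibble() %>% pull(Species)

# Pull by position: negative values count from the right, so -1 is the last column
last_column <- iris %>% as_tibble() %>% pull(-1)

# name = ... labels the returned vector with the values of another column
named_mpg <- mtcars %>%
  tibble::rownames_to_column("model") %>%
  pull(mpg, name = model)

head(named_mpg, 3)
```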
If you found this useful, you are welcome to follow my WeChat official account, R语言数据分析指南, which regularly shares classic data visualization examples and some bioinformatics material. I hope it helps.