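How to spread a single column based on multiple columns in R?
Each unique combination of Year, Site, Quadrant, and Species has two values of "Val" in the dataset. I want to spread the values into two columns, "Val1" and "Val2". I tried the regular spread function, but it doesn't seem to fit. Any suggestions?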
Year Site Quadrant Species Val
2019 1 1 A 20
2019 1 1 A 30
2019 1 1 B 20
2019 1 1 B 25
2019 1 2 A 20
2019 1 2 A 10
2019 1 2 B 11
2019 1 2 B 22
Desired output:
Year Site Quadrant Species Val1 Val2
2019 1 1 A 20 30
2019 1 1 B 20 25
2019 1 2 A 20 10
2019 1 2 B 11 22
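You could do this using lead. Note that this relies on the two rows of each group being adjacent, as they are in the sample data: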
library(tidyverse)
df %>%
mutate(id = row_number(),
Val2 = lead(Val)) %>%
filter(id %% 2 == 1) %>%
select(-id, Val1 = Val)
Output:
Year Site Quadrant Species Val1 Val2
<dbl> <dbl> <dbl> <chr> <dbl> <dbl>
1 2019 1 1 A 20 30
2 2019 1 1 B 20 25
3 2019 1 2 A 20 10
4 2019 1 2 B 11 22
Data:
df <- tribble(
~Year, ~Site, ~Quadrant, ~Species, ~Val,
2019, 1, 1, "A", 20,
2019, 1, 1, "A", 30,
2019, 1, 1, "B", 20,
2019, 1, 1, "B", 25,
2019, 1, 2, "A", 20,
2019, 1, 2, "A", 10,
2019, 1, 2, "B", 11,
2019, 1, 2, "B", 22)
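Because the lead approach pairs rows purely by position, it may be worth checking that every group really has exactly two rows before relying on it. The following is a minimal sketch, not part of the original answer: it verifies the group sizes with count() and then shows a grouped alternative using summarise() with first() and last(), which does not require the two rows of a group to be adjacent. The names group_sizes and wide are introduced here for illustration only.

```r
library(dplyr)
library(tibble)

# Sample data, same values as in the question
df <- tribble(
  ~Year, ~Site, ~Quadrant, ~Species, ~Val,
  2019, 1, 1, "A", 20,
  2019, 1, 1, "A", 30,
  2019, 1, 1, "B", 20,
  2019, 1, 1, "B", 25,
  2019, 1, 2, "A", 20,
  2019, 1, 2, "A", 10,
  2019, 1, 2, "B", 11,
  2019, 1, 2, "B", 22
)

# Check the assumption that each group has exactly two rows
group_sizes <- df %>% count(Year, Site, Quadrant, Species)
stopifnot(all(group_sizes$n == 2))

# Grouped alternative: first value goes to Val1, second (last) to Val2.
# Within-group row order still decides which value is "first".
wide <- df %>%
  group_by(Year, Site, Quadrant, Species) %>%
  summarise(Val1 = first(Val), Val2 = last(Val), .groups = "drop")

print(wide)
```

If the stopifnot() check fails, some group has more or fewer than two rows, and the pivot-based approaches in the other answers are likely a better fit.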
Using data.table::dcast and rowid:
library(data.table)
# dtt is assumed to be the sample data stored as a data.table,
# e.g. dtt <- as.data.table(mydata)
dcast(dtt,
      Year + Site + Quadrant + Species ~ rowid(Year, Site, Quadrant, Species),
      value.var = 'Val')
# Year Site Quadrant Species 1 2
# 1: 2019 1 1 A 20 30
# 2: 2019 1 1 B 20 25
# 3: 2019 1 2 A 20 10
# 4: 2019 1 2 B 11 22
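The dcast result above names the new columns 1 and 2 rather than Val1 and Val2. One way to get the requested names is to rename them afterwards with setnames(). This is a small sketch under the assumption that dtt holds the sample data as a data.table; the construction of dtt and the name wide are illustrative and not taken from the original answer.

```r
library(data.table)

# Assumed construction of dtt from the sample data in the question
dtt <- data.table(
  Year     = 2019,
  Site     = 1,
  Quadrant = rep(1:2, each = 4),
  Species  = rep(c("A", "A", "B", "B"), times = 2),
  Val      = c(20, 30, 20, 25, 20, 10, 11, 22)
)

# Cast to wide format; rowid() numbers the rows within each group
wide <- dcast(dtt,
              Year + Site + Quadrant + Species ~ rowid(Year, Site, Quadrant, Species),
              value.var = "Val")

# Rename the generated columns "1" and "2" by reference
setnames(wide, c("1", "2"), c("Val1", "Val2"))

print(wide)
```

setnames() modifies wide in place, so no reassignment is needed.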
If you prefer, you can do something similar the tidyverse way (requires dplyr and tidyr):
dtt %>%
group_by(Year, Site, Quadrant, Species) %>%
mutate(grp = row_number()) %>%
pivot_wider(names_from = grp, values_from = Val, names_prefix = 'Val') %>%
ungroup()
# A tibble: 4 x 6
# Year Site Quadrant Species Val1 Val2
# <int> <int> <int> <chr> <int> <int>
# 1 2019 1 1 A 20 30
# 2 2019 1 1 B 20 25
# 3 2019 1 2 A 20 10
# 4 2019 1 2 B 11 22
You can group_by the columns, mutate to create the new column headers, then spread (or pivot_wider):
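library(dplyr)
library(tidyr)  # spread() and pivot_wider() live in tidyr, not dplyr
mydata %>%
  group_by(Year, Site, Quadrant, Species) %>%
  # Build the new column headers: Val1 for the first row of a group, Val2 for the second
  mutate(Var = paste0("Val", row_number())) %>%
  spread(Var, Val) %>%
  ungroup()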
Result:
# A tibble: 4 x 6
Year Site Quadrant Species Val1 Val2
<int> <int> <int> <chr> <int> <int>
1 2019 1 1 A 20 30
2 2019 1 1 B 20 25
3 2019 1 2 A 20 10
4 2019 1 2 B 11 22
Data:
mydata <- read.table(text = "Year Site Quadrant Species Val
2019 1 1 A 20
2019 1 1 A 30
2019 1 1 B 20
2019 1 1 B 25
2019 1 2 A 20
2019 1 2 A 10
2019 1 2 B 11
2019 1 2 B 22", header = TRUE)