pivot_wider/spread only a subset of levels from a column
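I want to reshape only a subset of the levels/values of a single column to wide format, while keeping selected levels in the original column.
In this example data, the 'rice' and 'beans' values in the food column have no 'type'.
I would like to keep the original column 'food' with its levels 'rice' and 'beans', while pivoting the other values to wide.
Data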
library(dplyr)   # also provides tibble()
library(tidyr)
set.seed(1)
df <- tibble(index = sample(1:5, 10, replace = TRUE),
             food = c(rep('fruit', 4), rep('meat', 4), 'rice', 'beans'),
             type = c('apple', 'apple', 'banana', 'banana',
                      'steak', 'steak', 't-bone', 't-bone', NA, NA))
# A tibble: 10 x 3
index food type
<int> <chr> <chr>
1 1 fruit apple
2 4 fruit apple
3 1 fruit banana
4 2 fruit banana
5 5 meat steak
6 3 meat steak
7 2 meat t-bone
8 3 meat t-bone
9 3 rice NA
10 1 beans NA
The desired output looks like this:
output<-structure(list(index = c(1L, 1L, 4L, 2L, 5L, 3L, 3L), fruit = c("apple",
"banana", "apple", "banana", NA, NA, NA), meat = c(NA, NA, NA,
"t-bone", "steak", "steak", "t-bone"), food = c("beans", NA,
NA, NA, NA, "rice", NA)), row.names = c(NA, -7L), class = c("tbl_df",
"tbl", "data.frame"))
output
# A tibble: 7 x 4
index fruit meat food
<int> <chr> <chr> <chr>
1 1 apple NA beans
2 1 banana NA NA
3 4 apple NA NA
4 2 banana t-bone NA
5 5 NA steak NA
6 3 NA steak rice
7 3 NA t-bone NA
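I can do the pivot by hand by moving the 'rice' and 'beans' values into the 'type' column and creating a corresponding 'food' level in 'food'. Besides being a laborious and unsystematic transformation, it gives an unexpected output with duplicated 'beans' and 'rice' values: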
df %>%
  mutate(type = coalesce(type, food),
         food = replace(food, type %in% c('rice', 'beans'), 'food')) %>%
  pivot_wider(id_cols = index, names_from = c(food), values_from = c(type)) %>%
  unnest
# A tibble: 7 x 4
index fruit meat food
<int> <chr> <chr> <chr>
1 1 apple NA beans
2 1 banana NA beans ###<-
3 4 apple NA NA
4 2 banana t-bone NA
5 5 NA steak NA
6 3 NA steak rice
7 3 NA t-bone rice ###<-
I would like to know whether there is a simpler and safer way to do this directly with pivot_wider.
You can use the following:
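The duplicated 'beans' and 'rice' values most likely arise because index alone does not uniquely identify a row within each food level, so pivot_wider has to pack several values into one cell and the later unnest step recycles the single 'food' entry alongside them. A minimal sketch for spotting such collisions before pivoting is shown here. The index values are hard-coded from the printed data above, an assumption made so the example does not depend on the random number generator.

```r
library(dplyr)

# Same data as in the question, with index written out explicitly
df <- tibble(
  index = c(1L, 4L, 1L, 2L, 5L, 3L, 2L, 3L, 3L, 1L),
  food  = c(rep("fruit", 4), rep("meat", 4), "rice", "beans"),
  type  = c("apple", "apple", "banana", "banana",
            "steak", "steak", "t-bone", "t-bone", NA, NA)
)

# Count rows per (index, food) pair; any n > 1 means pivot_wider
# cannot place the values into a single cell
dupes <- df %>%
  count(index, food) %>%
  filter(n > 1)

dupes
# index 1 / fruit and index 3 / meat each occur twice
```

Any pair listed in dupes needs an extra identifier (such as a within-group row number) before the data can be widened without list-columns.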
library(dplyr)
library(tidyr)
df %>%
mutate(type = if_else(food %in% c('rice', 'beans'), food, type),
food = replace(food, food %in% c('rice', 'beans'), 'food')) %>%
group_by(index, food) %>%
mutate(row = row_number()) %>%
ungroup %>%
pivot_wider(names_from = food, values_from = type) %>%
select(-row)
# index fruit meat food
# <int> <chr> <chr> <chr>
#1 1 apple NA beans
#2 4 apple NA NA
#3 1 banana NA NA
#4 2 banana t-bone NA
#5 5 NA steak NA
#6 3 NA steak rice
#7 3 NA t-bone NA
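If the same reshaping is needed more than once, the steps above could be wrapped in a small function. The sketch below is only an illustration: pivot_wider_keep is a hypothetical name, not part of tidyr, and it assumes the columns are called index, food and type as in the question. The index values are hard-coded from the printed data.

```r
library(dplyr)
library(tidyr)

df <- tibble(
  index = c(1L, 4L, 1L, 2L, 5L, 3L, 2L, 3L, 3L, 1L),
  food  = c(rep("fruit", 4), rep("meat", 4), "rice", "beans"),
  type  = c("apple", "apple", "banana", "banana",
            "steak", "steak", "t-bone", "t-bone", NA, NA)
)

# Hypothetical helper: levels in `keep` stay as values of a column
# named `keep_name`; every other level of `food` becomes its own column.
pivot_wider_keep <- function(data, keep, keep_name = "food") {
  data %>%
    mutate(
      type = if_else(food %in% keep, food, type),      # move kept levels into type
      food = replace(food, food %in% keep, keep_name)  # collapse them into one name
    ) %>%
    group_by(index, food) %>%
    mutate(row = row_number()) %>%  # make (index, row) unique within each food level
    ungroup() %>%
    pivot_wider(names_from = food, values_from = type) %>%
    select(-row)
}

res <- pivot_wider_keep(df, keep = c("rice", "beans"))
res
```

The row number is what prevents the list-columns: two fruit rows with the same index get row 1 and row 2, so each lands on its own output row.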
We can use coalesce and replace, with rowid from data.table to number the repeated index/food pairs:
library(dplyr)
library(tidyr)
library(data.table)
df %>%
mutate(type = coalesce(type, food),
food = replace(food, food == type, 'food'),
rn = rowid(index, food)) %>%
pivot_wider(names_from = food, values_from = type) %>%
select(-rn)
# A tibble: 7 x 4
index fruit meat food
<int> <chr> <chr> <chr>
1 1 apple <NA> beans
2 4 apple <NA> <NA>
3 1 banana <NA> <NA>
4 2 banana t-bone <NA>
5 5 <NA> steak <NA>
6 3 <NA> steak rice
7 3 <NA> t-bone <NA>
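Note that the coalesce approach does not name 'rice' and 'beans' explicitly: it treats every food level whose type is missing as a level to keep. Assuming that rule matches the intent, the levels can also be derived from the data and inspected before reshaping. A small sketch, with the index values hard-coded from the printed data:

```r
library(dplyr)

df <- tibble(
  index = c(1L, 4L, 1L, 2L, 5L, 3L, 2L, 3L, 3L, 1L),
  food  = c(rep("fruit", 4), rep("meat", 4), "rice", "beans"),
  type  = c("apple", "apple", "banana", "banana",
            "steak", "steak", "t-bone", "t-bone", NA, NA)
)

# Food levels that have no type are the ones to keep in the 'food' column
keep <- df %>%
  filter(is.na(type)) %>%
  distinct(food) %>%
  pull(food)

keep
```

If a level that should be widened happens to have a missing type in some rows, it would be picked up here as well, so checking keep first is a cheap safeguard.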