pivot_wider/spread only a subset of levels from a column

Question

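I want to reshape only a subset of the levels/values in a single column to wide format, while keeping selected levels in the original column.

In this example data, the values 'rice' and 'beans' in the food column have no 'type' attribute. I want to keep the original column 'food' with its levels 'rice' and 'beans', while pivoting the other values to wide.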

Data

library(tibble)

set.seed(1)
df <- tibble(index = sample(1:5, 10, replace = TRUE),
             food = c(rep('fruit', 4), rep('meat', 4), 'rice', 'beans'),
             type = c('apple', 'apple', 'banana', 'banana', 'steak', 'steak', 't-bone', 't-bone', NA, NA))

# A tibble: 10 x 3
   index food  type  
   <int> <chr> <chr> 
 1     1 fruit apple 
 2     4 fruit apple 
 3     1 fruit banana
 4     2 fruit banana
 5     5 meat  steak 
 6     3 meat  steak 
 7     2 meat  t-bone
 8     3 meat  t-bone
 9     3 rice  NA    
10     1 beans NA

The desired output looks like this:

output<-structure(list(index = c(1L, 1L, 4L, 2L, 5L, 3L, 3L), fruit = c("apple", 
"banana", "apple", "banana", NA, NA, NA), meat = c(NA, NA, NA, 
"t-bone", "steak", "steak", "t-bone"), food = c("beans", NA, 
NA, NA, NA, "rice", NA)), row.names = c(NA, -7L), class = c("tbl_df", 
"tbl", "data.frame"))

output
# A tibble: 7 x 4
  index fruit  meat   food 
  <int> <chr>  <chr>  <chr>
1     1 apple  NA     beans
2     1 banana NA     NA   
3     4 apple  NA     NA   
4     2 banana t-bone NA   
5     5 NA     steak  NA   
6     3 NA     steak  rice 
7     3 NA     t-bone NA

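I can do the pivot manually by moving the 'rice' and 'beans' values into the 'type' column and creating a corresponding 'food' level in 'food'. Besides being a laborious and unsystematic transformation, it gives me an unexpected output with duplicated 'beans' and 'rice' values: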

df %>%
  mutate(type = coalesce(type, food),
         food = replace(food, type %in% c('rice', 'beans'), 'food')) %>%
  pivot_wider(id_cols = index, names_from = c(food), values_from = c(type)) %>%
  unnest
# A tibble: 7 x 4
  index fruit  meat   food 
  <int> <chr>  <chr>  <chr>
1     1 apple  NA     beans
2     1 banana NA     beans ###<-
3     4 apple  NA     NA   
4     2 banana t-bone NA   
5     5 NA     steak  NA   
6     3 NA     steak  rice 
7     3 NA     t-bone rice ###<-

I would like to know whether there is a simpler and safer way to do this on the fly with pivot_wider.

Answer 1

You can use:

library(dplyr)
library(tidyr)

df %>%
  mutate(type = if_else(food %in% c('rice', 'beans'), food, type), 
         food = replace(food, food %in% c('rice', 'beans'), 'food')) %>%
  group_by(index, food) %>%
  mutate(row  = row_number()) %>%
  ungroup %>%
  pivot_wider(names_from = food, values_from = type) %>%
  select(-row)
  
#  index fruit  meat   food 
#  <int> <chr>  <chr>  <chr>
#1     1 apple  NA     beans
#2     4 apple  NA     NA   
#3     1 banana NA     NA   
#4     2 banana t-bone NA   
#5     5 NA     steak  NA   
#6     3 NA     steak  rice 
#7     3 NA     t-bone NA
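
The `row` helper is what keeps the result flat. As a minimal sketch, assuming the same `df` as in the question (with the index values typed in from the printed tibble rather than drawn with `sample()`): after the recoding step, some `index`/`food` pairs occur more than once, so they cannot identify a single cell on their own. Adding `row` makes every key unique. The names `recoded`, `dups`, `keyed` and `n_dups_keyed` are only illustrative.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, with index copied from the printed tibble
df <- tibble(index = c(1L, 4L, 1L, 2L, 5L, 3L, 2L, 3L, 3L, 1L),
             food = c(rep('fruit', 4), rep('meat', 4), 'rice', 'beans'),
             type = c('apple', 'apple', 'banana', 'banana',
                      'steak', 'steak', 't-bone', 't-bone', NA, NA))

# Recoding step from the answer: move rice/beans into type, relabel food
recoded <- df %>%
  mutate(type = if_else(food %in% c('rice', 'beans'), food, type),
         food = replace(food, food %in% c('rice', 'beans'), 'food'))

# index/food pairs that appear more than once do not identify a single cell
dups <- recoded %>%
  count(index, food) %>%
  filter(n > 1)

# With the per-group row number added, every key is unique
keyed <- recoded %>%
  group_by(index, food) %>%
  mutate(row = row_number()) %>%
  ungroup()

n_dups_keyed <- keyed %>%
  count(index, food, row) %>%
  filter(n > 1) %>%
  nrow()
```

Here `dups` holds the pairs (1, fruit) and (3, meat), which are the two indexes where the question's output repeated 'beans' and 'rice', while `n_dups_keyed` is 0.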

Answer 2

We can use coalesce and replace:

library(dplyr)
library(tidyr)
library(data.table)
df %>% 
    mutate(type = coalesce(type, food), 
           food =  replace(food, food == type, 'food'),
            rn = rowid(index, food)) %>%
    pivot_wider(names_from = food, values_from = type) %>% 
    select(-rn)
# A tibble: 7 x 4
  index fruit  meat   food 
  <int> <chr>  <chr>  <chr>
1     1 apple  <NA>   beans
2     4 apple  <NA>   <NA> 
3     1 banana <NA>   <NA> 
4     2 banana t-bone <NA> 
5     5 <NA>   steak  <NA> 
6     3 <NA>   steak  rice 
7     3 <NA>   t-bone <NA>
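
If the same reshaping is needed for different sets of levels, the steps above could be wrapped in a small function. This is only a sketch: `pivot_wider_keep()` is a hypothetical helper, not part of tidyr, and it assumes the columns are named `index`, `food` and `type` as in the question. The `df` below is the question's data with the index values typed in from the printed tibble.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, with index copied from the printed tibble
df <- tibble(index = c(1L, 4L, 1L, 2L, 5L, 3L, 2L, 3L, 3L, 1L),
             food = c(rep('fruit', 4), rep('meat', 4), 'rice', 'beans'),
             type = c('apple', 'apple', 'banana', 'banana',
                      'steak', 'steak', 't-bone', 't-bone', NA, NA))

# Hypothetical helper: levels listed in `keep` stay together in one column
# named `keep_col`; every other level of `food` becomes its own column.
pivot_wider_keep <- function(data, keep, keep_col = "food") {
  data %>%
    mutate(type = if_else(food %in% keep, food, type),
           food = if_else(food %in% keep, keep_col, food)) %>%
    group_by(index, food) %>%
    mutate(row = row_number()) %>%   # makes each index/food key unique
    ungroup() %>%
    pivot_wider(names_from = food, values_from = type) %>%
    select(-row)
}

out <- pivot_wider_keep(df, keep = c('rice', 'beans'))
```

With the example data, `out` has the same seven rows and the columns `index`, `fruit`, `meat` and `food` as the outputs shown in the answers.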
