R dplyr pivot wider with duplicates and generate variable names

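How can I get from

df<-data.frame(id=c("A", "B", "B"), res=c("one", "two", "three"))
df

to

df.output<-data.frame(id=c("A", "B"), res1=c("one", "two"), res2=c(NA, "three"))
df.output

using dplyr?

I do not know a priori how many times each id is duplicated (in this example B occurs twice), so the variables in the output data frame have to be generated on the fly.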

You only need to create a row identifier within each id, which you can do with dplyr, and then use tidyr::pivot_wider() to generate all the resX variables.

library(dplyr)
library(tidyr)

df %>%
  group_by(id) %>%
  mutate(
    no = row_number()
  ) %>%
  ungroup() %>%
  pivot_wider(
    id_cols = id,
    names_from = no,
    names_prefix = "res",
    values_from = res
  )
#> # A tibble: 2 × 3
#>   id    res1  res2 
#>   <chr> <chr> <chr>
#> 1 A     one   <NA> 
#> 2 B     two   three
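
Because the column names come from the row identifier, the same pipeline should cope with any number of duplicates without changes. The following is a minimal sketch under that assumption; the data frame `df3` and the result name `wide3` are hypothetical and only serve as an illustration with one id occurring three times.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: "B" now occurs three times instead of two
df3 <- data.frame(
  id  = c("A", "B", "B", "B"),
  res = c("one", "two", "three", "four")
)

wide3 <- df3 %>%
  group_by(id) %>%
  mutate(no = row_number()) %>%   # running counter 1, 2, 3, ... within each id
  ungroup() %>%
  pivot_wider(
    id_cols = id,
    names_from = no,              # one output column per counter value
    names_prefix = "res",
    values_from = res
  )

names(wide3)
```

The result has one row per id and a res3 column in addition to res1 and res2; ids with fewer occurrences are filled with NA.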

Using data.table::dcast:

library(data.table)
dcast(setDT(df), id ~ rowid(id, prefix = "res"), value.var = "res")

   id res1  res2
1:  A  one  <NA>
2:  B  two three

A base R option using reshape + ave

reshape(
  transform(df, q = ave(id, id, FUN = seq_along)),
  direction = "wide",
  idvar = "id",
  timevar = "q"
)

which gives

  id res.1 res.2
1  A   one  <NA>
2  B   two three
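
The column names here are res.1 and res.2 rather than the res1 and res2 asked for in the question. A possible variation, sketched below, passes `sep = ""` to reshape() so that no separator is placed between the variable name and the counter. It also builds the counter from `seq_along(id)` instead of `id` itself, which is an assumption made here so that the counter stays numeric even if id is a factor. The result name `wide` is hypothetical.

```r
df <- data.frame(id = c("A", "B", "B"), res = c("one", "two", "three"))

wide <- reshape(
  # q counts occurrences within each id: 1 for A, then 1 and 2 for B
  transform(df, q = ave(seq_along(id), id, FUN = seq_along)),
  direction = "wide",
  idvar = "id",
  timevar = "q",
  sep = ""          # join "res" and the counter without a dot
)

names(wide)
```

With this change the columns should come out as id, res1 and res2, matching the desired df.output.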

Another pivot_wider approach. We build the new column names before pivoting:

library(dplyr)
library(tidyr)

df %>% 
  group_by(id) %>% 
  mutate(names = paste0("res", row_number())) %>% 
  pivot_wider(
    names_from = names,
    values_from = res
  )
  id    res1  res2 
  <chr> <chr> <chr>
1 A     one   NA   
2 B     two   three