Reshape dataframe without “timevar” and multiple value columns from long to wide format

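I ran into a problem: I want to turn two columns into a single row per key. I have a table that consists of a key, activities, and the corresponding interval for each activity.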

set.seed(2)
(data <- data.frame(key=rep(LETTERS, each=4)[1:8], 
                   acitity=c("watering", "remove weeds", "cut", "remove leaf", "watering", "remove weeds", "cut", "fertilize"), 
                   intervall= sample(1:8)))
#  key      acitity intervall
#1   A     watering         2
#2   A remove weeds         3
#3   A          cut         1
#4   A  remove leaf         6
#5   B     watering         4
#6   B remove weeds         7
#7   B          cut         8
#8   B    fertilize         5

My goal is to get one row per key, with the activities and their intervals written one after another.

Desired output:

key  activity  intervall  activity_1    intervall_1  activity_2  intervall_2  activity_3   intervall_3
A    watering  5          remove weeds  7            cut         6            remove leaf  1
B    watering  8          remove weeds  4            cut         2            fertilize    3

I tried variations of spread() and transpose(), but since my skills are not that advanced, I did not get any further with either of them.

Thanks a lot for your help!

Does this work:

library(dplyr)
library(tidyr)
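# The question's data frame spells the column `acitity`; rename it first so
# that the column referenced in pivot_wider() actually exists.
data %>% 
  rename(activity = acitity) %>% 
  group_by(key) %>% 
  # number the rows within each key; this stands in for the missing "timevar"
  mutate(activity_count = row_number(),
         interval_count = row_number()) %>% 
  pivot_wider(id_cols = key,
              names_from = c(activity_count, interval_count),
              values_from = c(activity, intervall))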
# A tibble: 2 x 9
# Groups:   key [2]
#  key   activity_1_1 activity_2_2 activity_3_3 activity_4_4 intervall_1_1 intervall_2_2 intervall_3_3 intervall_4_4
#  <chr> <chr>        <chr>        <chr>        <chr>                <int>         <int>         <int>         <int>
#1 A     watering     remove weeds cut          remove leaf              5             7             6             1
#2 B     watering     remove weeds cut          fertilize                8             4             2             3
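
If the interleaved column order from the question matters (activity, interval, activity, interval, ...), a single row index together with `names_vary = "slowest"` may be enough. This is only a sketch under a few assumptions: it needs tidyr 1.2.0 or later (where `names_vary` was introduced), it hard-codes the interval values shown in the output above instead of relying on `sample()`, and the index column name `n` is my own choice.

```r
library(dplyr)
library(tidyr)

# Same shape as the question's data; intervals copied from the output above
# (assumption: hard-coded so the example does not depend on the RNG).
data <- data.frame(
  key = rep(c("A", "B"), each = 4),
  acitity = c("watering", "remove weeds", "cut", "remove leaf",
              "watering", "remove weeds", "cut", "fertilize"),
  intervall = c(5, 7, 6, 1, 8, 4, 2, 3)
)

wide <- data %>%
  group_by(key) %>%
  mutate(n = row_number()) %>%   # per-key row index, the missing "timevar"
  ungroup() %>%
  pivot_wider(id_cols = key,
              names_from = n,
              values_from = c(acitity, intervall),
              names_vary = "slowest")  # group columns by index, not by value column

names(wide)
```

With `names_vary = "slowest"` the columns come out as `acitity_1`, `intervall_1`, `acitity_2`, `intervall_2`, and so on, which is closer to the layout asked for than the default ordering.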

Here is a base R option using reshape:

reshape(
  within(data, q <- ave(seq_along(key), key, FUN = seq_along)),
  direction = "wide",
  idvar = "key",
  timevar = "q"
)

which gives

  key acitity.1 intervall.1    acitity.2 intervall.2 acitity.3 intervall.3
1   A  watering           5 remove weeds           7       cut           6
5   B  watering           8 remove weeds           4       cut           2
    acitity.4 intervall.4
1 remove leaf           1
5   fertilize           3
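
To see what the `ave()` call contributes on its own, here is a small sketch with a stand-alone `key` vector (an assumption for illustration, mirroring the question's keys). It builds a running index within each key, and that index is what `reshape()` then uses as `timevar`.

```r
# Stand-alone key vector mirroring the question's data (illustrative assumption)
key <- rep(c("A", "B"), each = 4)

# seq_along(key) supplies a vector of the right length; ave() applies
# seq_along() separately within each group of `key`, so the counter
# restarts at 1 for every new key.
q <- ave(seq_along(key), key, FUN = seq_along)
q
# [1] 1 2 3 4 1 2 3 4
```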

A third option uses dcast from data.table. We create the missing 'time variable' with rowid(key):

library(data.table)
# convert data to a data.table object
setDT(data)
# reshape
dcast(data, key  ~ rowid(key), value.var = c("acitity", "intervall"))

Result:

#    key acitity_1    acitity_2 acitity_3   acitity_4 intervall_1 intervall_2 intervall_3 intervall_4
#1:   A  watering remove weeds       cut remove leaf           5           7           6           1
#2:   B  watering remove weeds       cut   fertilize           8           4           2           3