Creating simple code to transform my dataset in R from wide to long

This is probably a simple question, but I have no knack at all for writing code in R. I have a right-censored dataset that looks like this:

dput(head(books)):

structure(list(id = 1:6, time = c(29, 30, 26, 30, 30, 29), 
    event = c(1, 0, 1, 0, 0, 1), z1 = c("early", "late", "early", 
    "late", "late", "early"), z2 = c(9, 6, 4, 9, 9, 5), z3 = c("0B", 
    "1B", "0C", "0C", "0C", "0C"), burrowed = c(1, 1, 1, 0, 1, 1), 
    time.burrowed = c(5, 2, 6, 30, 1, 8), returned = c(1, 0, 0, 
    0, 1, 0), time.returned = c(20, 30, 21, 30, 28, 29)), 
    row.names = c(NA, 6L), class = "data.frame")

I need it to look like this:


head(books)

id     start     stop     checkedout     event     z1     z2
1       0         5           0           0       early   9
1       5        20           1           0       early   9
1      20        30           0           1       early   9   etc.
2    
3
4
4

Basically, this merges burrowed and returned into time intervals that indicate whether the book is checked out.

What I have so far:

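    # Build the intervals for the first row only
    start <- 0
    stop <- numeric(length = 0)
    checkedout <- 0
    event <- numeric(length = 0)
    # If the book was borrowed, close the first interval and open a checked-out one
    if (book$burrowed[1] == 1) {
      start <- c(start, book$time.burrowed[1])
      stop <- c(stop, book$time.burrowed[1])
      checkedout <- c(checkedout, 1)
      event <- c(event, 0)
    }
    # If the book was returned, close the checked-out interval and open a new one
    if (book$returned[1] == 1) {
      start <- c(start, book$time.returned[1])
      stop <- c(stop, book$time.returned[1])
      checkedout <- c(checkedout, 0)
      event <- c(event, 0)
    }
    # The last interval ends at the follow-up time and carries the event indicator
    stop <- c(stop, book$time[1])
    event <- c(event, book$event[1])
    temp.frame <- data.frame(id = book$id[1], start, stop, event, checkedout)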

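Thanks for providing the data, that helps. However, it is still unclear how certain columns should be set, for example event (the result of your code looks different from the example above).

The following may help you get started. I suspect you will want to add specific conditional rules to shape the result further.

First, I would create a function that handles a single row of data: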

my_fun <- function(x) {
  df <- as.data.frame(rbind(
    c(id = x[["id"]], start = 0, stop = x[["time.burrowed"]], event = 0, checkedout = 0),
    c(id = x[["id"]], start = x[["time.burrowed"]], stop = x[["time.returned"]], event = 0, checkedout = 1),
    c(id = x[["id"]], start = x[["time.returned"]], stop = x[["time"]], event = x[["event"]], checkedout = 0)
  ))
  df <- cbind(
    df,
    z1 = x[["z1"]],
    z2 = x[["z2"]],
    z3 = x[["z3"]]
  )
  return(df)
}

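In this function, you state explicitly that you want to turn a single row of data into three rows. Specifically, start and stop are based on time.burrowed (a typo for time.borrowed?) and time.returned. The first start is hard-coded to 0 and the last stop is set to time. This is also where you specify what you want for event and checkedout. Following your example code, I set event to 0 for the first two rows and to the value of event for the third row; checkedout is 1 only in the middle row. Finally, the three rows are combined with rbind.

After that, the other columns can be added with cbind. These include z1, z2, and so on, which are constant across the three rows.

You can try the function on a single row, for example: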
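To see why the rbind and cbind pattern works, here is a minimal standalone sketch. The values are made up to resemble the first row of the data; the point is only that rbind stacks named vectors into rows, and that a length-one value passed to cbind is recycled to every row.

```r
# Three named numeric vectors are stacked into a 3-row matrix,
# which is then converted to a data frame.
m <- rbind(
  c(id = 1, start = 0,  stop = 5),
  c(id = 1, start = 5,  stop = 20),
  c(id = 1, start = 20, stop = 29)
)
df <- as.data.frame(m)

# A single value given to cbind is recycled to all rows,
# which is how the constant covariates (z1, z2, z3) get attached.
df <- cbind(df, z1 = "early")
print(df)
```

Keeping the numeric columns in the rbind step and adding the character columns afterwards avoids coercing everything to character, which would happen if z1 were placed inside the c() calls.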

R> my_fun(book[1,])
  id start stop event checkedout    z1 z2 z3
1  1     0    5     0          0 early  9 0B
2  1     5   20     0          1 early  9 0B
3  1    20   29     1          0 early  9 0B

Once you are happy with the function, you can apply it to every row of the data frame:

do.call(rbind, lapply(1:nrow(book), function(x) my_fun(book[x,])))

Output

   id start stop event checkedout    z1 z2 z3
1   1     0    5     0          0 early  9 0B
2   1     5   20     0          1 early  9 0B
3   1    20   29     1          0 early  9 0B
4   2     0    2     0          0  late  6 1B
5   2     2   30     0          1  late  6 1B
6   2    30   30     0          0  late  6 1B
7   3     0    6     0          0 early  4 0C
8   3     6   21     0          1 early  4 0C
9   3    21   26     1          0 early  4 0C
10  4     0   30     0          0  late  9 0C
11  4    30   30     0          1  late  9 0C
12  4    30   30     0          0  late  9 0C
13  5     0    1     0          0  late  9 0C
14  5     1   28     0          1  late  9 0C
15  5    28   30     0          0  late  9 0C
16  6     0    8     0          0 early  5 0C
17  6     8   29     0          1 early  5 0C
18  6    29   29     1          0 early  5 0C
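Note that the output above contains zero-length intervals (for example rows 6, 11 and 12), because my_fun always produces three rows. If those are not wanted, one possible conditional rule, following the if checks in your own attempt, is to add a cut point only when burrowed or returned equals 1. The sketch below assumes that a flag of 0 means the corresponding event never happened during follow-up; my_fun2 is a hypothetical name, and the data frame is rebuilt from your dput so the example runs on its own.

```r
# Data rebuilt from the dput in the question (z3 values quoted as strings).
book <- data.frame(
  id = 1:6,
  time = c(29, 30, 26, 30, 30, 29),
  event = c(1, 0, 1, 0, 0, 1),
  z1 = c("early", "late", "early", "late", "late", "early"),
  z2 = c(9, 6, 4, 9, 9, 5),
  z3 = c("0B", "1B", "0C", "0C", "0C", "0C"),
  burrowed = c(1, 1, 1, 0, 1, 1),
  time.burrowed = c(5, 2, 6, 30, 1, 8),
  returned = c(1, 0, 0, 0, 1, 0),
  time.returned = c(20, 30, 21, 30, 28, 29)
)

# Hypothetical variant of my_fun(): x is a one-row data frame.
my_fun2 <- function(x) {
  cuts  <- 0   # interval start times; the first interval starts at 0
  state <- 0   # checkedout status during each interval
  if (x[["burrowed"]] == 1) {
    cuts  <- c(cuts, x[["time.burrowed"]])
    state <- c(state, 1)
  }
  if (x[["returned"]] == 1) {
    cuts  <- c(cuts, x[["time.returned"]])
    state <- c(state, 0)
  }
  n <- length(cuts)
  data.frame(
    id = x[["id"]],
    start = cuts,
    stop = c(cuts[-1], x[["time"]]),          # each interval ends where the next starts
    event = c(rep(0, n - 1), x[["event"]]),   # only the last interval carries the event
    checkedout = state,
    z1 = x[["z1"]], z2 = x[["z2"]], z3 = x[["z3"]]
  )
}

long <- do.call(rbind, lapply(seq_len(nrow(book)), function(i) my_fun2(book[i, ])))
print(long)
```

With this rule, a book that was never borrowed (id 4) contributes a single interval, and a book that was borrowed but never returned (id 2, 3 and 6) ends in a checked-out interval. Whether that is the right coding for your analysis depends on what the 0 flags mean in your data.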