Creating simple code to transform my dataset in R from wide to long
This is probably a simple question, but I have no knack for writing code in R.
I have a right-censored dataset that looks like this:
dput(head(books)):
structure(list(id = 1:6, time = c(29, 30, 26, 30, 30, 29
), event = c(1, 0, 1, 0, 0, 1), z1 = c("early", "late",
"early", "late", "late", "early"), z2 = c(9, 6, 4, 9,
9, 5), z3 = c("0B", "1B", "0C", "0C", "0C", "0C"), burrowed = c(1,
1, 1, 0, 1, 1), time.burrowed = c(5, 2, 6, 30, 1, 8),
returned = c(1, 0, 0, 0, 1, 0), time.returned = c(20, 30, 21,
30, 28, 29)), row.names = c(NA, 6L), class = "data.frame")
I need it to look like this:
head(books)
id start stop checkedout event z1 z2
1 0 5 0 0 early 9
1 5 20 1 0 early 9
1 20 30 0 1 early 9 etc.
2
3
4
4
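Basically, I want to combine the borrowed and returned times into intervals that record whether the book is checked out.
What I have so far...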
start <- 0
stop <- numeric(length=0)
checkedout <- 0
event <- numeric(length=0)
if (book$burrowed[1] == 1) {
  start <- c(start, book$time.burrowed[1])
  stop <- c(stop, book$time.burrowed[1])
  checkedout <- c(checkedout, 1)
  event <- c(event, 0)
}
if (book$returned[1] == 1) {
  start <- c(start, book$time.returned[1])
  stop <- c(stop, book$time.returned[1])
  checkedout <- c(checkedout, 0)
  event <- c(event, 0)
}
stop <- c(stop, book$time[1])
event <- c(event, book$event[1])
temp.frame <- data.frame(id=book$id[1],start,stop,event,checkedout)
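Thanks for providing the data; it helps a lot. However, it is still unclear how certain columns, such as event, should be set (the result of your code looks different from the example above).
The following may help you get started. I'm guessing you will want to add specific conditional rules to shape the result further.
First, I would create a function that handles a single row of data: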
my_fun <- function(x) {
  # x is a one-row data frame; build the three numeric interval rows first
  df <- as.data.frame(rbind(
    c(id = x[["id"]], start = 0, stop = x[["time.burrowed"]], event = 0, checkedout = 0),
    c(id = x[["id"]], start = x[["time.burrowed"]], stop = x[["time.returned"]], event = 0, checkedout = 1),
    c(id = x[["id"]], start = x[["time.returned"]], stop = x[["time"]], event = x[["event"]], checkedout = 0)
  ))
  # then attach the covariates, which are constant within an id
  df <- cbind(
    df,
    z1 = x[["z1"]],
    z2 = x[["z2"]],
    z3 = x[["z3"]]
  )
  return(df)
}
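In this function, you state explicitly that a single row of data should become three rows. Specifically, start and stop are based on time.burrowed (a typo?) and time.returned. In addition, the first start is hard-coded to 0 and the last stop is set to time. Here you can also spell out what you want for event and checkedout. Following your example code, I set event to 0 for the first two rows and to the row's own event value for the third row; checkedout is 1 only in the middle row. Finally, the three rows are combined with rbind.
After that, the other columns can be added with cbind. These include z1, z2, and so on, which are constant across the rows.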
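As a small illustration of why the function builds the numeric columns with rbind first and only then adds the z columns with cbind, here is a minimal sketch. The values are toy numbers chosen for the example, not taken from `book`.

```r
# Toy values standing in for one id (assumed for illustration only).
rows <- rbind(
  c(id = 1, start = 0, stop = 5),
  c(id = 1, start = 5, stop = 20)
)

# The names of the vectors become the column names of the data frame.
df <- as.data.frame(rows)

# A single value passed to cbind is recycled to every row.
df <- cbind(df, z1 = "early")

print(df)
```

Putting a character value such as z1 inside c() together with the numbers would coerce the whole vector to character, so keeping the covariates out of the rbind step lets start and stop stay numeric.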
You can try the function on a single row, for example:
R> my_fun(book[1,])
id start stop event checkedout z1 z2 z3
1 1 0 5 0 0 early 9 0B
2 1 5 20 0 1 early 9 0B
3 1 20 29 1 0 early 9 0B
Once you are happy with the function, you can apply it to every row of the data frame:
do.call(rbind, lapply(1:nrow(book), function(x) my_fun(book[x,])))
Output
id start stop event checkedout z1 z2 z3
1 1 0 5 0 0 early 9 0B
2 1 5 20 0 1 early 9 0B
3 1 20 29 1 0 early 9 0B
4 2 0 2 0 0 late 6 1B
5 2 2 30 0 1 late 6 1B
6 2 30 30 0 0 late 6 1B
7 3 0 6 0 0 early 4 0C
8 3 6 21 0 1 early 4 0C
9 3 21 26 1 0 early 4 0C
10 4 0 30 0 0 late 9 0C
11 4 30 30 0 1 late 9 0C
12 4 30 30 0 0 late 9 0C
13 5 0 1 0 0 late 9 0C
14 5 1 28 0 1 late 9 0C
15 5 28 30 0 0 late 9 0C
16 6 0 8 0 0 early 5 0C
17 6 8 29 0 1 early 5 0C
18 6 29 29 1 0 early 5 0C
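Some rows in this output have start equal to stop (rows 6, 11, 12 and 18), because every id is forced into three intervals even when the book was never borrowed or never returned. For id 6 the event even lands on such a zero-length row. One possible conditional rule, sketched below, is to cut an interval only where the burrowed or returned flag is 1 and to attach event to the last interval. This is a sketch under assumptions, not a definitive answer: `to_intervals` is a hypothetical name, z3 is assumed to be a character column, and returned == 1 is assumed to imply burrowed == 1.

```r
# The data from the question, with z3 stored as character (an assumption).
book <- data.frame(
  id = 1:6,
  time = c(29, 30, 26, 30, 30, 29),
  event = c(1, 0, 1, 0, 0, 1),
  z1 = c("early", "late", "early", "late", "late", "early"),
  z2 = c(9, 6, 4, 9, 9, 5),
  z3 = c("0B", "1B", "0C", "0C", "0C", "0C"),
  burrowed = c(1, 1, 1, 0, 1, 1),
  time.burrowed = c(5, 2, 6, 30, 1, 8),
  returned = c(1, 0, 0, 0, 1, 0),
  time.returned = c(20, 30, 21, 30, 28, 29),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand one row of `book` into one row per interval.
to_intervals <- function(x) {
  # Change points only count when the matching flag is 1.
  cuts <- c(
    if (x[["burrowed"]] == 1) x[["time.burrowed"]],
    if (x[["returned"]] == 1) x[["time.returned"]]
  )
  start <- c(0, cuts)
  stop <- c(cuts, x[["time"]])
  # The book starts on the shelf (0); the state flips at each change point.
  checkedout <- (seq_along(start) - 1) %% 2
  # Only the last interval can end in the event.
  event <- c(rep(0, length(cuts)), x[["event"]])
  data.frame(
    id = x[["id"]], start, stop, checkedout, event,
    z1 = x[["z1"]], z2 = x[["z2"]], z3 = x[["z3"]]
  )
}

long <- do.call(
  rbind,
  lapply(seq_len(nrow(book)), function(i) to_intervals(book[i, ]))
)
print(long)
```

With this rule an id contributes one, two or three rows depending on its flags, and the total number of events in the long table matches the wide one. A change point that coincides with time 0 or with the final time would still yield a zero-length row, so a filter such as `long[long$stop > long$start, ]` may still be needed for other data.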