How to modify this data.table code to show balance transitions instead of event frequency transitions?
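I am using the MWE code below to generate a data frame of transition frequencies. It works well and is fast. I am new to the data.table package and have not been able to convert it to show balance transitions instead.
First, here are the example data frame, the transition frequency output from running the function (using the two time measures "Period_1" and "Period_2"), and the underlying MWE code for these functions, all of which work as intended for transition frequencies: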
> data
ID Period_1 Period_2 Values State
1: 1 1 2020-01 5 X0
2: 1 2 2020-02 10 X1
3: 1 3 2020-03 15 X2
4: 2 1 2020-04 0 X0
5: 2 2 2020-05 2 X2
6: 2 3 2020-06 4 X0
7: 3 1 2020-02 3 X2
8: 3 2 2020-03 6 X1
9: 3 3 2020-04 9 X0
> setDT(data)
> num_transit(data, "2020-02", "2020-04",refvar="Period_2")
to_state X0 X1 X2
1: X0 NA NA 1
2: X1 NA NA NA
3: X2 NA NA NA
> setDT(data)
> num_transit(data, 1,3, refvar="Period_1")
to_state X0 X1 X2
1: X0 1 NA 1
2: X1 NA NA NA
3: X2 1 NA NA
library(data.table)

data <-
  data.frame(
    ID = c(1,1,1,2,2,2,3,3,3),
    Period_1 = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
    Period_2 = c("2020-01","2020-02","2020-03","2020-04","2020-05","2020-06","2020-02","2020-03","2020-04"),
    Values = c(5, 10, 15, 0, 2, 4, 3, 6, 9),
    State = c("X0","X1","X2","X0","X2","X0", "X2","X1","X0")
  )

# Count state transitions between the `from` and `to` periods of `refvar`
num_transit <- function(x,from,to,refvar="Period_2", return_matrix=T) {
  # keep only IDs with more than one row in the selected periods
  res <- x[get(refvar) %in% c(to,from), if(.N>1) .SD, by=ID, .SDcols = c(refvar, "State")]
  # number the rows within each ID (1 = first row, 2 = second row)
  res <- res[, id:=1:.N, by=ID]
  # one row per ID, then count each pair of states
  res <- dcast(res, ID~id, value.var="State")[,.N, .(`1`,`2`)]
  setnames(res,c("from","to", "ct"))
  if(return_matrix) return(convert_transits_to_matrix(res, unique(x$State)))
  res
}

# Reshape the long (from, to, ct) table into a to-by-from matrix
convert_transits_to_matrix <- function(transits,states) {
  m = matrix(NA, nrow=length(states), ncol=length(states), dimnames=list(states,states))
  m[as.matrix(transits[,.(to,from)])] <- transits$ct
  m = data.table(m)[,to_state:=rownames(m)]
  setcolorder(m,"to_state")
  return(m[])
}
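This is where I need help. I am trying to modify the above (call it "val_transit") to show the transition of "Values" into the new state. The output should look like this, using the data data frame and running a transition on Period_1 from 1 to 3 (that is, val_transit(data, 1,3, refvar="Period_1")):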
to_state X0 X1 X2
1: X0 4 NA 9
2: X1 NA NA NA
3: X2 15 NA NA
Any suggestions? This is a follow-up to the transition frequency post.
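Sure, here is an update to the previous num_transit function. Note the differences:
1. .SDcols in the first line of the function includes both State and Values.
2. value.var in the dcast call includes both State and Values.
3. As a result of (2), I group explicitly on State_1 and State_2 rather than `1` and `2`, and the summarizing operation is the sum of Values (in the code, Values_2, the value at the second of the two periods).
4. The setnames call is adjusted so that, if return_matrix=F, the last column is returned as Values.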
val_transit <- function(x,from,to,refvar="Period_2", return_matrix=T) {
  res <- x[get(refvar) %in% c(to,from), if(.N>1) .SD, by=ID, .SDcols = c(refvar, "State", "Values")]
  res <- res[, id:=1:.N, by=ID]
  res <- dcast(res, ID~id, value.var=c("State", "Values"))[,.(Values=sum(Values_2,na.rm=T)), .(State_1, State_2)]
  setnames(res,c("from","to", "Values"))
  if(return_matrix) return(convert_transits_to_matrix(res, unique(x$State)))
  res
}
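To see where the State_1, State_2 and Values_2 columns come from, it may help to look at the intermediate wide table on its own. The following is only a sketch using the question's sample data and the Period_1 case (1 to 3); the names res, wide and totals are mine and are not part of the function.

```r
library(data.table)

# Sample data from the question (Period_2 omitted, it is not needed here)
data <- data.table(
  ID       = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Period_1 = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  Values   = c(5, 10, 15, 0, 2, 4, 3, 6, 9),
  State    = c("X0", "X1", "X2", "X0", "X2", "X0", "X2", "X1", "X0")
)

# Keep the two periods of interest and number the rows within each ID
res <- data[Period_1 %in% c(1, 3)]
res[, id := seq_len(.N), by = ID]

# With several value.var columns, dcast builds one column per
# (value.var, id) combination, named <value.var>_<id>
wide <- dcast(res, ID ~ id, value.var = c("State", "Values"))
print(names(wide))

# Sum the second-period values for each (first state, second state) pair
totals <- wide[, .(Values = sum(Values_2, na.rm = TRUE)), by = .(State_1, State_2)]
print(totals)
```

The totals table here is what val_transit returns (after renaming) when return_matrix=F.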
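Note that below I have made a small update to my convert_transits_to_matrix function so that this helper can be used for both val_transit() and num_transit(). The minor update is on line 2, where I now use transits[[3]], so that it works regardless of the actual name of the third column in the transits object.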
convert_transits_to_matrix <- function(transits,states) {
  m = matrix(NA, nrow=length(states), ncol=length(states), dimnames=list(states,states))
  m[as.matrix(transits[,.(to,from)])] <- transits[[3]]
  m = data.table(m)[,to_state:=rownames(m)]
  setcolorder(m,"to_state")
  return(m[])
}
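The assignment on line 2 of the helper relies on base R's matrix indexing by a two-column character matrix, where each row is a (row name, column name) pair. Here is a minimal base-R sketch of just that step, with the (to, from) pairs and values typed in by hand from the Period_1 example; the names states, idx and m are only for illustration.

```r
states <- c("X0", "X1", "X2")
m <- matrix(NA_real_, nrow = 3, ncol = 3, dimnames = list(states, states))

# Each row of idx is one cell to fill: first column = row name (to),
# second column = column name (from)
idx <- cbind(to   = c("X2", "X0", "X0"),
             from = c("X0", "X0", "X2"))

# Fill the three cells in one vectorised assignment; all other cells stay NA
m[idx] <- c(15, 4, 9)
print(m)
```

Because the cells are addressed by name, the order of the rows in the transits table does not matter, and any state pair that never occurs is simply left as NA.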
Usage:
val_transit(data,"2020-02","2020-04", "Period_2")
to_state X0 X1 X2
<char> <num> <num> <num>
1: X0 NA NA 9
2: X1 NA NA NA
3: X2 NA NA NA
val_transit(data,1,3, "Period_1")
to_state X0 X1 X2
<char> <num> <num> <num>
1: X0 4 NA 9
2: X1 NA NA NA
3: X2 15 NA NA
Make sure your data is a data.table (call setDT(data)) before passing it to these functions.
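As a quick illustration of that last point, setDT() converts a data.frame to a data.table in place, so no reassignment is needed. This is a small sketch with a made-up two-row data frame named df, not the question's data.

```r
library(data.table)

df <- data.frame(ID = c(1, 2), State = c("X0", "X1"))
before <- is.data.table(df)   # a plain data.frame at this point

# Convert by reference; df itself becomes a data.table
setDT(df)
after <- is.data.table(df)

print(c(before = before, after = after))
```

The functions above use data.table syntax inside x[...] (by=, .SD, .SDcols), which is why they expect a data.table rather than a plain data.frame.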