Transforming a wide database to long, grouping variables in R
I want to restructure my database. What I want is for each row of my database to become one individual.
I would like it to look like this. Example:
ID massaseca Fator Anexo Teor Tempo
1 334,68 AM c 0 0
1 344,19 AM c 0 10
1 347,32 AM c 0 20
1 350,2 AM c 0 30
1 352,52 AM c 0 40
. . . . . .
. . . . . .
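I looked up some examples on the forum, but I still could not solve my problem. Here are my two attempts. When I run the code below, it overwrites some variables and the result is not in the format I want.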
################# CASE 1 ###########
####################################
library(reshape2)
long <- melt(dados1, id.vars = c("Teor", "Fator"))
melt(dados1, id.vars = 1:2)
melt(dados1, measure.vars = 4:45)
# as.character() only converts its first argument, so the values go inside c()
melt(dados1, measure.vars = as.character(c(10, 20, 30,
                                           40, 50, 60, 70, 80,
                                           90, 105, 120, 135, 150,
                                           170, 190, 210, 230, 250,
                                           270, 290, 310, 330, 350,
                                           360, 405, 450, 510, 570,
                                           630, 690, 750, 810, 870,
                                           930, 990, 1050, 1110, 1170,
                                           1230, 1290, 1350)))
################# CASE 2 ###########
####################################
library(data.table)
long1 <- melt(setDT(dados1), id.vars = c("Teor", "Fator"), variable.name = "Tempo")
long1
melt(setDT(dados1), id.vars = 1:2, variable.name = "Tempo")
melt(setDT(dados1), measure.vars = 4:45, variable.name = "Tempo")
# as.character() only converts its first argument, so the values go inside c()
melt(setDT(dados1), measure.vars = as.character(c(10, 20, 30,
                                                  40, 50, 60, 70, 80,
                                                  90, 105, 120, 135, 150,
                                                  170, 190, 210, 230, 250,
                                                  270, 290, 310, 330, 350,
                                                  360, 405, 450, 510, 570,
                                                  630, 690, 750, 810, 870,
                                                  930, 990, 1050, 1110, 1170,
                                                  1230, 1290, 1350)), variable.name = "Tempo")
However, the examples above just stack one variable underneath another. What should I do?
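An explicit "id" column seems to be missing; you can supply one by cbind-ing the row numbers. Then you can use base reshape, taking the values for the times argument from the column names. (Note that if you set id as the first column, as I do, you need to add 1 to the varying indices.)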
res <- reshape(cbind(id=1:nrow(dados1), dados1),
varying=5:46,
v.names="massaseca",
timevar="Tempo",
times=as.numeric(gsub("X", "", tail(names(dados1), -3))),
direction="long", sep="")
Result
head(res[order(res$id), ])
# id Teor Fator Anexo Tempo massaseca
# 1.0 1 0 Am c 0 334,68
# 1.10 1 0 Am c 10 344,19
# 1.20 1 0 Am c 20 347,32
# 1.30 1 0 Am c 30 350,2
# 1.40 1 0 Am c 40 352,52
# 1.50 1 0 Am c 50 354,81
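For anyone who wants to try the approach without the original file, here is a minimal sketch on a toy data frame. The data frame `toy` and its values are made up for illustration, and the sketch assumes the time columns are named X0, X10, X20 (the "X" prefix that the gsub call above strips) and hold plain numeric values rather than decimal-comma strings. It selects the varying columns by name instead of by position, so the index does not need adjusting after the id column is added.

```r
# Toy stand-in for dados1 (hypothetical values): three descriptive columns,
# then one column per measurement time, assumed to be named X0, X10, X20.
toy <- data.frame(
  Teor  = c(0, 5),
  Fator = c("AM", "AM"),
  Anexo = c("c", "c"),
  X0    = c(334.68, 300.10),
  X10   = c(344.19, 310.20),
  X20   = c(347.32, 315.30)
)

# Add an explicit id column built from the row numbers.
wide <- cbind(id = seq_len(nrow(toy)), toy)

# Measurement columns are everything after id, Teor, Fator, Anexo.
time_cols <- names(wide)[-(1:4)]

long <- reshape(
  wide,
  varying   = time_cols,                           # columns to stack
  v.names   = "massaseca",                         # name of the stacked value column
  timevar   = "Tempo",                             # name of the time column
  times     = as.numeric(sub("^X", "", time_cols)), # times taken from the column names
  idvar     = "id",
  direction = "long"
)

# Sort so that all times of one individual are together.
long <- long[order(long$id, long$Tempo), ]
rownames(long) <- NULL
print(long)
```

Each original row ends up as one id with one line per time, with Teor, Fator and Anexo repeated on every line.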