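How do I reshape the following data frame in R?
I have a dataset like the one below and am trying to write R code to transform it. It is ego-network data: the first column (Ego) holds two people, and columns A1, A2, and A3 list each ego's ties (the alters). Columns 5 through 10 then hold the ties among the people named in A1, A2, and A3: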
d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane","Sam","Jane","Sally","NA","John","Jim","NA","Jane","Sam","NA","NA","Tom"),2,10))
names(d)<-c("Ego","A1","A2","A3","A1Connection1","A1Connection2","A2Connection1","A2Connection2","A3Connection1","A3Connection2")
d
My challenge is to get columns 2 through 10 to look like this:
ReshapedData<-data.frame(matrix(c("John","John","Sam","Sam","Sally","Sally","Jim","Jim","Tom","Tom","Jane","Jane",
"Sam","Sally","John","NA","Sam","NA","Jane","NA","Jim","Jane","NA","Tom"),12,2))
names(ReshapedData)<-c("Alter", "Alter_Alter")
ReshapedData
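I don't need the ego names, at least not at this stage; the key is to get the rest first. The best I have come up with so far is to transpose columns 5-10 in each row, rbind the results into one long column, and then cbind that with the list of alters from A1, A2, and A3. There must be a more streamlined way to do this.
Thanks,
Bogdan
Use the melt() function from the reshape package to stack the columns, then match items that share a common index: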
d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane","Sam","Jane","Sally","NA","John","Jim","NA","Jane","Sam","NA","NA","Tom"),2,10))
names(d)<-c("Ego","A1","A2","A3","A1Connection1","A1Connection2","A2Connection1","A2Connection2","A3Connection1","A3Connection2")
d
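library(reshape)
# Stack the alter columns; each alter gets a key such as "A1 1" (column name + ego row)
a <- melt(d, id.vars = NULL, measure.vars = c("A1", "A2", "A3"))
a$match <- paste(a[, 1], rep(1:2))
# Stack the connection columns (5 through the last column of d)
b <- melt(d, id.vars = NULL, measure.vars = 5:ncol(d))
# Reduce e.g. "A1Connection2" to "A1", then append the ego row to build the same key
b$match <- paste(gsub(pattern = ".*A([0-9]+).*", replacement = "A\\1", x = b[, 1]),
                 rep(1:2))
df.final <- data.frame(Alter = a$value[match(b$match, a$match)], Alter_Alter = b$value)
# Reorder so that all rows for the first ego come before those for the second
index <- matrix(1:nrow(df.final), nrow = nrow(df.final) / 2, byrow = TRUE)
df.final <- df.final[as.vector(index), ]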
df.final
   Alter Alter_Alter
1   John         Sam
3   John       Sally
5    Sam        John
7    Sam          NA
9  Sally         Sam
11 Sally          NA
2    Jim        Jane
4    Jim          NA
6    Tom         Jim
8    Tom        Jane
10  Jane          NA
12  Jane         Tom
# Test
ReshapedData<-data.frame(matrix(c("John","John","Sam","Sam","Sally","Sally","Jim","Jim","Tom","Tom","Jane","Jane",
"Sam","Sally","John","NA","Sam","NA","Jane","NA","Jim","Jane","NA","Tom"),12,2))
names(ReshapedData)<-c("Alter", "Alter_Alter")
df.final==ReshapedData
   Alter Alter_Alter
1   TRUE        TRUE
3   TRUE        TRUE
5   TRUE        TRUE
7   TRUE        TRUE
9   TRUE        TRUE
11  TRUE        TRUE
2   TRUE        TRUE
4   TRUE        TRUE
6   TRUE        TRUE
8   TRUE        TRUE
10  TRUE        TRUE
12  TRUE        TRUE
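For comparison, the same reshaping can probably be done in base R without the reshape package. The following is only a sketch under some assumptions: the connection columns are ordered exactly as in the example (A1Connection1, A1Connection2, A2Connection1, ...), every alter has the same number of connection columns, and the helper names alter_cols, n_conn, pieces, and res are made up for illustration.

```r
# Same toy data as in the question (the string "NA" is kept as plain text).
d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane",
                         "Sam","Jane","Sally","NA","John","Jim","NA","Jane",
                         "Sam","NA","NA","Tom"), 2, 10),
                stringsAsFactors = FALSE)
names(d) <- c("Ego","A1","A2","A3","A1Connection1","A1Connection2",
              "A2Connection1","A2Connection2","A3Connection1","A3Connection2")

alter_cols <- c("A1", "A2", "A3")
n_conn <- 2  # assumption: two connection columns per alter

# For each ego row, repeat every alter n_conn times and pair it with
# that alter's connection columns, read left to right.
pieces <- lapply(seq_len(nrow(d)), function(i) {
  data.frame(
    Alter       = rep(unlist(d[i, alter_cols], use.names = FALSE), each = n_conn),
    Alter_Alter = unlist(d[i, 5:ncol(d)], use.names = FALSE),
    stringsAsFactors = FALSE
  )
})

# Stack the per-ego pieces and reset the row names.
res <- do.call(rbind, pieces)
rownames(res) <- NULL
res
```

Because the pieces are built one ego at a time, the rows already come out grouped by ego, so no reordering step is needed afterwards. If the connection columns were in a different order, the pairing would be wrong, so it is worth checking the column names first.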