Trying to make data wide 4 columns at once
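I am trying to make my data wider than it currently is. I tried using spread, but I would like to spread 4 variables at once. A sample data set is: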
df <- data.frame(Year = c("2017","2018"),
ID = c(1,1),
Score = c("21","32"),
Score2 = c("24","20"),
Score3 = c("33", "26"),
Score4 = c("25","32"))
Year ID Score Score2 Score3 Score4
1 2017 1 21 24 33 25
2 2018 1 32 20 26 32
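I would like to make it wide, so that all scores for both years are on 1 row, like this:
Year Score Score2 Score3 Score4 Year2 Score18 Score218 Score318 Score418
1 2017 21 24 33 25 2018 32 20 26 32
The "Year2" column is not strictly necessary, but I would like some way to tell the 2017 values apart from the 2018 values.
Any help or guidance would be appreciated. Thanks!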
We can use dcast from data.table:
library(data.table)
dcast(setDT(df), ID ~ rowid(ID), value.var = setdiff(names(df), 'ID'),
sep="")[, ID := NULL][]
# Year1 Year2 Score1 Score2 Score21 Score22 Score31 Score32 Score41 Score42
#1: 2017 2018 21 32 24 20 33 26 25 32
Or reshape from base R:
reshape(transform(df, tvar = seq_len(nrow(df))),
idvar = 'ID', direction = 'wide', timevar = 'tvar')[-1]
# Year.1 Score.1 Score2.1 Score3.1 Score4.1 Year.2 Score.2 Score2.2 Score3.2 Score4.2
#1 2017 21 24 33 25 2018 32 20 26 32
Data:
df <- data.frame(Year = c(2017, 2018),
ID = c(1,1),
Score = c(21,32),
Score2 = c(24,20),
Score3 = c(33, 26),
Score4= c(25, 32))
Another approach could be:
library(tidyverse)
library(splitstackshape)
df %>%
group_by(ID) %>%
summarise_all(toString) %>%
cSplit(names(.)[-1], ",")
The output is:
ID Year_1 Year_2 Score_1 Score_2 Score2_1 Score2_2 Score3_1 Score3_2 Score4_1 Score4_2
1: 1 2017 2018 21 32 24 20 33 26 25 32
Sample data:
df <- data.frame(Year = c("2017","2018"),
ID = c(1,1),
Score = c("21","32"),
Score2 = c("24","20"),
Score3 = c("33", "26"),
Score4 = c("25","32"))
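Since the question started from spread, here is a minimal sketch of the same reshape using pivot_wider. It is not taken from the answers above, and it assumes tidyr 1.0.0 or later is installed. Using Year itself as the source of the name suffix means each new column carries its year, which is one way to tell 2017 from 2018. The name wide is only an illustrative choice.

```r
library(tidyr)  # assumption: tidyr >= 1.0.0, which provides pivot_wider()

# Same shape as the sample data in the question
df <- data.frame(Year = c(2017, 2018),
                 ID = c(1, 1),
                 Score = c(21, 32),
                 Score2 = c(24, 20),
                 Score3 = c(33, 26),
                 Score4 = c(25, 32))

# One row per ID; every score column is spread across the years,
# and each new column is named <score column>_<year>
wide <- pivot_wider(df,
                    id_cols = ID,
                    names_from = Year,
                    values_from = c(Score, Score2, Score3, Score4),
                    names_sep = "_")

print(names(wide))
```

This only yields one row per ID if each ID has at most one row per year. With repeated ID and Year combinations, pivot_wider returns list-columns and emits a warning instead.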