Convert list of overlapping data.frames into single data.frame
I have population data from several cohorts stored in a list. The cohorts cover overlapping time periods. The data looks like this:
> raw.data
$`1`
Year Pop
1 1920 1927433
2 1921 1914551
3 1922 1900776
$`2`
Year Pop
1 1921 1915576
2 1922 1902075
3 1923 1887613
$`3`
Year Pop
1 1922 1902111
2 1923 1887862
3 1924 1872695
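I would like to convert this into a single data frame in which the column names are the years and the population values run along the diagonals. The output should look like this: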
> resulting.data
1920 1921 1922 1923 1924
1 1927433 1915576 1902111 NA NA
2 NA 1914551 1902075 1887862 NA
3 NA NA 1900776 1887613 1872695
Sample data for the input and the desired output is given below:
raw.data <- structure(list(`1` = structure(list(Year = 1920:1922, Pop = c(1927433L, 1914551L, 1900776L)), .Names = c("Year", "Pop"), row.names = c(NA, 3L), class = "data.frame"), `2` = structure(list(Year = 1921:1923, Pop = c(1915576L, 1902075L, 1887613L)), .Names = c("Year", "Pop"), row.names = c(NA, 3L), class = "data.frame"), `3` = structure(list(Year = 1922:1924, Pop = c(1902111L, 1887862L, 1872695L)), .Names = c("Year", "Pop"), row.names = c(NA, 3L), class = "data.frame")), .Names = c("1", "2", "3"))
resulting.data <- structure(list(X1920 = c(1927433, NA, NA), X1921 = c(1915576, 1914551, NA), X1922 = c(1902111, 1902075, 1900776), X1923 = c(NA, 1887862, 1887613), X1924 = c(NA, NA, 1872695)), .Names = c("X1920", "X1921", "X1922", "X1923", "X1924"), row.names = c(NA, -3L), class = "data.frame")
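I looked at a related question that poses a similar problem, but I could not adapt it to my needs. I also tried using plyr to extract the diagonals first and then combine them, but I am not sure how to do the combining.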
Use do.call() and rbind() to combine the data into a single data frame, then reshape it with reshape2::dcast():
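dat <- do.call(rbind, raw.data)
# Row names look like "1.1", "1.2", ...; drop the list-name prefix to keep the within-cohort row index
dat$obs <- gsub(".*?\\.", "", row.names(dat))
library(reshape2)
# One row per observation index, one column per year
dcast(dat, obs ~ Year, fun.aggregate = sum, value.var = "Pop")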
obs 1920 1921 1922 1923 1924
1 1 1927433 1915576 1902111 0 0
2 2 0 1914551 1902075 1887862 0
3 3 0 0 1900776 1887613 1872695
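Note that the output above shows 0 where the question asks for NA, because sum() applied to an empty cell returns 0. If NA is needed, one option is to fill a matrix by index in base R. The following is only a minimal sketch under the assumption that each observation index and year pair occurs at most once; the names `years` and `res` are introduced here for illustration and are not part of the original answer.

```r
# Sample input from the question, rebuilt with data.frame() for brevity
raw.data <- list(
  `1` = data.frame(Year = 1920:1922, Pop = c(1927433L, 1914551L, 1900776L)),
  `2` = data.frame(Year = 1921:1923, Pop = c(1915576L, 1902075L, 1887613L)),
  `3` = data.frame(Year = 1922:1924, Pop = c(1902111L, 1887862L, 1872695L))
)

# Stack the cohorts into one long data frame
dat <- do.call(rbind, raw.data)

# Within-cohort row index (1, 2, 3 for each cohort), computed directly
# instead of being parsed out of the row names
dat$obs <- unlist(lapply(raw.data, function(d) seq_len(nrow(d))))

# Empty result: one row per observation index, one column per year, all NA
years <- sort(unique(dat$Year))
res <- matrix(NA_integer_,
              nrow = max(dat$obs),
              ncol = length(years),
              dimnames = list(NULL, as.character(years)))

# Fill only the (row, column) positions that actually have data
res[cbind(dat$obs, match(dat$Year, years))] <- dat$Pop

res <- as.data.frame(res)
res
```

Cells with no matching cohort and year are never assigned, so they stay NA rather than becoming 0.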