Wide to long format in R
Here is a snippet of my data set:
> head(df)
Product Region Sector Type Date Value
Product A Capital Primary Continued 2012-01-01 395
Product C Capital Primary Continued 2012-01-01 37
Product D Capital Primary Continued 2012-01-01 208
Product A Central Primary Continued 2012-01-01 343
Product C Central Primary Continued 2012-01-01 1
Product D Central Primary Continued 2012-01-01 80
> tail(df)
Product Region Sector Type Date Value
Product C Southern Unknown New 2014-12-01 11
Product D Southern Unknown New 2014-12-01 18
Product A Zealand Unknown New 2014-12-01 19
Product B Zealand Unknown New 2014-12-01 10
Product C Zealand Unknown New 2014-12-01 9
Product D Zealand Unknown New 2014-12-01 6
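I have 12 dates from 2012-01-01 to 2014-12-01 and several factor variables. I would like to extrapolate this data set, i.e. add some additional random observations after 2014-12-01. My initial idea was to use dcast, e.g.: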
dcast(df, Date ~ Product + Region + Type + Sector)
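in order to get all combinations of the factors. This would produce a data frame with 12 rows (dates) and 118 columns (all combinations of all factors). I could then add some rows to this data frame and convert it back to long format using melt. But this does not seem feasible. Is there another way?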
You can just use rbind - just make sure the variable names are the same:
df <- data.frame(Product = c("Product A", "Product B", "Product C"), Region = c("Capital", "Capital", "Capital"),
Sector = c("Primary", "Primary", "Primary"), Type = c("Continued", "Continued", "Continued"),
Date = c("2012-01-01", "2013-01-01", "2014-12-01"), Value = c(397, 3, 456))
newdata <- data.frame(Product = c("Product A", "Product B", "Product C"), Region = c("Capital", "Capital", "Capital"),
Sector = c("Primary", "Primary", "Primary"), Type = c("Continued", "Continued", "Continued"),
Date = c("2014-12-01", "2014-12-02", "2014-12-03"), Value = c(1, 2, 3))
all(colnames(df) == colnames(newdata))
[1] TRUE
combined <- rbind(df, newdata)
combined
Product Region Sector Type Date Value
1 Product A Capital Primary Continued 2012-01-01 397
2 Product B Capital Primary Continued 2013-01-01 3
3 Product C Capital Primary Continued 2014-12-01 456
4 Product A Capital Primary Continued 2014-12-01 1
5 Product B Capital Primary Continued 2014-12-02 2
6 Product C Capital Primary Continued 2014-12-03 3
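If the new rows should cover every factor combination that already occurs in the data, the rbind approach above can be extended along these lines. This is only a sketch under assumptions that are not in the question: the toy data, the three-month horizon, and the use of rpois to draw the random values are all placeholders to adapt.

```r
# Toy data with the same columns as the question's df (values are made up)
df <- data.frame(
  Product = c("Product A", "Product B", "Product C"),
  Region  = c("Capital", "Capital", "Central"),
  Sector  = c("Primary", "Primary", "Unknown"),
  Type    = c("Continued", "Continued", "New"),
  Date    = as.Date(c("2012-01-01", "2013-01-01", "2014-12-01")),
  Value   = c(397, 3, 456)
)

# Factor combinations that actually occur in the data
combos <- unique(df[, c("Product", "Region", "Sector", "Type")])

# Assumed horizon: three monthly dates after the last observed date
future_dates <- seq(max(df$Date), by = "month", length.out = 4)[-1]

# Pair every observed combination with every future date
idx <- expand.grid(combo = seq_len(nrow(combos)), date = seq_along(future_dates))
newdata <- combos[idx$combo, ]
newdata$Date <- future_dates[idx$date]

# Assumed random model: Poisson counts around the observed mean
set.seed(1)
newdata$Value <- rpois(nrow(newdata), lambda = mean(df$Value))

# Same column names and order as df, then append in long format
newdata <- newdata[, colnames(df)]
combined <- rbind(df, newdata)
rownames(combined) <- NULL
```

Because everything stays in long format, no dcast/melt round trip is needed; only the combinations present in df are extended, rather than the full cross product of all factor levels.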