Format Data Frame with Date Index and Column to Consolidate Row Values into Each Month Category
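I have a data.frame with a date index, which is duplicated in the first column, and many different columns with corresponding rows of data. Because the rows (indexed by date) can vary depending on whether the corresponding data was collected, there is sometimes a row (date) with many blanks and values in only a few columns. I would like to collapse these rows into one row per month and year.
That is, one row may show yesterday's data but not today's data for all columns, because it may not have been collected yet. It just looks messy, and I would rather have it say "June-2020" and collapse the rows to remove the NAs.
Here is the input data.frame: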
structure(list(Date = structure(c(18292, 18320, 18321, 18351,
18352, 18382, 18413, 18427, 18428), tzone = "UTC", tclass = "Date", class = "Date"),
`M-o-M Change in Median Rent - AMH GA` = c(0, NA, 0, NA,
0, 0, 0, NA, 0), `Median Advertised Rent - AMH GA` = c(1695,
NA, 1695, NA, 1695, 1695, 1695, NA, 1695), `Median of Daily Rental Listings Available - AMH GA` = c(438,
NA, 430.5, NA, 450, 458, 385, NA, 331), `M-o-M Change in Median Rent - AMH Charlotte` = c(0,
NA, 0, NA, 0, 0.0272727272727273, 0, NA, 0.0324483775811208
), `Median Advertised Rent - AMH Charlotte` = c(1650, NA,
1650, NA, 1650, 1695, 1695, NA, 1750), `Median of Daily Rental Listings Available - AMH Charlotte` = c(244,
NA, 257, NA, 256, 270, 227, NA, 220), `M-o-M Change in Median Rent - AMH Dallas` = c(0,
NA, 0, NA, 0, 0, 0.0306406685236769, NA, 0), `Median Advertised Rent - AMH Dallas` = c(1795,
NA, 1795, NA, 1795, 1795, 1850, NA, 1850), `Median of Daily Rental Listings Available - AMH Dallas` = c(148,
NA, 150, NA, 166, 152.5, 131, NA, 135), `M-o-M Change in Median Rent - AMH Houston` = c(0,
NA, 0.0272727272727273, NA, 0, 0, 0, NA, 0), `Median Advertised Rent - AMH Houston` = c(1650,
NA, 1695, NA, 1695, 1695, 1695, NA, 1695), `Median of Daily Rental Listings Available - AMH Houston` = c(222,
NA, 223, NA, 228, 237.5, 203, NA, 189), `M-o-M Change in Median Rent - AMH Jacksonville` = c(0,
0.00681818181818183, NA, NA, -0.00677200902934538, 0, 0.0272727272727273,
NA, 0), `Median Advertised Rent - AMH Jacksonville` = c(1650,
1661.25, NA, NA, 1650, 1650, 1695, NA, 1695), `Median of Daily Rental Listings Available - AMH Jacksonville` = c(164,
179.5, NA, NA, 188, 195.5, 185, NA, 174), `M-o-M Change in Median Rent - AMH NC` = c(0,
NA, 0.0344827586206897, NA, 0, 0, 0.0272727272727273, NA,
0), `Median Advertised Rent - AMH NC` = c(1595, NA, 1650,
NA, 1650, 1650, 1695, NA, 1695), `Median of Daily Rental Listings Available - AMH NC` = c(365,
NA, 387, NA, 405, 447, 344, NA, 323), `M-o-M Change in Median Rent - AMH NV` = c(0,
NA, 0, 0, NA, -0.0265486725663717, 0, NA, 0.0272727272727273
), `Median Advertised Rent - AMH NV` = c(1695, NA, 1695,
1695, NA, 1650, 1650, NA, 1695), `Median of Daily Rental Listings Available - AMH NV` = c(62,
NA, 59, 70, NA, 71, 63, NA, 58), `M-o-M Change in Median Rent - AMH Orlando` = c(0,
0.0112676056338028, NA, NA, 0, 0, 0, NA, 0.0306406685236769
), `Median Advertised Rent - AMH Orlando` = c(1775, 1795,
NA, NA, 1795, 1795, 1795, NA, 1850), `Median of Daily Rental Listings Available - AMH Orlando` = c(82,
91.5, NA, NA, 106, 119, 117, NA, 105), `M-o-M Change in Median Rent - AMH Phoenix` = c(0,
NA, 0.0216172938350681, NA, 0.0258620689655173, -0.0252100840336135,
0.0344827586206897, 0, NA), `Median Advertised Rent - AMH Phoenix` = c(1561.25,
NA, 1595, NA, 1636.25, 1595, 1650, 1650, NA), `Median of Daily Rental Listings Available - AMH Phoenix` = c(130,
NA, 127, NA, 129, 131, 97, 85, NA), `M-o-M Change in Median Rent - AMH Raleigh` = c(0,
NA, 0.0290322580645161, NA, 0, 0, 0, NA, 0.0344827586206897
), `Median Advertised Rent - AMH Raleigh` = c(1550, NA, 1595,
NA, 1595, 1595, 1595, NA, 1650), `Median of Daily Rental Listings Available - AMH Raleigh` = c(91,
NA, 104, NA, 114, 142, 90, NA, 81), `M-o-M Change in Median Rent - AMH SoFla` = c(0,
-0.00869565217391299, NA, NA, 0.0233918128654971, -0.0314285714285715,
0.0162241887905605, NA, 0.0159651669085632), `Median Advertised Rent - AMH SoFla` = c(1725,
1710, NA, NA, 1750, 1695, 1722.5, NA, 1750), `Median of Daily Rental Listings Available - AMH SoFla` = c(11,
14, NA, NA, 11, 10, 7, NA, 7), `M-o-M Change in Median Rent - AMH Winston-Salem/Greensboro` = c(0,
NA, 0.0290322580645161, NA, 0, 0, 0, NA, 0), `Median Advertised Rent - AMH Winston-Salem/Greensboro` = c(1550,
NA, 1595, NA, 1595, 1595, 1595, NA, 1595), `Median of Daily Rental Listings Available - AMH Winston-Salem/Greensboro` = c(66,
NA, 68, NA, 72, 73, 71, NA, 63)), class = "data.frame", row.names = c("2020-01-31",
"2020-02-28", "2020-02-29", "2020-03-30", "2020-03-31", "2020-04-30",
"2020-05-31", "2020-06-14", "2020-06-15"))
Here are examples of what I have tried:
test <- AMH_final_Monthly3 %>% mutate(month= month(Date), year=year(Date))
test2 <- AMH_final_Monthly3 %>%
collapse_by("monthly") %>%
dplyr::group_by(Date, add = TRUE)
test3 <- as.yearmon(AMH_final_Monthly3)
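Thanks for the help!
Something like the following? My locale is Dutch, hence the odd month names. I use the yearmonth function from tsibble, drop the Date column, group by the yearmonth variable, and summarise all variables with sum. You could also use first, last, or any other function you want.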
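library(dplyr)

AMH_final_Monthly3 %>%
  # derive a year-month key from each date
  mutate(yearmonth = tsibble::yearmonth(Date)) %>%
  select(-Date) %>%
  group_by(yearmonth) %>%
  # sum each column within a month, ignoring NAs
  summarise_all(.funs = "sum", na.rm = TRUE)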
# A tibble: 6 x 37
yearmonth `M-o-M Change i~ `Median Adverti~ `Median of Dail~ `M-o-M Change i~ `Median Adverti~ `Median of Dail~ `M-o-M Change i~
<mth> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2020 jan 0 1695 438 0 1650 244 0
2 2020 feb 0 1695 430. 0 1650 257 0
3 2020 mrt 0 1695 450 0 1650 256 0
4 2020 apr 0 1695 458 0.0273 1695 270 0
5 2020 mei 0 1695 385 0 1695 227 0.0306
6 2020 jun 0 1695 331 0.0324 1750 220 0
# ... with 29 more variables:
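As a variation on the above, here is a minimal sketch that keeps the first non-missing value per month instead of a sum. With sum and na.rm = TRUE, a column that has two observed values in the same month gets them added together, and a column with no values in a month shows 0 rather than NA. The toy data frame `df`, its column names, and the helper `first_non_na` are made up for illustration and are not part of the original data. `format(Date, "%Y-%m")` is used here only to get a month key that does not depend on the locale, and `across()` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical stand-in for AMH_final_Monthly3: two partial rows in February,
# one complete row in March
df <- data.frame(
  Date   = as.Date(c("2020-02-28", "2020-02-29", "2020-03-31")),
  rent_a = c(NA, 1695, 1695),
  rent_b = c(1661.25, NA, 1650)
)

# Hypothetical helper: first non-NA value of a vector, or NA if there is none
first_non_na <- function(x) x[!is.na(x)][1]

monthly <- df %>%
  # locale-independent year-month key, e.g. "2020-02"
  mutate(yearmonth = format(Date, "%Y-%m")) %>%
  select(-Date) %>%
  group_by(yearmonth) %>%
  # collapse each month to one row, taking the first observed value per column
  summarise(across(everything(), first_non_na))

print(monthly)
```

If a column really can have several values in one month, you would need to decide which one should win (first, last, mean, and so on) and swap the helper accordingly.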