Aggregate debit and credit amounts with different dates on a daily basis and group by account
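I have a table containing debit amount, credit amount, debit date, credit date, and account ID. Whenever a row has a debit amount, its credit amount is empty, and vice versa. I need the total debit and total credit per account for each day.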
id | Debit_date | Debit_amount | Credit_date | Credit_amount |
---|---|---|---|---|
1 | 2018-10-21 | 20000 | NA | NA |
1 | NA | NA | 2018-10-21 | 50000 |
2 | 2019-1-2 | 10000 | NA | NA |
2 | 2019-1-3 | 20000 | NA | NA |
4 | NA | NA | 2019-1-4 | 30000 |
1 | 2019-1-5 | 1000 | NA | NA |
I need to get the following output:
id | Trans_date | Total_debit | Total_credit |
---|---|---|---|
1 | 2018-10-21 | 20000 | 50000 |
1 | 2019-1-5 | 1000 | NA |
2 | 2019-1-2 | 30000 | NA |
4 | 2019-1-4 | NA | 30000 |
I tried the following code:
df_db = df %>% group_by(id,debit_date) %>% summarise(total_debit=sum(debit_amount))
df_cr = df %>% group_by(id,credit_date) %>% summarise(total_credit=sum(credit_amount))
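Then I went on to join these two data frames, but the join blew up because I have millions of transactions. Can anyone guide me on how to get the data shown in the output above? Many thanks.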
You can use coalesce to build a single transaction date from the two date columns and group by it:
library(dplyr)

df %>%
  # Each row has exactly one non-missing date, so coalesce() picks it
  group_by(id, Trans_date = coalesce(Debit_date, Credit_date)) %>%
  summarise(Total_debit = sum(Debit_amount, na.rm = TRUE),
            Total_credit = sum(Credit_amount, na.rm = TRUE))
  id Trans_date Total_debit Total_credit
1  1 2018-10-21       20000        50000
2  1   2019-1-5        1000            0
3  2   2019-1-2       30000            0
4  4   2019-1-4           0        30000
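Note that the result above shows 0 where the expected output has NA, because sum(..., na.rm = TRUE) of an all-missing group is 0. If NA is really wanted there, something along these lines might work. This is only a sketch: `sum_or_na` is a helper name made up for this example, the `.groups` argument assumes dplyr 1.0.0 or later, and the conversion with as.Date() is an extra step (not in the answer above) so that dates sort chronologically rather than as text.

```r
library(dplyr)

# Same sample data as in the answer (fourth row's date set to 2019-1-2)
df <- data.frame(
  id            = c(1L, 1L, 2L, 2L, 4L, 1L),
  Debit_date    = c("2018-10-21", NA, "2019-1-2", "2019-1-2", NA, "2019-1-5"),
  Debit_amount  = c(20000L, NA, 10000L, 20000L, NA, 1000L),
  Credit_date   = c(NA, "2018-10-21", NA, NA, "2019-1-4", NA),
  Credit_amount = c(NA, 50000L, NA, NA, 30000L, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns NA when every value in the group is missing,
# otherwise the sum of the non-missing values
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_integer_ else sum(x, na.rm = TRUE)
}

result <- df %>%
  # One transaction date per row, parsed as a Date
  mutate(Trans_date = as.Date(coalesce(Debit_date, Credit_date))) %>%
  group_by(id, Trans_date) %>%
  summarise(
    Total_debit  = sum_or_na(Debit_amount),
    Total_credit = sum_or_na(Credit_amount),
    .groups = "drop"
  )

print(result)
```

With the sample data this should give NA for Total_credit on the debit-only days and NA for Total_debit on the credit-only day, as in the expected output from the question.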
Data (I changed the debit date in the fourth data row from 2019-1-3 to 2019-1-2 to match the expected output)
df <- structure(list(id = c(1L, 1L, 2L, 2L, 4L, 1L), Debit_date = c("2018-10-21",
NA, "2019-1-2", "2019-1-2", NA, "2019-1-5"), Debit_amount = c(20000L,
NA, 10000L, 20000L, NA, 1000L), Credit_date = c(NA, "2018-10-21",
NA, NA, "2019-1-4", NA), Credit_amount = c(NA, 50000L, NA, NA,
30000L, NA)), class = "data.frame", row.names = c(NA, -6L))
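Since the question mentions millions of transactions, the same coalesce-then-group idea could also be tried with the data.table package. This is only a sketch of an alternative, not something benchmarked here: it assumes data.table 1.12.4 or later for fcoalesce(), and the object names `dt` and `result_dt` are made up for this example.

```r
library(data.table)

# Same sample data as above, built directly as a data.table
dt <- data.table(
  id            = c(1L, 1L, 2L, 2L, 4L, 1L),
  Debit_date    = c("2018-10-21", NA, "2019-1-2", "2019-1-2", NA, "2019-1-5"),
  Debit_amount  = c(20000L, NA, 10000L, 20000L, NA, 1000L),
  Credit_date   = c(NA, "2018-10-21", NA, NA, "2019-1-4", NA),
  Credit_amount = c(NA, 50000L, NA, NA, 30000L, NA)
)

# Add the combined transaction date by reference (no copy of the table)
dt[, Trans_date := fcoalesce(Debit_date, Credit_date)]

# One pass over the data: sum both amount columns per account and day
result_dt <- dt[, .(Total_debit  = sum(Debit_amount, na.rm = TRUE),
                    Total_credit = sum(Credit_amount, na.rm = TRUE)),
                keyby = .(id, Trans_date)]

print(result_dt)
```

As with the dplyr version, groups with no debits (or no credits) come out as 0 rather than NA, and the dates are still character strings here, so they sort as text.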