Remove/move rows that have matching values in another data frame
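I'm currently dealing with product refunds after some system errors occurred. I have a huge xlsx list (table 1) containing all refunds that have been pending over several weeks. However, some of the records in that table have already been refunded manually and are stored in a separate file (table 2).
Here is a sample of my dataset:
All refunds table: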
number | ordernumber | Amount | Status |
---|---|---|---|
1 | 123456789 | 150.50 | pending |
2 | 235641458 | 250.30 | pending |
3 | 235984258 | 50.20 | pending |
4 | 283478566 | 102.45 | pending |
Manually refunded table:
number | ordernumber | Amount | Status |
---|---|---|---|
1 | 123456789 | 150.50 | refunded |
2 | 235641458 | 250.30 | refunded |
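What I want to do now is delete the rows in the 'all refunds table' (or, preferably, move them to a separate table) whenever the order number matches an order number in the 'manually refunded table'. Can you help me out?
Thanks!
You could try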
# Recreate the two example tables
all_refund <- read.table(text = "number ordernumber Amount Status
1 123456789 150.50 pending
2 235641458 250.30 pending
3 235984258 50.20 pending
4 283478566 102.45 pending", header = TRUE)
manually <- read.table(text = "number ordernumber Amount Status
1 123456789 150.50 refunded
2 235641458 250.30 refunded", header = TRUE)
# Keep only the rows whose ordernumber does NOT appear in `manually`
all_refund[!(all_refund$ordernumber %in% manually$ordernumber), ]
#>   number ordernumber Amount  Status
#> 3      3   235984258  50.20 pending
#> 4      4   283478566 102.45 pending
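If you also want to keep the removed rows, as the question asks, one possible variation is to pass the same `%in%` test to base R's `split()`, which returns both groups in one step. This is only a sketch; `parts`, `still_pending`, and `already_refunded` are illustrative names that do not appear in the answer above.

```r
# Recreate the two example tables from the answer above
all_refund <- read.table(text = "number ordernumber Amount Status
1 123456789 150.50 pending
2 235641458 250.30 pending
3 235984258 50.20 pending
4 283478566 102.45 pending", header = TRUE)
manually <- read.table(text = "number ordernumber Amount Status
1 123456789 150.50 refunded
2 235641458 250.30 refunded", header = TRUE)

# split() groups the rows by the logical test; the resulting list
# elements are named "FALSE" (no match) and "TRUE" (match)
parts <- split(all_refund, all_refund$ordernumber %in% manually$ordernumber)

still_pending    <- parts[["FALSE"]]  # rows not found in `manually`
already_refunded <- parts[["TRUE"]]   # rows that were refunded manually
```

Note that if nothing matches, `parts[["TRUE"]]` is `NULL` rather than an empty data frame, so plain logical indexing may be easier to handle in that case.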
Using base R, you can subset on the matching order numbers with the following code:
all_refunds <- data.frame(
ordernumber = c(123456789, 235641458, 235984258, 283478566),
amount = c(150.50, 250.30, 50.20, 102.45),
status = rep("pending", 4)
)
manual_refunds <- data.frame(
ordernumber = c(123456789, 235641458),
amount = c(150.50, 250.30)
)
matching <- all_refunds$ordernumber %in% manual_refunds$ordernumber #Find matching ordernumbers.
You can then create a new table containing the matching rows and remove those rows from the all_refunds table, like this:
registered_refunds <- all_refunds[matching, ] #Select only matching from all_refunds
all_refunds <- all_refunds[!matching, ] #Select rows that DO NOT match in all_refunds and reassign the table.
This gives the output:
> all_refunds
  ordernumber amount  status
3   235984258  50.20 pending
4   283478566 102.45 pending
> registered_refunds
  ordernumber amount  status
1   123456789  150.5 pending
2   235641458  250.3 pending
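Since these are refunds, it may be worth checking the split before relying on it. The following is a minimal sketch on the same example data; `n_before` and `unmatched_manual` are illustrative names, not part of the code above.

```r
all_refunds <- data.frame(
  ordernumber = c(123456789, 235641458, 235984258, 283478566),
  amount = c(150.50, 250.30, 50.20, 102.45),
  status = rep("pending", 4)
)
manual_refunds <- data.frame(
  ordernumber = c(123456789, 235641458),
  amount = c(150.50, 250.30)
)

n_before <- nrow(all_refunds)  # row count before anything is removed
matching <- all_refunds$ordernumber %in% manual_refunds$ordernumber
registered_refunds <- all_refunds[matching, ]
all_refunds <- all_refunds[!matching, ]

# Every original row should end up in exactly one of the two tables
stopifnot(nrow(registered_refunds) + nrow(all_refunds) == n_before)

# Manual refunds whose order number never appeared in all_refunds;
# for the example data this is an empty vector
unmatched_manual <- setdiff(manual_refunds$ordernumber,
                            registered_refunds$ordernumber)
```

A non-empty `unmatched_manual` would point to order numbers in the manual file that were never in the pending list, for example because of a typo.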
This is also known as a "filtering join"; see https://dplyr.tidyverse.org/reference/filter-joins.html
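As a rough sketch of what that could look like, assuming the dplyr package is installed: `semi_join()` keeps the rows of the first table that have a match in the second, and `anti_join()` keeps the rows that do not. The table and result names are borrowed from the base R answer purely for illustration.

```r
library(dplyr)

all_refunds <- data.frame(
  ordernumber = c(123456789, 235641458, 235984258, 283478566),
  amount = c(150.50, 250.30, 50.20, 102.45),
  status = rep("pending", 4)
)
manual_refunds <- data.frame(
  ordernumber = c(123456789, 235641458),
  amount = c(150.50, 250.30)
)

# Rows of all_refunds whose ordernumber also occurs in manual_refunds
registered_refunds <- semi_join(all_refunds, manual_refunds, by = "ordernumber")

# Rows of all_refunds with no matching ordernumber in manual_refunds
all_refunds <- anti_join(all_refunds, manual_refunds, by = "ordernumber")
```

Both functions return only columns from the first table, so the `amount` column of `manual_refunds` is not duplicated in the result.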