How to conditionally insert rows into a data frame with R?
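I'm trying to conditionally insert rows depending on whether a mutated column (`Day`), derived from `Sys.Date()`, is `Tue`. If it is, I want to insert rows for the two days before the date listed in `MaxDate`. If the `Day` column isn't `Tue`, I just want to leave the data frame as is. I don't think you can use `if_else()` on a data frame, and I'm not sure how to go about it. Maybe use `add_row()` somehow?
Here is what I have: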
ID | Product | MaxDate | Day |
---|---|---|---|
100 | candy | 2022-01-18 | Tue |
100 | chips | 2022-01-18 | Tue |
101 | candy | 2022-01-18 | Tue |
101 | chips | 2022-01-18 | Tue |
102 | candy | 2022-01-18 | Tue |
103 | candy | 2022-01-13 | Tue |
103 | chips | 2022-01-13 | Tue |
If it is Tuesday, this is what I want:
ID | Product | MaxDate | Day |
---|---|---|---|
100 | candy | 2022-01-16 | Tue |
100 | chips | 2022-01-16 | Tue |
100 | candy | 2022-01-17 | Tue |
100 | chips | 2022-01-17 | Tue |
100 | candy | 2022-01-18 | Tue |
100 | chips | 2022-01-18 | Tue |
101 | candy | 2022-01-16 | Tue |
101 | chips | 2022-01-16 | Tue |
101 | candy | 2022-01-17 | Tue |
101 | chips | 2022-01-17 | Tue |
101 | candy | 2022-01-18 | Tue |
101 | chips | 2022-01-18 | Tue |
102 | candy | 2022-01-16 | Tue |
102 | candy | 2022-01-17 | Tue |
102 | candy | 2022-01-18 | Tue |
103 | candy | 2022-01-16 | Tue |
103 | chips | 2022-01-16 | Tue |
103 | candy | 2022-01-17 | Tue |
103 | chips | 2022-01-17 | Tue |
103 | candy | 2022-01-13 | Tue |
103 | chips | 2022-01-13 | Tue |
If it's not `Tue`, I want the data frame left unchanged:
ID | Product | MaxDate | Day |
---|---|---|---|
100 | candy | 2022-01-17 | Mon |
100 | chips | 2022-01-17 | Mon |
101 | candy | 2022-01-17 | Mon |
101 | chips | 2022-01-17 | Mon |
102 | candy | 2022-01-17 | Mon |
103 | candy | 2022-01-13 | Mon |
103 | chips | 2022-01-13 | Mon |
Thanks.
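One reading of the question is that the check applies to today's date as a whole, not to each row. A plain `if` statement covers that case, because `if_else()` is vectorised over values and does not operate on a whole data frame. The following is a minimal base R sketch of that reading, not a definitive solution: the helper name `expand_if_tuesday()` and its `today` and `n_back` arguments are hypothetical, and it tests the weekday with `as.POSIXlt()$wday` (Tuesday is 2) so that the result does not depend on the locale's day abbreviations.

```r
# A few rows shaped like the table above. Day is left out because the
# check below uses `today` directly.
df <- data.frame(
  ID      = c(100, 100, 101),
  Product = c("candy", "chips", "candy"),
  MaxDate = as.Date(c("2022-01-18", "2022-01-18", "2022-01-18"))
)

# Hypothetical helper (not from the original post): if `today` is a Tuesday,
# add copies of every row with MaxDate moved back 1..n_back days;
# otherwise return the data frame unchanged.
expand_if_tuesday <- function(df, today = Sys.Date(), n_back = 2) {
  # as.POSIXlt()$wday counts from Sunday = 0, so Tuesday is 2.
  if (as.POSIXlt(today)$wday != 2) {
    return(df)
  }
  # One copy of the data per offset: n_back days back, ..., 0 days back.
  copies <- lapply(n_back:0, function(k) {
    shifted <- df
    shifted$MaxDate <- shifted$MaxDate - k
    shifted
  })
  out <- do.call(rbind, copies)
  out <- out[order(out$ID, out$MaxDate, out$Product), ]
  rownames(out) <- NULL
  out
}

on_tuesday <- expand_if_tuesday(df, today = as.Date("2022-01-18"))
on_monday  <- expand_if_tuesday(df, today = as.Date("2022-01-17"))
```

Passing `today` explicitly makes the function easy to check on any day of the week; in normal use the default `Sys.Date()` applies.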
There are probably more elegant ways if you need to generalize this, but this approach is quick and gets the job done:
bind_rows(
  df,
  df %>% filter(Day == "Tue") %>% mutate(MaxDate = MaxDate - 1),
  df %>% filter(Day == "Tue") %>% mutate(MaxDate = MaxDate - 2)
) %>%
  arrange(ID, MaxDate, Product)
#     ID Product    MaxDate Day
# 1  100   candy 2022-01-16 Tue
# 2  100   chips 2022-01-16 Tue
# 3  100   candy 2022-01-17 Tue
# 4  100   chips 2022-01-17 Tue
# 5  100   candy 2022-01-18 Tue
# 6  100   chips 2022-01-18 Tue
# 7  101   candy 2022-01-16 Tue
# 8  101   chips 2022-01-16 Tue
# 9  101   candy 2022-01-17 Tue
# 10 101   chips 2022-01-17 Tue
# 11 101   candy 2022-01-18 Tue
# 12 101   chips 2022-01-18 Tue
# 13 102   candy 2022-01-16 Tue
# 14 102   candy 2022-01-17 Tue
# 15 102   candy 2022-01-18 Tue
# 16 103   candy 2022-01-11 Tue
# 17 103   chips 2022-01-11 Tue
# 18 103   candy 2022-01-12 Tue
# 19 103   chips 2022-01-12 Tue
# 20 103   candy 2022-01-13 Tue
# 21 103   chips 2022-01-13 Tue
Using this reproducible data:
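# dplyr provides %>%, mutate(), filter(), bind_rows() and arrange()
library(dplyr, warn.conflicts = FALSE)
df = read.table(text = 'ID Product MaxDate Day
100 candy 2022-01-18 Tue
100 chips 2022-01-18 Tue
101 candy 2022-01-18 Tue
101 chips 2022-01-18 Tue
102 candy 2022-01-18 Tue
103 candy 2022-01-13 Tue
103 chips 2022-01-13 Tue', header = TRUE) %>%
  # read.table() does not parse dates, so convert MaxDate to Date
  mutate(MaxDate = as.Date(MaxDate))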
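If the number of prior days or the target weekday needs to vary, the repeated `filter()`/`mutate()` calls above can be produced in a loop. The following is only a sketch of one way to generalize it: the helper name `add_prior_days()` and its `target_day` and `n_back` arguments are hypothetical and not part of the original answer.

```r
library(dplyr, warn.conflicts = FALSE)

# Small illustrative data set with one non-Tuesday row.
df <- data.frame(
  ID      = c(100, 100, 103),
  Product = c("candy", "chips", "candy"),
  MaxDate = as.Date(c("2022-01-18", "2022-01-18", "2022-01-13")),
  Day     = c("Tue", "Tue", "Wed")
)

# Hypothetical helper: for rows whose Day equals `target_day`, add copies
# with MaxDate shifted back by 1..n_back days. Other rows are kept once.
add_prior_days <- function(df, target_day = "Tue", n_back = 2) {
  shifted <- lapply(seq_len(n_back), function(k) {
    df %>%
      filter(Day == target_day) %>%
      mutate(MaxDate = MaxDate - k)
  })
  # bind_rows() accepts a list of data frames: the original plus each shift.
  bind_rows(c(list(df), shifted)) %>%
    arrange(ID, MaxDate, Product)
}

result <- add_prior_days(df, target_day = "Tue", n_back = 3)
```

With `n_back = 2` and `target_day = "Tue"` this should produce the same rows as the three-way `bind_rows()` call above.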
library(dplyr, warn.conflicts = FALSE)
df = read.table(text = 'ID Product MaxDate Day
100 candy 2022-01-18 Tue
100 chips 2022-01-18 Tue
101 candy 2022-01-18 Tue
101 chips 2022-01-18 Tue
102 candy 2022-01-18 Tue
103 candy 2022-01-13 Wed
103 chips 2022-01-13 Tue', header = TRUE) %>%
  mutate(MaxDate = as.Date(MaxDate))
df %>%
  left_join(tibble(Day = 'Tue', lagged_days = 2:0)) %>%
  mutate(MaxDate = MaxDate - coalesce(lagged_days, 0),
         lagged_days = NULL)
#> Joining, by = "Day"
#>     ID Product    MaxDate Day
#> 1  100   candy 2022-01-16 Tue
#> 2  100   candy 2022-01-17 Tue
#> 3  100   candy 2022-01-18 Tue
#> 4  100   chips 2022-01-16 Tue
#> 5  100   chips 2022-01-17 Tue
#> 6  100   chips 2022-01-18 Tue
#> 7  101   candy 2022-01-16 Tue
#> 8  101   candy 2022-01-17 Tue
#> 9  101   candy 2022-01-18 Tue
#> 10 101   chips 2022-01-16 Tue
#> 11 101   chips 2022-01-17 Tue
#> 12 101   chips 2022-01-18 Tue
#> 13 102   candy 2022-01-16 Tue
#> 14 102   candy 2022-01-17 Tue
#> 15 102   candy 2022-01-18 Tue
#> 16 103   candy 2022-01-13 Wed
#> 17 103   chips 2022-01-11 Tue
#> 18 103   chips 2022-01-12 Tue
#> 19 103   chips 2022-01-13 Tue
Created on 2022-01-18 by the reprex package (v2.0.1)
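The join works because every `Tue` row matches three rows of the small lookup table (offsets 2, 1 and 0), while rows with any other `Day` match nothing and are kept once with a missing offset. The same idea can be sketched in base R with `merge()`. This is an illustrative translation, not part of the original answer, and the name `lags` for the lookup table is an assumption.

```r
# Same sample data as above, built without dplyr.
df <- read.table(text = 'ID Product MaxDate Day
100 candy 2022-01-18 Tue
100 chips 2022-01-18 Tue
101 candy 2022-01-18 Tue
101 chips 2022-01-18 Tue
102 candy 2022-01-18 Tue
103 candy 2022-01-13 Wed
103 chips 2022-01-13 Tue', header = TRUE, stringsAsFactors = FALSE)
df$MaxDate <- as.Date(df$MaxDate)

# Lookup table: each Tuesday row should appear with offsets 2, 1 and 0.
lags <- data.frame(Day = "Tue", lagged_days = 2:0)

# all.x = TRUE keeps non-Tuesday rows, which get NA for lagged_days.
out <- merge(df, lags, by = "Day", all.x = TRUE)

# Treat a missing offset as "no shift", then move the dates back.
out$lagged_days[is.na(out$lagged_days)] <- 0
out$MaxDate <- out$MaxDate - out$lagged_days

# merge() reorders rows and columns, so restore a stable layout.
out <- out[order(out$ID, out$Product, out$MaxDate),
           c("ID", "Product", "MaxDate", "Day")]
rownames(out) <- NULL
```

Note that `merge()` sorts by the key column, which is why the explicit `order()` call is needed to get a readable row order back.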