dates: Not yet implemented NAbounds=TRUE for this non-numeric and non-character type
I have this data frame:
df1 <- structure(list(ID = c(1, 2, 2, 2, 3, 4, 5, 6, 6, 7, 8, 8, 9,
10), dateA = structure(c(14974, 18628, 18628, 18628, 14882, 16800,
14882, 17835, 17835, 16832, 16556, 16556, 15949, 16801), class = "Date"),
dateB = structure(c(14610, 15340, 15706, 17501, 14730, NA,
14700, 16191, 17106, 16801, 15810, 16436, 14655, 15431), class = "Date"),
dateC = structure(c(18628, 15705, 17500, 18628, 18628, NA,
18628, 17105, 18628, 18628, 16435, 16556, 15706, 18628), class = "Date")), row.names = c(NA,
-14L), class = c("data.table", "data.frame"))
ID dateA dateB dateC
1: 1 2010-12-31 2010-01-01 2021-01-01
2: 2 2021-01-01 2012-01-01 2012-12-31
3: 2 2021-01-01 2013-01-01 2017-11-30
4: 2 2021-01-01 2017-12-01 2021-01-01
5: 3 2010-09-30 2010-05-01 2021-01-01
6: 4 2015-12-31 <NA> <NA>
7: 5 2010-09-30 2010-04-01 2021-01-01
8: 6 2018-10-31 2014-05-01 2016-10-31
9: 6 2018-10-31 2016-11-01 2021-01-01
10: 7 2016-02-01 2016-01-01 2021-01-01
11: 8 2015-05-01 2013-04-15 2014-12-31
12: 8 2015-05-01 2015-01-01 2015-05-01
13: 9 2013-09-01 2010-02-15 2013-01-01
14: 10 2016-01-01 2012-04-01 2021-01-01
I want to check whether dateA falls within the interval between dateB and dateC.
My code:
library(dplyr)
df1 %>%
mutate(match= ifelse(between(dateA, dateB, dateC), 1, 0))
This gives:
Error: Problem with `mutate()` column `match`.
i `match = ifelse(between(dateA, dateB, dateC), 1, 0)`.
x Not yet implemented NAbounds=TRUE for this non-numeric and non-character type
If I remove the row containing NA, the code works:
df1 %>%
slice(-6) %>%
mutate(match= ifelse(between(dateA, dateB, dateC), 1, 0))
Is there a way to keep the NA row and still run my code?
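It is unclear which between the OP is calling, because the input object is a data.table while the code is written with dplyr. If both packages are loaded, each provides its own between function, and the one from the package loaded last masks the other. In the dplyr version used here, dplyr::between is not fully vectorized, as documented in ?dplyr::between: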
left, right Boundary values (must be scalars).
Applying it row by row with rowwise works around that:
df1 %>%
rowwise %>%
mutate(match = +(dplyr::between(dateA, dateB, dateC))) %>%
ungroup
Output:
# A tibble: 14 × 5
ID dateA dateB dateC match
<dbl> <date> <date> <date> <int>
1 1 2010-12-31 2010-01-01 2021-01-01 1
2 2 2021-01-01 2012-01-01 2012-12-31 0
3 2 2021-01-01 2013-01-01 2017-11-30 0
4 2 2021-01-01 2017-12-01 2021-01-01 1
5 3 2010-09-30 2010-05-01 2021-01-01 1
6 4 2015-12-31 NA NA NA
7 5 2010-09-30 2010-04-01 2021-01-01 1
8 6 2018-10-31 2014-05-01 2016-10-31 0
9 6 2018-10-31 2016-11-01 2021-01-01 1
10 7 2016-02-01 2016-01-01 2021-01-01 1
11 8 2015-05-01 2013-04-15 2014-12-31 0
12 8 2015-05-01 2015-01-01 2015-05-01 1
13 9 2013-09-01 2010-02-15 2013-01-01 0
14 10 2016-01-01 2012-04-01 2021-01-01 1
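To confirm which between an unqualified call resolves to in a given session, one can look at the namespace the function comes from. This is only a minimal sketch: it assumes both dplyr and data.table are installed, and the load order shown is an illustration, not necessarily the OP's.

```r
# Assumption: dplyr and data.table are both installed.
suppressPackageStartupMessages({
  library(dplyr)
  library(data.table)  # attached last, so its between() masks dplyr's
})

# Namespace of the function that an unqualified between() call resolves to
between_source <- environmentName(environment(between))
print(between_source)

# Attached packages that provide a between(), in search-path order
print(find("between"))
```

Writing dplyr::between or data.table::between explicitly, as the rest of this answer does, avoids depending on the load order at all.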
That restriction does not apply to data.table::between (judging by the error in the OP's post, the between being called is the one from data.table). From ?data.table::between:
lower - Lower range bound. Either length 1 or same length as x.
upper - Upper range bound. Either length 1 or same length as x.
Still, the class may be the problem, even though the documentation says otherwise:
x - Any orderable vector, i.e., those with relevant methods for <=, such as numeric, character, Date, etc. in case of between and a numeric vector in case of inrange.
Converting the Date columns to integer/numeric should work:
df1 %>%
mutate(match = +(data.table::between(as.numeric(dateA),
as.numeric(dateB), as.numeric(dateC))))
Output:
ID dateA dateB dateC match
1: 1 2010-12-31 2010-01-01 2021-01-01 1
2: 2 2021-01-01 2012-01-01 2012-12-31 0
3: 2 2021-01-01 2013-01-01 2017-11-30 0
4: 2 2021-01-01 2017-12-01 2021-01-01 1
5: 3 2010-09-30 2010-05-01 2021-01-01 1
6: 4 2015-12-31 <NA> <NA> 1
7: 5 2010-09-30 2010-04-01 2021-01-01 1
8: 6 2018-10-31 2014-05-01 2016-10-31 0
9: 6 2018-10-31 2016-11-01 2021-01-01 1
10: 7 2016-02-01 2016-01-01 2021-01-01 1
11: 8 2015-05-01 2013-04-15 2014-12-31 0
12: 8 2015-05-01 2015-01-01 2015-05-01 1
13: 9 2013-09-01 2010-02-15 2013-01-01 0
14: 10 2016-01-01 2012-04-01 2021-01-01 1
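The conversion works because a Date is stored as the number of days since 1970-01-01, so comparing the numeric values is equivalent to comparing the dates. A small base R sketch using the first dateA value from df1 (the variable names are only for illustration):

```r
# A Date is a day count since 1970-01-01 with class "Date" on top
d <- as.Date("2010-12-31")
days <- as.numeric(d)
print(days)  # 14974

# Converting back recovers the original date
back <- as.Date(days, origin = "1970-01-01")
print(back)
```

The numeric value matches the first element of dateA in the structure() call that defines df1.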
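On closer inspection, the problem lies in the NAbounds argument, which is TRUE by default, and the bounds in the OP's data contain NA elements. (This default is also why row 6 gets 1 in the numeric version above: a missing bound is treated as unlimited.)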
df1 %>%
mutate(match = data.table::between(dateA, dateB, dateC))
Error: Problem with `mutate()` column `match`.
ℹ `match = data.table::between(dateA, dateB, dateC)`.
✖ Not yet implemented NAbounds=TRUE for this non-numeric and non-character type
Run `rlang::last_error()` to see where the error occurred.
We may need to set it to FALSE:
df1 %>%
mutate(match = +(data.table::between(dateA, dateB, dateC, NAbounds = FALSE)))
ID dateA dateB dateC match
1: 1 2010-12-31 2010-01-01 2021-01-01 1
2: 2 2021-01-01 2012-01-01 2012-12-31 0
3: 2 2021-01-01 2013-01-01 2017-11-30 0
4: 2 2021-01-01 2017-12-01 2021-01-01 1
5: 3 2010-09-30 2010-05-01 2021-01-01 1
6: 4 2015-12-31 <NA> <NA> NA
7: 5 2010-09-30 2010-04-01 2021-01-01 1
8: 6 2018-10-31 2014-05-01 2016-10-31 0
9: 6 2018-10-31 2016-11-01 2021-01-01 1
10: 7 2016-02-01 2016-01-01 2021-01-01 1
11: 8 2015-05-01 2013-04-15 2014-12-31 0
12: 8 2015-05-01 2015-01-01 2015-05-01 1
13: 9 2013-09-01 2010-02-15 2013-01-01 0
14: 10 2016-01-01 2012-04-01 2021-01-01 1
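The effect of NAbounds is easiest to see on plain numbers. The sketch below assumes a data.table version that has the NAbounds argument. To my understanding its help page describes the values TRUE (a missing bound is treated as unlimited) and NA (a missing bound propagates), so NA is shown here as an alternative spelling and is not the value used in the answer above.

```r
library(data.table)

x <- 5

# Default NAbounds = TRUE: a missing bound counts as "no limit"
unlimited <- data.table::between(x, NA_real_, NA_real_)

# NAbounds = NA: a missing bound propagates as NA
propagated <- data.table::between(x, NA_real_, NA_real_, NAbounds = NA)

print(unlimited)
print(propagated)
```

If this holds for the installed version, the first call explains the 1 in row 6 of the numeric workaround, and the second matches the NA that the FALSE setting produces for the Date columns.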
Alternatively, pass an NA converted with as.Date:
df1 %>%
mutate(match = +(data.table::between(dateA, dateB, dateC,
NAbounds = as.Date(NA))))
ID dateA dateB dateC match
1: 1 2010-12-31 2010-01-01 2021-01-01 1
2: 2 2021-01-01 2012-01-01 2012-12-31 0
3: 2 2021-01-01 2013-01-01 2017-11-30 0
4: 2 2021-01-01 2017-12-01 2021-01-01 1
5: 3 2010-09-30 2010-05-01 2021-01-01 1
6: 4 2015-12-31 <NA> <NA> NA
7: 5 2010-09-30 2010-04-01 2021-01-01 1
8: 6 2018-10-31 2014-05-01 2016-10-31 0
9: 6 2018-10-31 2016-11-01 2021-01-01 1
10: 7 2016-02-01 2016-01-01 2021-01-01 1
11: 8 2015-05-01 2013-04-15 2014-12-31 0
12: 8 2015-05-01 2015-01-01 2015-05-01 1
13: 9 2013-09-01 2010-02-15 2013-01-01 0
14: 10 2016-01-01 2012-04-01 2021-01-01 1
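For Date columns the same check can also be written with plain comparison operators, which sidesteps both between functions. This is a minimal base R sketch, not part of the original answer: the helper name in_interval is hypothetical, and the three vectors reproduce rows 1, 2 and 6 of df1.

```r
# Hypothetical helper: inclusive interval check that lets NA bounds propagate
in_interval <- function(x, lower, upper) {
  as.integer(x >= lower & x <= upper)
}

dateA <- as.Date(c("2010-12-31", "2021-01-01", "2015-12-31"))
dateB <- as.Date(c("2010-01-01", "2012-01-01", NA))
dateC <- as.Date(c("2021-01-01", "2012-12-31", NA))

result <- in_interval(dateA, dateB, dateC)
print(result)  # 1 0 NA
```

Inside mutate this would read match = in_interval(dateA, dateB, dateC), and the NA row stays in place with an NA result.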
library(tidyverse)
library(lubridate)
df1 %>%
mutate(res = +(dateA %within% interval(dateB, dateC)))
#> ID dateA dateB dateC res
#> 1 1 2010-12-31 2010-01-01 2021-01-01 1
#> 2 2 2021-01-01 2012-01-01 2012-12-31 0
#> 3 2 2021-01-01 2013-01-01 2017-11-30 0
#> 4 2 2021-01-01 2017-12-01 2021-01-01 1
#> 5 3 2010-09-30 2010-05-01 2021-01-01 1
#> 6 4 2015-12-31 <NA> <NA> NA
#> 7 5 2010-09-30 2010-04-01 2021-01-01 1
#> 8 6 2018-10-31 2014-05-01 2016-10-31 0
#> 9 6 2018-10-31 2016-11-01 2021-01-01 1
#> 10 7 2016-02-01 2016-01-01 2021-01-01 1
#> 11 8 2015-05-01 2013-04-15 2014-12-31 0
#> 12 8 2015-05-01 2015-01-01 2015-05-01 1
#> 13 9 2013-09-01 2010-02-15 2013-01-01 0
#> 14 10 2016-01-01 2012-04-01 2021-01-01 1
Data
df1 <- structure(
list(
ID = c(1, 2, 2, 2, 3, 4, 5, 6, 6, 7, 8, 8, 9,
10),
dateA = structure(
c(
14974,
18628,
18628,
18628,
14882,
16800,
14882,
17835,
17835,
16832,
16556,
16556,
15949,
16801
),
class = "Date"
),
dateB = structure(
c(
14610,
15340,
15706,
17501,
14730,
NA,
14700,
16191,
17106,
16801,
15810,
16436,
14655,
15431
),
class = "Date"
),
dateC = structure(
c(
18628,
15705,
17500,
18628,
18628,
NA,
18628,
17105,
18628,
18628,
16435,
16556,
15706,
18628
),
class = "Date"
)
),
row.names = c(NA,-14L),
class = c("data.table", "data.frame")
)