Error in NextMethod() : only 0's may be mixed with negative subscripts
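I prepared a fake data frame to ask a question here about splitting. Strangely, when I try to filter the data frame with `filter` from dplyr, I hit a different problem: it keeps throwing this error: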
Error in NextMethod() : only 0's may be mixed with negative subscripts
The strangest thing!
Here is the code that throws the error:
datv %>%
dplyr::filter(str_detect(campaign, "campaign_z|campaign_x"))
Here is the data frame:
structure(list(campaign = c("campaign_x", "campaign_x", "campaign_x",
"campaign_x", "campaign_x", "campaign_x", "campaign_x", "campaign_x",
"campaign_x", "campaign_x", "campaign_x", "campaign_x", "campaign_x",
"campaign_x", "campaign_x", "campaign_x", "campaign_x", "campaign_x",
"campaign_x", "campaign_x", "campaign_x", "campaign_x", "campaign_x",
"campaign_x", "campaign_x", "campaign_x", "campaign_x", "campaign_x",
"campaign_x", "campaign_x", "campaign_y", "campaign_y", "campaign_y",
"campaign_y", "campaign_y", "campaign_y", "campaign_y", "campaign_y",
"campaign_y", "campaign_y", "campaign_y", "campaign_y", "campaign_y",
"campaign_y", "campaign_y", "campaign_y", "campaign_y", "campaign_y",
"campaign_y", "campaign_y", "campaign_y", "campaign_y", "campaign_y",
"campaign_y", "campaign_y", "campaign_y", "campaign_y", "campaign_y",
"campaign_y", "campaign_y", "campaign_z", "campaign_z", "campaign_z",
"campaign_z", "campaign_z", "campaign_z", "campaign_z", "campaign_z",
"campaign_z", "campaign_z", "campaign_z", "campaign_z", "campaign_z",
"campaign_z", "campaign_z", "campaign_z", "campaign_z", "campaign_z",
"campaign_z", "campaign_z", "campaign_z", "campaign_z", "campaign_z",
"campaign_z", "campaign_z", "campaign_z", "campaign_z", "campaign_z",
"campaign_z", "campaign_z"), com_elm = c("campaign_x_C3", "campaign_x_B1",
"campaign_x_B2", "campaign_x_C3", "campaign_x_C3", "campaign_x_B1",
"campaign_x_B2", "campaign_x_C3", "campaign_x_C3", "campaign_x_B1",
"campaign_x_B2", "campaign_x_C3", "campaign_x_B1", "campaign_x_C3",
"campaign_x_B1", "campaign_x_A1", "campaign_x_C3", "campaign_x_B1",
"campaign_x_B1", "campaign_x_C3", "campaign_x_B1", "campaign_x_A1",
"campaign_x_C3", "campaign_x_C3", "campaign_x_B1", "campaign_x_B2",
"campaign_x_C3", "campaign_x_B1", "campaign_x_C3", "campaign_x_C3",
"campaign_y_C3", "campaign_y_B1", "campaign_y_B2", "campaign_y_C3",
"campaign_y_C3", "campaign_y_B1", "campaign_y_B2", "campaign_y_C3",
"campaign_y_C3", "campaign_y_B1", "campaign_y_B2", "campaign_y_C3",
"campaign_y_B1", "campaign_y_C3", "campaign_y_B1", "campaign_y_A1",
"campaign_y_C3", "campaign_y_B1", "campaign_y_B1", "campaign_y_C3",
"campaign_y_B1", "campaign_y_A1", "campaign_y_C3", "campaign_y_C3",
"campaign_y_B1", "campaign_y_B2", "campaign_y_C3", "campaign_y_B1",
"campaign_y_C3", "campaign_y_C3", "campaign_z_C3", "campaign_z_B1",
"campaign_z_B2", "campaign_z_C3", "campaign_z_C3", "campaign_z_B1",
"campaign_z_B2", "campaign_z_C3", "campaign_z_C3", "campaign_z_B1",
"campaign_z_B2", "campaign_z_C3", "campaign_z_B1", "campaign_z_C3",
"campaign_z_B1", "campaign_z_A1", "campaign_z_C3", "campaign_z_B1",
"campaign_z_B1", "campaign_z_C3", "campaign_z_B1", "campaign_z_A1",
"campaign_z_C3", "campaign_z_C3", "campaign_z_B1", "campaign_z_B2",
"campaign_z_C3", "campaign_z_B1", "campaign_z_C3", "campaign_z_C3"
), com_elm_id = c(808001L, 811001L, 814001L, 509005L, 729060L,
817002L, 820002L, 792002L, 793003L, 820003L, 824003L, 792002L,
811001L, 787001L, 811001L, 468023L, 792002L, 812001L, 812001L,
808001L, 811001L, 468023L, 468006L, 491014L, 825002L, 828002L,
741001L, 825002L, 512001L, 733001L, 808001L, 811001L, 814001L,
509005L, 729060L, 817002L, 820002L, 792002L, 793003L, 820003L,
824003L, 792002L, 811001L, 787001L, 811001L, 468023L, 792002L,
812001L, 812001L, 808001L, 811001L, 468023L, 468006L, 491014L,
825002L, 828002L, 741001L, 825002L, 512001L, 733001L, 808001L,
811001L, 814001L, 509005L, 729060L, 817002L, 820002L, 792002L,
793003L, 820003L, 824003L, 792002L, 811001L, 787001L, 811001L,
468023L, 792002L, 812001L, 812001L, 808001L, 811001L, 468023L,
468006L, 491014L, 825002L, 828002L, 741001L, 825002L, 512001L,
733001L), recipient_id = c(5432L, 5432L, 5432L, 197L, 197L, 8388L,
8388L, 8426L, 8426L, 10903L, 10903L, 14469L, 14469L, 17466L,
17466L, 17807L, 21666L, 23935L, 24287L, 25412L, 25412L, 31361L,
31361L, 31361L, 31365L, 31365L, 40849L, 40860L, 41737L, 41737L,
5432L, 5432L, 5432L, 197L, 197L, 8388L, 8388L, 8426L, 8426L,
10903L, 10903L, 1446945L, 1446945L, 1746645L, 1746645L, 1780745L,
2166645L, 2393545L, 24287L, 25412L, 25412L, 3136145L, 3136145L,
3136145L, 3136545L, 3136545L, 40849L, 40860L, 4173745L, 4173745L,
5432L, 5432L, 5432L, 19732L, 19732L, 838832L, 838832L, 842632L,
842632L, 10903L, 10903L, 14469L, 14469L, 1746632L, 1746632L,
1780732L, 2166645L, 2393545L, 2428745L, 25412L, 25412L, 3136145L,
3136145L, 3136145L, 3136545L, 3136545L, 40849L, 40860L, 41737L,
41737L), step = c(3, 1, 2, 3, 3, 1, 2, 3, 3, 1, 2, 3, 1, 3, 1,
1, 3, 1, 1, 3, 1, 1, 3, 3, 1, 2, 3, 1, 3, 3, 3, 1, 2, 3, 3, 1,
2, 3, 3, 1, 2, 3, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3, 3, 1, 2, 3,
1, 3, 3, 3, 1, 2, 3, 3, 1, 2, 3, 3, 1, 2, 3, 1, 3, 1, 1, 3, 1,
1, 3, 1, 1, 3, 3, 1, 2, 3, 1, 3, 3), date = structure(c(19029,
19032, 19035, 18778, 18960, 19037, 19040, 19016, 19019, 19040,
19043, 19015, 19032, 19011, 19032, 18746, 19015, 19033, 19033,
19029, 19032, 18746, 18746, 18764, 19044, 19047, 18969, 19044,
18781, 18962, 19029, 19032, 19035, 18778, 18960, 19037, 19040,
19016, 19019, 19040, 19043, 19015, 19032, 19011, 19032, 18746,
19015, 19033, 19033, 19029, 19032, 18746, 18746, 18764, 19044,
19047, 18969, 19044, 18781, 18962, 19029, 19032, 19035, 18778,
18960, 19037, 19040, 19016, 19019, 19040, 19043, 19015, 19032,
19011, 19032, 18746, 19015, 19033, 19033, 19029, 19032, 18746,
18746, 18764, 19044, 19047, 18969, 19044, 18781, 18962), class = "Date")), row.names = c(NA,
-90L), groups = structure(list(campaign = c("campaign_x", "campaign_x",
"campaign_x", "campaign_x", "campaign_x", "campaign_x", "campaign_x",
"campaign_x", "campaign_x", "campaign_x", "campaign_x", "campaign_x",
"campaign_x", "campaign_x", "campaign_x", "campaign_x", "campaign_x"
), recipient_id = c(54L, 197L, 8388L, 8426L, 10903L, 14469L,
17466L, 17807L, 21666L, 23935L, 24287L, 25412L, 31361L, 31365L,
40849L, 40860L, 41737L), .rows = structure(list(1:3, 4:5, 6:7,
8:9, 10:11, 12:13, 14:15, 16L, 17L, 18L, 19L, 20:21, 22:24,
25:26, 27L, 28L, 29:30), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -17L), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
Any idea what is causing this?
When I combine the code above with the rest of my code, like this:
datv %>% filter(str_detect(campaign, "campaign_z|campaign_x")) %>%
group_by(recipient_id) %>%
summarise(rank = dense_rank(campaign)) %>%
ungroup() %>%
group_by(recipient_id,rank) %>%
summarise(count = n()) %>% #used to remove duplicates with count() to sense check work. Can use distinct()
ungroup() %>%
group_by(recipient_id) %>%
summarise(max_rank=max(rank)) %>%
ungroup() %>%
group_by(max_rank) %>%
summarise(count=n())
I get a different warning and error:
Warning in NextMethod() :
number of items to replace is not a multiple of replacement length
Error:
! Assigned data `rows` must be compatible with existing data.
x Existing data has 17 rows.
x Assigned data has 925904432 rows.
ℹ Only vectors of size 1 are recycled.
Backtrace:
1. ... %>% summarise(count = n())
28. tibble `<fn>`(`<vctrs___>`)
Either I'm going crazy, my R installation is broken, or something unusual is at play!
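This is a grouped dataset, and `filter` is applied to one of the grouping columns. Note that the `groups` attribute in the posted structure lists only 17 `campaign_x` groups covering rows 1 to 30, while the data has 90 rows, so the stored grouping no longer matches the data.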
library(dplyr)
group_vars(datv)
[1] "campaign" "recipient_id"
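To see what the grouping metadata looks like on a healthy grouped tibble, here is a minimal sketch. The `toy` data frame is hypothetical and is not the data from the question; it only illustrates the dplyr helpers for inspecting and dropping groups.

```r
library(dplyr)

# Hypothetical toy data, not the data frame from the question
toy <- tibble(
  campaign = c("campaign_x", "campaign_x", "campaign_y", "campaign_z"),
  recipient_id = c(1L, 2L, 1L, 1L)
) %>%
  group_by(campaign, recipient_id)

is_grouped_df(toy)   # TRUE while the grouping metadata is attached
group_vars(toy)      # names of the grouping columns
group_data(toy)      # one row per group, with the row indices in .rows

# ungroup() discards the stored grouping metadata
flat <- ungroup(toy)
is_grouped_df(flat)  # FALSE
```

Comparing the `.rows` column of `group_data()` with the number of rows in the data is a quick way to spot a grouping index that no longer matches the data.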
Instead, `ungroup` first and then apply the filter:
library(stringr)
datv %>%
ungroup %>%
dplyr::filter(str_detect(campaign, "campaign_z|campaign_x"))
-output
# A tibble: 60 × 6
campaign com_elm com_elm_id recipient_id step date
<chr> <chr> <int> <int> <dbl> <date>
1 campaign_x campaign_x_C3 808001 5432 3 2022-02-06
2 campaign_x campaign_x_B1 811001 5432 1 2022-02-09
3 campaign_x campaign_x_B2 814001 5432 2 2022-02-12
4 campaign_x campaign_x_C3 509005 197 3 2021-05-31
5 campaign_x campaign_x_C3 729060 197 3 2021-11-29
6 campaign_x campaign_x_B1 817002 8388 1 2022-02-14
7 campaign_x campaign_x_B2 820002 8388 2 2022-02-17
8 campaign_x campaign_x_C3 792002 8426 3 2022-01-24
9 campaign_x campaign_x_C3 793003 8426 3 2022-01-27
10 campaign_x campaign_x_B1 820003 10903 1 2022-02-17
# … with 50 more rows
In addition, if we use the `.groups` argument of `summarise`, we can drop the `ungroup` after each `summarise`:
datv %>%
ungroup %>%
filter(str_detect(campaign, "campaign_z|campaign_x")) %>%
group_by(recipient_id) %>%
summarise(rank = dense_rank(campaign), .groups = 'drop') %>%
group_by(recipient_id,rank) %>%
summarise(count = n(), .groups = 'drop') %>%
group_by(recipient_id) %>%
summarise(max_rank=max(rank), .groups = 'drop') %>%
group_by(max_rank) %>%
summarise(count=n(), .groups = 'drop')
-output
# A tibble: 2 × 2
max_rank count
<int> <int>
1 1 20
2 2 7
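As a small illustration of what `.groups` controls, the sketch below compares two settings on a hypothetical `toy` tibble (not the data from the question). With `"drop_last"`, the result stays grouped by the first grouping column; with `"drop"`, the result is ungrouped, which is why no separate `ungroup` call is needed.

```r
library(dplyr)

# Hypothetical toy data, not the data frame from the question
toy <- tibble(
  recipient_id = c(1L, 1L, 2L, 2L, 2L),
  rank = c(1L, 2L, 1L, 1L, 2L)
)

# "drop_last" peels off only the last grouping column
kept <- toy %>%
  group_by(recipient_id, rank) %>%
  summarise(count = n(), .groups = "drop_last")

# "drop" removes all grouping from the result
dropped <- toy %>%
  group_by(recipient_id, rank) %>%
  summarise(count = n(), .groups = "drop")

group_vars(kept)     # "recipient_id"
group_vars(dropped)  # character(0)
```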