Extract a number after a specific string

Question

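I need to find the number that follows the string "Count of". There may be a space or a symbol between "Count of" and the number. I have something that works on www.regex101.com but not with the stringr str_extract function.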

library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", "monkey coconut 3oz count of 5", "monkey coconut count of 50", "chicken Count Of-10")
str_extract(shopping_list, "count of ([\\d]+)")
[1] NA NA NA NA "count of 5" "count of 50" NA

What I would like to get:

[1] NA NA NA NA "5" "50" "10"

Answer 1

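as.numeric(sub("(?i).*count of.*?(\\d+).*", "\\1", shopping_list))
[1] NA NA NA NA  5 50 10

The regex pattern is:

(?i): ignore case
.*count of.*?: any number of characters up to and including "count of", then as few characters as possible
(\\d+): capture one or more digits
"\\1": return the capture group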

As of now, the other answers will fail on something like "coconut count of - 5", because they allow only a single space (or a single character) after "count of".
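One caveat with the sub() approach is that sub() returns its input unchanged when the pattern does not match, so as.numeric() produces the NA values only by failing to parse strings such as "apples x4", and it warns while doing so. The sketch below is one possible way to make that explicit. The vector x is a hypothetical test set (a few of the original items plus the "count of - 5" case), and perl = TRUE is my own addition to pin down the regex engine.

```r
# Hypothetical test vector: some original items plus the "count of - 5" case
x <- c("apples x4", "monkey coconut count of 50",
       "chicken Count Of-10", "coconut count of - 5")

pattern <- "(?i).*count of.*?(\\d+).*"

# sub() leaves non-matching strings untouched, so find the matches first
hit <- grepl(pattern, x, perl = TRUE)

# Start with NA everywhere and fill in only the elements that matched
counts <- rep(NA_real_, length(x))
counts[hit] <- as.numeric(sub(pattern, "\\1", x[hit], perl = TRUE))

counts
# NA 50 10 5
```

Because .*? will skip over any characters, this pattern also picks up a number that appears much later in the string, which may or may not be what you want.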

Answer 2

Lookahead and lookbehind are what you are looking for with this grep...

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", "monkey coconut 3oz count of 5", "monkey coconut count of 50", "chicken Count Of-10")
str_extract(shopping_list, "(?<=count of )[0-9]*")
[1] NA   NA   NA   NA   "5"  "50" NA

Answer 3

str_extract(shopping_list, "(?i)(?<=count of\\D)\\d+")
# [1] NA   NA   NA   NA   "5"  "50" "10"

where (?i) makes the pattern case-insensitive, \\D means a non-digit character, and (?<=...) is a positive lookbehind.
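The attempt in the question returned "count of 5" because str_extract gives back the whole match, not the capture group. If you would rather keep a capture group than use a lookbehind, str_match returns the groups as columns of a matrix. This is only a sketch of that alternative, using the same single-separator rule (\\D) as the lookbehind version; the variable names m and result are mine.

```r
library(stringr)

shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2",
                   "monkey coconut 3oz count of 5", "monkey coconut count of 50",
                   "chicken Count Of-10")

# Column 1 of the result is the full match, column 2 is the first capture group
m <- str_match(shopping_list, regex("count of\\D(\\d+)", ignore_case = TRUE))

result <- m[, 2]
result
# NA NA NA NA "5" "50" "10"
```

Wrap the result in as.numeric() if you need numbers rather than strings.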

