Extract a number after a specific string
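I need to find the number that follows the string "Count of". Between "Count of" and the number there can be a space or a symbol. I have a pattern that works on www.regex101.com but not with the stringr str_extract function.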
library(stringr)
shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", "monkey coconut 3oz count of 5", "monkey coconut count of 50", "chicken Count Of-10")
str_extract(shopping_list, "count of ([\\d]+)")
[1] NA NA NA NA "count of 5" "count of 50" NA
What I would like to get:
[1] NA NA NA NA "5" "50" "10"
as.numeric(sub("(?i).*count of.*?(\\d+).*", "\\1", shopping_list))
[1] NA NA NA NA 5 50 10
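The regex pattern is:
(?i) : ignore case
.*count of.*? : any characters up to and including "count of", then as few further characters as possible
(\d+) : capture one or more digits
"\\1" : return the capture group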
As of now, the other answers will fail on input such as "coconut count of - 5", because they allow only a single character between "count of" and the number.
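To make the separator point concrete, here is a minimal base R sketch that tolerates any run of non-digit characters between "count of" and the number. The helper name extract_count is hypothetical (it is not part of base R or stringr), and the sketch assumes only the first number after "count of" is wanted.

```r
# Hypothetical helper: return the first number that follows "count of",
# allowing any run of non-digit characters (spaces, "-", ":" ...) in between.
extract_count <- function(x) {
  m <- regexec("count of[^0-9]*([0-9]+)", x, ignore.case = TRUE)
  parts <- regmatches(x, m)
  # Each element is c(full_match, group_1), or character(0) when nothing matched.
  vapply(parts,
         function(p) if (length(p) >= 2) p[2] else NA_character_,
         character(1))
}

extract_count(c("milk x2", "coconut count of - 5", "chicken Count Of-10"))
# [1] NA   "5"  "10"
```

Wrapping the result in as.numeric() would give numbers instead of strings.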
Lookarounds (lookahead and lookbehind) are what you are looking for with this regex...
shopping_list <- c("apples x4", "bag of flour", "bag of sugar", "milk x2", "monkey coconut 3oz count of 5", "monkey coconut count of 50", "chicken Count Of-10")
str_extract(shopping_list, "(?<=count of )[0-9]*")
[1] NA NA NA NA "5" "50" NA
str_extract(shopping_list, "(?i)(?<=count of\\D)\\d+")
# [1] NA NA NA NA "5" "50" "10"
where (?i) makes the pattern case-insensitive, \D matches any non-digit character, and ?<= is a positive lookbehind.
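Note that the lookbehind above allows exactly one non-digit character before the digits, so an input like "coconut count of - 5" would still come back as NA. As a possible alternative, and only a sketch, a capture group with stringr's str_match() and regex(ignore_case = TRUE) accepts any number of separator characters. The variable names x, m and counts are illustrative.

```r
library(stringr)

x <- c("monkey coconut count of 50", "chicken Count Of-10",
       "coconut count of - 5", "milk x2")

# str_match() returns a character matrix: column 1 is the full match,
# column 2 is the first capture group (NA where the pattern did not match).
m <- str_match(x, regex("count of\\D*(\\d+)", ignore_case = TRUE))
counts <- as.numeric(m[, 2])
counts
# [1] 50 10  5 NA
```

Because the digits are taken from the capture group rather than from the whole match, no lookbehind is needed and the separator can be of any length.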