What is the syntax in R for returning the number of words matched by a regular expression?
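R package: stringr (data set: stringr::words)
I would like to know how many words in the stringr::words vector are exactly three letters long after applying the following regular expression: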
x <- str_view(words, "^...$", match = TRUE)
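The code does extract the words that are exactly three letters long, but it doesn't tell me how many there are. So I thought the length function would be the right way to find the number.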
length(x)
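The code returns 8, which can't be right, because x clearly contains more than 8 words.
What is the correct syntax for counting the words that match the regular expression, in this case x?
Also, can anyone explain why length(x) returns 8 in the example above?
Thank you in advance.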
I suggest using grep and length:
length(grep("^.{3}$", words))
# => [1] 110
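With grep, you get the indices of the elements of words that match the pattern (or the matching elements themselves if you pass value = TRUE), and length returns the number of matches found.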
stringr::str_view is meant for viewing an HTML rendering of the regex matches; it does not return a list of matches. Besides grep, you can also use stringr::str_subset.
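To make the grep and length combination concrete, here is a minimal base R sketch. The vector sample_words is a hypothetical stand-in for stringr::words, chosen so the result is easy to check by eye.

```r
# Hypothetical sample vector standing in for stringr::words
sample_words <- c("act", "able", "add", "a", "about", "age")

# By default, grep() returns the positions of the matching elements
match_idx <- grep("^.{3}$", sample_words)
match_idx
# [1] 1 3 6

# With value = TRUE, grep() returns the matching elements themselves
match_val <- grep("^.{3}$", sample_words, value = TRUE)
match_val
# [1] "act" "add" "age"

# Either result has one entry per match, so length() gives the count
n_matches <- length(match_idx)
n_matches
# [1] 3

# sum(grepl()) is another common way to count matches
n_matches_alt <- sum(grepl("^.{3}$", sample_words))
n_matches_alt
# [1] 3
```

Both forms give the same count; sum(grepl(...)) skips the intermediate index vector.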
In the stringr version used here, str_view returns an HTML widget object that is meant for viewing, not a character vector.
x <- str_view(words, "^...$", match = TRUE)
class(x)
#[1] "str_view" "htmlwidget"
The 8 components you are seeing are
names(x)
#[1] "x" "width" "height" "sizingPolicy" "dependencies"
#[6] "elementId" "preRenderHook" "jsHooks"
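The reason length reports the number of components rather than the number of words can be illustrated with an ordinary list. In this sketch, widget_like is a hypothetical three-component list, not the real str_view object; it only mimics the idea that the data sits inside one component.

```r
# Hypothetical list that mimics a widget: the words live inside component "x"
widget_like <- list(
  x = list(html = c("act", "add", "age", "ago")),
  width = NULL,
  height = NULL
)

# length() on a list counts its top-level components, not the data inside
n_components <- length(widget_like)
n_components
# [1] 3

names(widget_like)
# [1] "x"      "width"  "height"

# The data has to be reached explicitly to be counted
n_inner <- length(widget_like$x$html)
n_inner
# [1] 4
```

The same logic explains the 8 in the question: it is the number of top-level components of the widget, whatever the number of matched words.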
Instead of str_view, use str_subset:
library(stringr)
x <- str_subset(words, "^...$")
x
# [1] "act" "add" "age" "ago" "air" "all" "and" "any" "arm" "art" "ask" "bad" "bag"
# [14] "bar" "bed" "bet" "big" "bit" "box" "boy" "bus" "but" "buy" "can" "car" "cat"
# [27] "cup" "cut" "dad" "day" "die" "dog" "dry" "due" "eat" "egg" "end" "eye" "far"
# [40] "few" "fit" "fly" "for" "fun" "gas" "get" "god" "guy" "hit" "hot" "how" "job"
# [53] "key" "kid" "lad" "law" "lay" "leg" "let" "lie" "lot" "low" "man" "may" "mrs"
# [66] "new" "non" "not" "now" "odd" "off" "old" "one" "out" "own" "pay" "per" "put"
# [79] "red" "rid" "run" "say" "see" "set" "sex" "she" "sir" "sit" "six" "son" "sun"
# [92] "tax" "tea" "ten" "the" "tie" "too" "top" "try" "two" "use" "war" "way" "wee"
#[105] "who" "why" "win" "yes" "yet" "you"
length(x)
#[1] 110
Another option is str_count:
library(stringr)
sum(str_count(x, "^...$"))
#[1] 3
Data:
x <- c("abc", "abcd", "ab", "abc", "abcsd", "edf")
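Because the pattern is anchored at both ends, each string can match at most once, so summing the per-string counts gives the number of matching words. A base R sketch of the same count on the data above, using grepl and nchar in place of str_count (a substitution made here for illustration, not part of the answer):

```r
x <- c("abc", "abcd", "ab", "abc", "abcsd", "edf")

# One logical per element: TRUE where the whole string is exactly 3 characters
hits <- grepl("^...$", x)
hits
# [1]  TRUE FALSE FALSE  TRUE FALSE  TRUE

# TRUE counts as 1 when summed, so this is the number of matching words
n_regex <- sum(hits)
n_regex
# [1] 3

# For this particular pattern, comparing string lengths gives the same count
n_nchar <- sum(nchar(x) == 3)
n_nchar
# [1] 3
```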