Explain the behavior of ```str_match_all``` in the R package ```stringr```
st = list("amber johnson", "anhar link ari")
t = stringr::str_match_all(st, "(\\ba[a-z]+\\b)")
str(t)
# List of 2
# $ : chr [1, 1:2] "amber" "amber"
# $ : chr [1:2, 1:2] "anhar" "ari" "anhar" "ari"
Why is each match repeated in the result?
If you look at the Value section of ?str_match_all, it says:
For str_match, a character matrix. First column is the complete match,
followed by one column for each capture group. For str_match_all, a
list of character matrices.
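Because your pattern contains one capture group, the result has two columns: one for the complete match and one for the capture group. The group here wraps the entire pattern, so both columns hold the same text. If you do not want the duplicated column, remove the group parentheses from the pattern: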
st = list("amber johnson", "anhar link ari")
# No capture group, so only the complete-match column is returned
t = stringr::str_match_all(st, "\\ba[a-z]+\\b")
str(t)
which gives:
# List of 2
# $ : chr [1, 1] "amber"
# $ : chr [1:2, 1] "anhar" "ari"
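To make the column rule easier to see, here is a minimal sketch that is not part of the original answer. It assumes the same input strings stored as a character vector, and the two-group pattern is a made-up variation chosen only for illustration. It shows one extra column per capture group, a non-capturing group `(?:...)` that adds no column, and `str_extract_all`, which may be a better fit when only the complete matches are needed.

```r
library(stringr)

# Same strings as above, stored as a character vector (an assumption;
# the original example used a list)
st <- c("amber johnson", "anhar link ari")

# Two capture groups: column 1 is the complete match, column 2 the
# leading "a", column 3 the remaining letters
m <- str_match_all(st, "\\b(a)([a-z]+)\\b")
m[[2]]

# A non-capturing group (?:...) groups without adding a column
nc <- str_match_all(st, "(?:\\ba[a-z]+\\b)")
nc[[2]]

# When only the complete matches matter, str_extract_all() returns a
# list of character vectors instead of matrices
ex <- str_extract_all(st, "\\ba[a-z]+\\b")
ex
```

For the second string, `m[[2]]` should have two rows (one per match) and three columns, while `nc[[2]]` has a single column.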