Parse comma-delimited string into vectors based on leading character

Question

Given a string:

vals <- "-AB, CV, CL, -TS"

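I want to efficiently parse vals into two vectors (call them negative and positive): one containing the values prefixed with -, the other containing the rest. One catch is that I also want to remove the - indicator itself.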

Desired result:

> negative
[1] "AB" "TS"
> positive
[1] "CV" "CL"

Bonus points for a concise answer.

Answer 1

Try:

v <- trimws(strsplit(vals, ",")[[1]])

positive <- v[!startsWith(v, '-')]
negative <- substring(v[startsWith(v, '-')], 2)

Output:

> negative
[1] "AB" "TS"
> positive
[1] "CV" "CL"
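
The same split can also be written as a single grouping step in base R. The following is only a minimal sketch, not part of the answer above: it assumes the same vals string, and the names is_neg, groups and parts are illustrative.

```r
vals <- "-AB, CV, CL, -TS"
v <- trimws(strsplit(vals, ",")[[1]])

# Flag the entries that start with "-"
is_neg <- startsWith(v, "-")

# Strip the prefix, then group by the flag; fixing the factor levels
# keeps both groups present even when one of them is empty
groups <- factor(ifelse(is_neg, "negative", "positive"),
                 levels = c("negative", "positive"))
parts <- split(sub("^-", "", v), groups)

negative <- parts$negative
positive <- parts$positive
```

Keeping both results in one list (parts) can be convenient if they are passed around together; the last two lines are only needed when separate variables are wanted.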

Answer 2

You can try:

s <- trimws(strsplit(vals, ",")[[1]])
# sub() strips the leading "-" so that negative matches the desired result
negative <- sub("^-", "", s[grepl("^-", s)])
positive <- s[!grepl("^-", s)]

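Or you can use pure regular expressions, like this:

library(stringr)
# Backslashes must be doubled inside R string literals; the lookbehind (?<=-) keeps the "-" out of the match
negative <- as.vector(str_match_all(vals, "(?<=-)\\w+")[[1]])
positive <- as.vector(str_match_all(vals, "(?<!-)(?<=^|,| )\\w+")[[1]])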
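
If stringr is not available, a similar regex-only approach should be possible in base R with gregexpr and regmatches. This is a hedged sketch rather than the answerer's code: extract_all is a hypothetical helper name, and the positive pattern is a simplified variant that rejects any word preceded by "-" or by another word character.

```r
vals <- "-AB, CV, CL, -TS"

# Return every match of a Perl-style pattern in x as a character vector
# (extract_all is an illustrative helper, not a base R function)
extract_all <- function(pattern, x) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

# Word characters directly preceded by "-"
negative <- extract_all("(?<=-)\\w+", vals)

# Word characters preceded by neither "-" nor another word character
positive <- extract_all("(?<![-\\w])\\w+", vals)
```

Because the lookbehinds only assert what comes before the match, the "-" is never part of the extracted text and no separate stripping step is needed.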

Answer 3

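You can try grep with the value = TRUE option. Because your data has leading spaces, you can remove them with trimws. Here I use strsplit with "," as the separator. With the zeallot library, everything can be assigned in a single step.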

library(zeallot)
v <- trimws(strsplit(vals, ",")[[1]])
# sub() drops the leading "-" from the negative values, as the question asks
c(negative, positive) %<-% list(sub("^-", "", grep("^-", v, value = TRUE)), grep("^[^-]", v, value = TRUE))

Output:

#> negative
#[1] "AB" "TS"
#> positive
#[1] "CV" "CL"
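
To make the individual steps of this answer easier to follow, here is a small sketch that runs them one at a time on the same vals string. The intermediate names pieces, idx and hits are illustrative only.

```r
vals <- "-AB, CV, CL, -TS"

# strsplit() returns a list with one element per input string, hence [[1]]
pieces <- strsplit(vals, ",")[[1]]   # "-AB" " CV" " CL" " -TS"

# trimws() removes the space left behind after each comma
v <- trimws(pieces)                  # "-AB" "CV" "CL" "-TS"

# By default grep() returns the positions of the matching elements
idx <- grep("^-", v)                 # 1 4

# With value = TRUE it returns the matching elements themselves
hits <- grep("^-", v, value = TRUE)  # "-AB" "-TS"
```

Note that grep with value = TRUE returns the matches unchanged, so the "-" prefix is still present at this point and has to be stripped separately.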

Tags: r, stringr