Replace first element of a string in R based on a condition
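If a condition is met, I want to replace the first character of a string in x with an empty string: if the first character of "101" in x matches the first string in y, the first character of "101" should be removed.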
x = c("101", "201", "301")
y = c("1", "7", "3")
Desired output:
> x
[1] "01" "201" "01"
I am trying:
> ifelse(substr(x, 1, 1) == y, sub(substr(x, 1, 1), ""), x)
I know this is wrong, though I'm not sure why intuitively: sub needs a pattern as its first argument and won't take substr.
I also tried:
> ifelse(substr(x, 1, 1) == y, substr(x, 1, 1) <- "", x)
[1] "" "201" ""
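For readers puzzling over the same two attempts, here is a minimal sketch of what R appears to do in each case. The copy x2 and the names attempt1, value, and unchanged are introduced here purely for illustration.

```r
x <- c("101", "201", "301")
y <- c("1", "7", "3")

# Attempt 1: sub() expects pattern, replacement, and the string to search.
# Only two arguments are supplied, so the call stops with an error.
attempt1 <- tryCatch(
  sub(substr(x, 1, 1), ""),
  error = function(e) "error"
)

# Attempt 2: the value of an assignment expression is its right-hand side,
# so ifelse() receives "" as the 'yes' value and recycles it.
x2 <- x                          # illustrative copy, so x stays untouched
value <- (substr(x2, 1, 1) <- "")

# substr<- overwrites at most nchar(value) characters; with "" that is
# zero characters, so the strings themselves are not shortened.
unchanged <- identical(x2, x)
```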
I referred to "R: How can I replace let's say the 5th element within a string?" and solved it with:
ifelse(substr(x, 1, 1) == y, paste(substr(x, 2, nchar(x))), x)
Is there a better way?
I don't know whether it is better, but in this case you can always use mapply():
x <- c("apple", "bog", "cat", "dog")
y <- c('a', 'b', 'b', 'd')
logi <- mapply(`==`, substr(x,1,1), y)
x[logi] <- substring(x[logi], 2)  # drop the first character; substr(x, 1, 1) <- "" would leave x unchanged
x
[1] "pple" "og" "cat" "og"
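Since == already compares character vectors element by element, the mapply() call is arguably unnecessary here. A minimal sketch of the same idea with a plain vectorized comparison, using substring() to keep everything from the second character onward:

```r
x <- c("apple", "bog", "cat", "dog")
y <- c("a", "b", "b", "d")

# Element-wise comparison of each first character with the matching entry of y.
logi <- substr(x, 1, 1) == y

# For the matching elements only, keep the text from position 2 to the end.
x[logi] <- substring(x[logi], 2)
x
```

This relies on x and y having the same length; with unequal lengths R would recycle the shorter vector.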
The regular expression that matches the first character is "^." (^ is the start of the string, . is any single character), so using sub as you suggested:
ifelse(substr(x, 1, 1) == y, sub("^.", "", x), x)
# [1] "01" "201" "01"
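To see how the pieces fit together, the sketch below evaluates the two parts separately. The intermediate names (stripped, matches, result, same_test) are illustrative, and the startsWith() line is an optional alternative for the test that assumes R 3.3.0 or later; it is not part of the answer above.

```r
x <- c("101", "201", "301")
y <- c("1", "7", "3")

# sub() with "^." drops the first character of every element.
stripped <- sub("^.", "", x)              # "01" "01" "01"

# The condition picks, per element, the stripped or the original string.
matches <- substr(x, 1, 1) == y           # TRUE FALSE TRUE
result  <- ifelse(matches, stripped, x)   # "01" "201" "01"

# Alternative test: startsWith() is vectorized over both arguments.
same_test <- identical(startsWith(x, y), matches)
```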
You can use stri_sub from the stringi package:
x = c("101", "201", "301")
y = c("1", "7", "3")
require(stringi)
stri_sub(x, 1 + (stri_sub(x, 1,1)==y))
## [1] "01" "201" "01"
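The trick in that one-liner is that the logical comparison is used as a number. A rough base R sketch of the same arithmetic, with substring() standing in for stri_sub() so that no extra package is assumed:

```r
x <- c("101", "201", "301")
y <- c("1", "7", "3")

# TRUE counts as 1 and FALSE as 0, so the start position is 2 on a match
# and 1 otherwise.
from <- 1 + (substr(x, 1, 1) == y)    # 2 1 2

# substring() is vectorized over the start position and reads to the end
# of each string by default.
result <- substring(x, from)          # "01" "201" "01"
```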
Some benchmarks:
require(microbenchmark)
x <- stri_rand_strings(1000, 20, "[0-9]")
head(x)
## [1] "54144716481937965959" "85386002944985867089" "30205714375670945562" "81644306435633236981"
## [5] "88781777748301517606" "13505496126231808763"
y <- stri_rand_strings(1000, 1, "[0-9]")
head(y)
## [1] "1" "4" "3" "8" "4" "9"
microbenchmark(stri_sub(x, 1 + (stri_sub(x, 1,1)==y)), ifelse(substr(x, 1, 1) == y, sub("^.", "", x), x), substr(x[mapply(`==`, substr(x,1,1), y)],1,1) <- "")
Loading required namespace: multcomp
Unit: microseconds
expr                                                          min         lq       mean     median        uq       max  neval
stri_sub(x, 1 + (stri_sub(x, 1, 1) == y))                 154.876   160.4045   201.5347   198.4005   235.128   361.477    100
ifelse(substr(x, 1, 1) == y, sub("^.", "", x), x)         424.915   434.1080   493.5478   446.9575   463.251  1666.774    100
substr(x[mapply(`==`, substr(x, 1, 1), y)], 1, 1) <- ""  4169.437  4272.4095  4590.1717  4476.1615  4673.802  7278.571    100