Remove commas and/or periods, except if a certain condition holds for the last occurrence, in R
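I want to remove all commas and periods from a string, except when the string ends with a comma (or period) followed by one or two digits.
Some examples: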
12.345.67 #would become 12345.67
12.345,67 #would become 12345,67
12.345,6 #would become 12345,6
12.345.6 #would become 12345.6
12.345 #would become 12345
1,2.345 #would become 12345
and so on.
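One solution is to count the characters after the last comma/period (nchar(word(x, -1, sep = ',|\\.'))). If there are more than 2, remove all separators (gsub(',|\\.', '', x)); otherwise remove only the first one (sub(',|\\.', '', x)).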
library(stringr)
# The period must be escaped as '\\.' inside an R string literal
ifelse(nchar(word(x, -1, sep = ',|\\.')) > 2, gsub(',|\\.', '', x), sub(',|\\.', '', x))
#[1] "12345.67" "12345,67" "12345,6" "12234" "1234" "12.45"
Data
x <- c("12.345.67", "12.345,67", "12.345,6", "1,2.234", "1.234", "1,2.45")
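The counting idea above can also be sketched in base R, without stringr. This is only an illustration: `strip_separators` is a hypothetical helper name, not part of the original answer, and like the answer it counts characters (not strictly digits) after the last separator. Unlike `sub()`, which drops only the first separator, it keeps the last separator and drops every earlier one, so it should also cope with inputs that contain more than two separators.

```r
# Hypothetical helper, not from the original answer.
strip_separators <- function(x) {
  # Number of characters after the last comma or period
  # (the greedy ".*" runs up to the last separator).
  tail_len <- nchar(sub(".*[,.]", "", x))
  has_sep <- grepl("[,.]", x)
  # Keep the last separator only if 1 or 2 characters follow it.
  keep_last <- has_sep & tail_len >= 1 & tail_len <= 2
  # Variant 1: remove every separator.
  all_removed <- gsub("[,.]", "", x)
  # Variant 2: remove every separator that has another one after it.
  all_but_last <- gsub("[,.](?=.*[,.])", "", x, perl = TRUE)
  ifelse(keep_last, all_but_last, all_removed)
}

x <- c("12.345.67", "12.345,67", "12.345,6", "1,2.234", "1.234", "1,2.45")
strip_separators(x)
#[1] "12345.67" "12345,67" "12345,6" "12234" "1234" "12.45"
```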
A stringi solution using the same data as @Sotos would be:
library(stringi)
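Line 1 removes the first , or . character if the string contains more than one , or .
Line 2 removes the last , or . character if it is followed by more than 2 characters
# if() is not vectorized, so both steps use ifelse(); the "first" step runs
# before the "last" step so that inputs such as "1,2.234" lose both separators.
# Assumes every string contains at least one separator (otherwise the result is NA).
x <- ifelse(stri_count_regex(x, "[,.]") > 1,
            stri_replace_first_regex(x, "[,.]", ""), x)
x <- ifelse(stri_locate_last_regex(x, "[,.]")[, 2] < (stri_length(x) - 2),
            stri_replace_last_regex(x, "[,.]", ""), x)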
> x
[1] "12345.67" "12345,67" "12345,6" "12234" "1234" "12.45"
Another option is to use the negative lookahead syntax ?! with a Perl-compatible regular expression:
df
# V1
# 1 12.345.67
# 2 12.345,67
# 3 12.345,6
# 4 12.345.6
# 5 12.345
# 6 1,2.345
df$V1 = gsub("[,.](?!\\d{1,2}$)", "", df$V1, perl = TRUE)
df # removes , or . unless it is followed by 1 or 2 digits at the end of the string
# V1
# 1 12345.67
# 2 12345,67
# 3 12345,6
# 4 12345.6
# 5 12345
# 6 12345