Placing a dot in an alphanumeric column of a dataframe using R

Question

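I am struggling with gsub and regular expressions in R and need some help. I have a data frame in R whose second column holds codes in alphanumeric form. I want to place a dot after the first three characters of codes that are four or five characters long. Three-character codes should be left untouched. My input is: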

ID	code
1	C443
2	B479
3	E53
4	S9200
5	M8199

The output I need is:

ID	code
1	C44.3
2	B47.9
3	E53
4	S92.00
5	M81.99

This is what I am trying, but it also puts a dot in the code for ID 3:

library(dplyr)
a <- a %>% mutate(code = paste0(substr(code, 1, 3), ".", substr(code, 4, nchar(code))))

Thanks for your help.

Answer 1

Here is a neat way to do it with a regular expression:

a %>%
  mutate(code = gsub("(^[A-Z][0-9]{2})([0-9]{1,2})", "\\1.\\2", code))
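
To see what the two capture groups do, here is a minimal base-R sketch run on the codes from the question. It uses `sub()` rather than `gsub()` and anchors the pattern at both ends, which assumes every code is one uppercase letter followed only by digits; that assumption is not part of the answer above. The vector name `codes` is illustrative.

```r
# Codes from the question; `codes` is an illustrative name.
codes <- c("C443", "B479", "E53", "S9200", "M8199")

# Group 1: a letter plus two digits. Group 2: one or more remaining digits.
# A three-character code leaves nothing for group 2, so the pattern does not
# match and the code is returned unchanged.
dotted <- sub("^([A-Z][0-9]{2})([0-9]+)$", "\\1.\\2", codes)

print(dotted)
```

Inside an R string each backslash has to be doubled, so the backreferences are written `"\\1"` and `"\\2"`.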

Answer 2

You can add an if_else to your existing code.

library(dplyr)

df <-
  data.frame(id = c(1, 2, 3, 4, 5),
             code = c("C443", "B479", "E53", "S9200", "M81999"))
df <-
  df %>% mutate(code = if_else(nchar(code) > 3, paste0(
    substr(code, 1, 3), ".", substr(code, 4, nchar(code))
  ), code))
df
#>   id    code
#> 1  1   C44.3
#> 2  2   B47.9
#> 3  3     E53
#> 4  4  S92.00
#> 5  5 M81.999

Created on 2021-10-01 by the reprex package (v2.0.1)
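
The original attempt adds a dot to `E53` because `substr(code, 4, nchar(code))` returns an empty string for a three-character code, so `paste0()` still appends the `.`. Below is a minimal base-R sketch of the same length check without dplyr. The helper name `add_dot` is hypothetical, and the input uses the codes from the question.

```r
# Hypothetical helper: insert a dot after the third character,
# but only for codes longer than three characters.
add_dot <- function(code) {
  code <- as.character(code)
  long <- nchar(code) > 3
  ifelse(long,
         paste0(substr(code, 1, 3), ".", substr(code, 4, nchar(code))),
         code)
}

# Without the length check, the empty tail still gets a dot.
unguarded <- paste0(substr("E53", 1, 3), ".", substr("E53", 4, nchar("E53")))
print(unguarded)

result <- add_dot(c("C443", "B479", "E53", "S9200", "M8199"))
print(result)
```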

Answer 3

Using str_replace:

library(stringr)
library(dplyr)
df %>% 
    mutate(code = str_replace(code, "(\\d{2})(\\d+)", "\\1.\\2"))
  id    code
1  1   C44.3
2  2   B47.9
3  3     E53
4  4  S92.00
5  5 M81.999

Data

df <- structure(list(id = c(1, 2, 3, 4, 5), code = c("C443", "B479", 
"E53", "S9200", "M81999")), class = "data.frame", row.names = c(NA, 
-5L))
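
For readers without stringr, the same replacement can be sketched in base R with `sub()`, which, like `str_replace()`, replaces only the first match. `perl = TRUE` is an assumption made here so that `\d` is handled by PCRE. The data frame is rebuilt inline to keep the example self-contained.

```r
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 code = c("C443", "B479", "E53", "S9200", "M81999"))

# Two digits, then at least one more digit: "E53" has only two digits,
# so the pattern does not match and the code is left as-is.
df$code <- sub("(\\d{2})(\\d+)", "\\1.\\2", df$code, perl = TRUE)

print(df)
```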
