Left padding with 0 in a string

Question

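I am trying to clean up some data. This should be simple, but I am struggling with it. I want to add a leading 0 to the numbers 1-9 in a string, but if the number is 10 or greater, I don't want to change the string. I have been using gsub(), but I can't find a way to tell R to ignore any digits that come after the 1 in the pattern I am replacing.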

df = data.frame("col1" = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                "col2" = c("test 1", "test 2", "test 3", "test 14", "test 15", "test 16", "test 17", "test 18", "test 19", "test 20" ))

> df
   col1    col2
1     1  test 1
2     2  test 2
3     3  test 3
4     4 test 14
5     5 test 15
6     6 test 16
7     7 test 17
8     8 test 18
9     9 test 19
10   10 test 20

# This is what I've been trying without much luck
test <- df %>% 
  mutate(col2 = gsub("test 1", "test 01", col2))

# My result
> test
   col1     col2
1     1  test 01
2     2   test 2
3     3   test 3
4     4 test 014
5     5 test 015
6     6 test 016
7     7 test 017
8     8 test 018
9     9 test 019
10   10  test 20


----------------
> desired
   col1    col2
1     1 test 01
2     2 test 02
3     3 test 03
4     4 test 14
5     5 test 15
6     6 test 16
7     7 test 17
8     8 test 18
9     9 test 19
10   10 test 20

Answer 1

We can extract the numeric part with parse_number, pad it to 2 digits with sprintf, and paste the prefix 'test' in the same step

library(dplyr)    
df %>% 
    mutate(col2 = sprintf('test %02d', readr::parse_number(col2)))

-output

#   col1    col2
#1     1 test 01
#2     2 test 02
#3     3 test 03
#4     4 test 14
#5     5 test 15
#6     6 test 16
#7     7 test 17
#8     8 test 18
#9     9 test 19
#10   10 test 20
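
If readr is not available, the same extract-then-format idea can be sketched in base R. This is only an illustration, not part of the answer above: the names df_sketch and num are made up for the example, and it assumes every value in col2 contains exactly one number.

```r
# Hypothetical copy of the question's data, so the sketch is self-contained
df_sketch <- data.frame(
  col1 = 1:10,
  col2 = c("test 1", "test 2", "test 3", "test 14", "test 15",
           "test 16", "test 17", "test 18", "test 19", "test 20")
)

# Drop every non-digit character, then convert what is left to an integer
num <- as.integer(gsub("[^0-9]", "", df_sketch$col2))

# Zero-pad to two digits and put the 'test' prefix back
df_sketch$col2 <- sprintf("test %02d", num)
```

Like the parse_number version, this rebuilds the string from scratch, so it only suits columns where the prefix is always 'test'.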

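Or use sub to capture a single digit (\\d) at the end ($) of the string that is preceded by a space (\\s). In the replacement, add the space followed by 0 and the backreference (\\1) to the captured group. A two-digit number does not match, because the space is not directly followed by the final digit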

with(df, sub("\\s(\\d)$", " 0\\1", col2))
#[1] "test 01" "test 02" "test 03" "test 14" "test 15" 
#[6] "test 16" "test 17" "test 18" "test 19" "test 20"

Answer 2

Another solution, using str_pad and a negative lookahead (?!\\d) to restrict the padding to single digits:

 library(stringr)
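 # (?!\\d) stops "test 1" from matching inside "test 14"; with width = 2,
 # str_pad leaves these strings unchanged because they are already longer than 2 characters
 str_pad(sub("test (\\d)(?!\\d)", "test 0\\1", df$col2, perl = TRUE), width = 2, side = "left", pad = "0")
 [1] "test 01" "test 02" "test 03" "test 14" "test 15"
 [6] "test 16" "test 17" "test 18" "test 19" "test 20"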
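
The lookahead idea can also be written without hard-coding the 'test' prefix. The following is a rough sketch under the assumption that any digit standing alone (no digit directly before or after it) should be padded. The vector x and its third element are invented for illustration and are not from the question.

```r
# Hypothetical input; "run 7 of 12" is an invented example value
x <- c("test 1", "test 14", "run 7 of 12", "test 20")

# (?<!\\d) : no digit directly before the captured digit
# (?!\\d)  : no digit directly after it
# perl = TRUE is required for lookbehind and lookahead
padded <- gsub("(?<!\\d)(\\d)(?!\\d)", "0\\1", x, perl = TRUE)
padded
```

Because gsub is used instead of sub, every lone digit in a string is padded, not only the first one.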

r

string

gsub