R: padding strings to the same length

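After several hours of searching for something that should be simple, I need some help.

What I want to do: make sure all strings are padded to the same length of 26 characters.

Dataset: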

  library(stringr)

  names <-
  structure(list(
    names = c(
      "A",
      "ABC",
      "ABCDEFG",
      "ABCDEFGHIJKLMNOP",
      "AB",
      "ABCDEFGHI",
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
      "ABCDEFGHIJKL",
      "ABCDEFGHIJKLMNOPQR",
      "ABCDEFGHIJKLMNOP",
      "ABCDEFGHIJKLMNO"
    )
  ),
  class = "data.frame",
  row.names = c(NA,-11L))

Step 1: find the maximum string length and the number of characters to pad:

max <- as.numeric(max(nchar(names$names)))
max

n <- as.numeric(nchar(names$names))
n

pad <- max - n
pad


#add columns to the dataset to check how many characters are to be padded for each name

names$max <- as.numeric(max(nchar(names$names)))
names$n <- as.numeric(nchar(names$names))
names$pad <- as.numeric(max - n)

Step 2: pad

  names$names <-
  str_pad(names$names,
          pad,
          side = "right",
          pad = "0")

But this approach doesn't seem to work for me. Can someone point me in the right direction? I get strings of different lengths:

                        names max  n pad
1   A000000000000000000000000  26  1  25
2     ABC00000000000000000000  26  3  23
3         ABCDEFG000000000000  26  7  19
4            ABCDEFGHIJKLMNOP  26 16  10
5    AB0000000000000000000000  26  2  24
6           ABCDEFGHI00000000  26  9  17
7  ABCDEFGHIJKLMNOPQRSTUVWXYZ  26 26   0
8              ABCDEFGHIJKL00  26 12  14
9          ABCDEFGHIJKLMNOPQR  26 18   8
10           ABCDEFGHIJKLMNOP  26 16  10
11            ABCDEFGHIJKLMNO  26 15  11

Any help would be appreciated.

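Here we only need to pass the maximum length as `width`; `str_pad` expects the total target width, not the number of characters to add: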

library(stringr)
mx <- max(nchar(names$names))
names$names <- str_pad(names$names, mx, side = "right", pad = "0")
names$names

-output

#[1] "A0000000000000000000000000" "ABC00000000000000000000000" "ABCDEFG0000000000000000000" "ABCDEFGHIJKLMNOP0000000000"
#[5] "AB000000000000000000000000" "ABCDEFGHI00000000000000000" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "ABCDEFGHIJKL00000000000000"
#[9] "ABCDEFGHIJKLMNOPQR00000000" "ABCDEFGHIJKLMNOP0000000000" "ABCDEFGHIJKLMNO00000000000"

Note: it is better not to name objects after functions or arguments (here `max`, `names` and `pad`).
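
To see why the question's call gives mixed lengths, the two calls can be compared side by side. This is only a small sketch on a shortened vector; `name_vec`, `target_width` and `missing_chars` are made-up names, chosen so that nothing masks `max()`, `names()` or the `pad` argument.

```r
library(stringr)

# Hypothetical, shortened version of the question's data
name_vec <- c("A", "ABC", "ABCDEFGHIJKLMNOP", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")
target_width <- max(nchar(name_vec))              # 26
missing_chars <- target_width - nchar(name_vec)   # 25 23 10 0

# Question's call: the number of missing characters is used as `width`,
# so each string ends up max(own length, missing_chars) characters long
wrong <- str_pad(name_vec, missing_chars, side = "right", pad = "0")
nchar(wrong)
# [1] 25 23 16 26

# Answer's call: the common target width is used as `width`
right <- str_pad(name_vec, target_width, side = "right", pad = "0")
nchar(right)
# [1] 26 26 26 26
```

Strings that are already at least `width` characters long are returned unchanged, which is why the 16-character name was not padded at all in the question's output.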

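I think you need the `format` function. You set the width and then choose left, right or centre justification. Note that it pads with spaces rather than with "0":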


format(names, width = 26, justify = "left")

# Name
# 1  A                         
# 2  ABC                       
# 3  ABCDEFG                   
# 4  ABCDEFGHIJKLMNOP          
# 5  AB                        
# 6  ABCDEFGHI                 
# 7  ABCDEFGHIJKLMNOPQRSTUVWXYZ
# 8  ABCDEFGHIJKL              
# 9  ABCDEFGHIJKLMNOPQR        
# 10 ABCDEFGHIJKLMNOP          
# 11 ABCDEFGHIJKLMNO           
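
If the padding has to be "0" rather than a space, one possible follow-up is to replace the spaces after formatting. This is only a sketch on a plain character vector (`name_vec`, `spaced` and `zeroed` are made-up names), and it assumes the original strings contain no spaces of their own.

```r
# Hypothetical, shortened version of the question's data
name_vec <- c("A", "ABC", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

# format() left-justifies character vectors by default and pads with spaces
spaced <- format(name_vec, width = 26)
nchar(spaced)
# [1] 26 26 26

# Assumption: no string contains a space of its own,
# otherwise that space would be turned into "0" as well
zeroed <- gsub(" ", "0", spaced, fixed = TRUE)
zeroed[1]   # "A" followed by 25 zeros
```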

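Using `rep`, `paste(..., collapse="")` (which is like Python's join for a vector of strings), `Vectorize()` and a closure over `pad` (meaning it just grabs `pad` from the argument list), one can quickly create a pad-string generator `reps`. Using `paste0` one can join character vectors element-wise.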

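pad_strings <- function(char_vec, max_len=NULL, pad="0") {
  # reps(n) builds a string of n copies of pad; pad is taken from the enclosing call
  reps <- Vectorize(function(n) paste(rep(pad, n), collapse=""))
  lengths <- nchar(char_vec)
  # default target: the length of the longest string
  if (is.null(max_len)) max_len <- max(lengths)
  # pmax() keeps rep() from failing when a string is longer than max_len
  diffs <- pmax(max_len - lengths, 0)
  paste0(char_vec, reps(diffs))
}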

> pad_strings(names$names)
 [1] "A0000000000000000000000000" "ABC00000000000000000000000"
 [3] "ABCDEFG0000000000000000000" "ABCDEFGHIJKLMNOP0000000000"
 [5] "AB000000000000000000000000" "ABCDEFGHI00000000000000000"
 [7] "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "ABCDEFGHIJKL00000000000000"
 [9] "ABCDEFGHIJKLMNOPQR00000000" "ABCDEFGHIJKLMNOP0000000000"
[11] "ABCDEFGHIJKLMNO00000000000"

If no `max_len=` argument is given, the strings are padded to the length of the longest string. Otherwise they are padded to `max_len`.
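
As a possible simplification, base R's `strrep()` is already vectorised over its `times` argument, so the `Vectorize()` wrapper may not be needed. The variant below is a sketch rather than a drop-in replacement; `pad_strings_strrep` is a made-up name.

```r
# Hypothetical variant of pad_strings() built on strrep()
pad_strings_strrep <- function(char_vec, max_len = NULL, pad = "0") {
  lengths <- nchar(char_vec)
  if (is.null(max_len)) max_len <- max(lengths)
  # Strings already longer than max_len get no padding and are not truncated
  paste0(char_vec, strrep(pad, pmax(max_len - lengths, 0)))
}

pad_strings_strrep(c("A", "ABC", "ABCDE"))
# [1] "A0000" "ABC00" "ABCDE"
pad_strings_strrep(c("A", "ABC"), max_len = 4)
# [1] "A000" "ABC0"
```

Like the version above, this pads on the right only and assumes `pad` is a single character.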