Unexpected output after rename_with (dplyr)
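This question is related to an earlier one: basically, a trailing number from 1 to 9 in a column name should be renamed to 01-09.
The solution provided by Ronak Shah gives this: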
df <- structure(list(Id = c(1L, 2L, 1L, 4L), Date = c("02/19/2020",
"02/10/2020", "03/11/2020", "10/29/2020"), Col_a_1 = c(0L, 1L,
2L, 1L), Col_a_2 = c(1L, 2L, 1L, 0L), Col_a_3 = c(2L, 0L, 3L,
2L), Col_a_12 = c(0L, 1L, 1L, 1L), Col_a_65 = c(4L, 3L, 0L, 0L
)), class = "data.frame", row.names = c(NA, -4L))
library(dplyr)
library(stringr)
df %>%
  # the backslash must be doubled inside an R string literal
  rename_with(~str_replace(., '\\d+', function(m) sprintf('%02s', m)),
              starts_with('Col'))
# Id Date Col_a_01 Col_a_02 Col_a_03 Col_a_12 Col_a_65
#1 1 02/19/2020 0 1 2 0 4
#2 2 02/10/2020 1 2 0 1 3
#3 1 03/11/2020 2 1 3 1 0
#4 4 10/29/2020 1 0 2 1 0
Using the same data and the same code, I get a space instead of a zero:
Id Date Col_a_ 1 Col_a_ 2 Col_a_ 3 Col_a_12 Col_a_65
1 1 02/19/2020 0 1 2 0 4
2 2 02/10/2020 1 2 0 1 3
3 1 03/11/2020 2 1 3 1 0
4 4 10/29/2020 1 0 2 1 0
My session info:
sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4 readr_2.0.2
[6] tidyr_1.1.4 tibble_3.1.5 ggplot2_3.3.5 tidyverse_1.3.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 cellranger_1.1.0 pillar_1.6.3 compiler_4.1.1
[5] dbplyr_2.1.1 tools_4.1.1 jsonlite_1.7.2 lubridate_1.8.0
[9] lifecycle_1.0.1 gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.11
[13] reprex_2.0.1 cli_3.0.1 rstudioapi_0.13 DBI_1.1.1
[17] haven_2.4.3 xml2_1.3.2 withr_2.4.2 httr_1.4.2
[21] fs_1.5.0 generics_0.1.0 vctrs_0.3.8 hms_1.1.1
[25] grid_4.1.1 tidyselect_1.1.1 glue_1.4.2 R6_2.5.1
[29] fansi_0.5.0 readxl_1.3.1 tzdb_0.1.2 modelr_0.1.8
[33] magrittr_2.0.1 backports_1.2.1 scales_1.1.1 ellipsis_0.3.2
[37] rvest_1.0.1 assertthat_0.2.1 colorspace_2.0-2 utf8_1.2.2
[41] stringi_1.7.5 munsell_0.5.0 broom_0.7.9 crayon_1.4.1
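What is the reason for this behavior?
You can use str_pad, which pads strings directly, so there is no need to distinguish between character and numeric input: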
library(dplyr)
library(stringr)
df %>%
  # the backslash must be doubled inside an R string literal
  rename_with(~ str_replace(., '\\d+', function(m) str_pad(m, width = 2, pad = '0')))
Id Date Col_a_01 Col_a_02 Col_a_03 Col_a_12 Col_a_65
1 1 02/19/2020 0 1 2 0 4
2 2 02/10/2020 1 2 0 1 3
3 1 03/11/2020 2 1 3 1 0
4 4 10/29/2020 1 0 2 1 0
On R 4.1.1 on macOS, though, the OP's code works:
df %>%
  # the backslash must be doubled inside an R string literal
  rename_with(~str_replace(., '\\d+', function(m) sprintf('%02s', m)),
              starts_with('Col'))
Id Date Col_a_01 Col_a_02 Col_a_03 Col_a_12 Col_a_65
1 1 02/19/2020 0 1 2 0 4
2 2 02/10/2020 1 2 0 1 3
3 1 03/11/2020 2 1 3 1 0
4 4 10/29/2020 1 0 2 1 0
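The Windows/macOS difference is consistent with what the R documentation for sprintf says about the `0` flag: for character input it zero-pads on some platforms and is ignored on others. The following is a minimal sketch of a variant that should not depend on that, assuming the matched digits always fit in an integer; it converts the match to an integer and uses `%02d`, where zero-padding is defined. The helper name `pad_digits` and the small two-row data frame are made up for illustration.

```r
library(dplyr)
library(stringr)

# Small illustrative data frame (not the OP's full data)
df <- data.frame(Id = 1:2, Col_a_1 = c(0L, 1L), Col_a_12 = c(0L, 1L))

# Hypothetical helper: zero-pad the first run of digits in each name.
# as.integer() + '%02d' avoids the platform-dependent '%02s'.
pad_digits <- function(x) {
  str_replace(x, "\\d+", function(m) sprintf("%02d", as.integer(m)))
}

out <- df %>% rename_with(pad_digits, starts_with("Col"))
names(out)
# "Id" "Col_a_01" "Col_a_12"
```

Running `sprintf('%02s', '1')` on its own in each environment is a quick way to check whether a given platform pads character input with zeros or with spaces.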