Extract characters from column names
I have a dataset dtAU whose column names look like this:
...
"SUPG..SU.Product.Group"
"SUPC..SU.Industry.Code"
"SU_CAT..SU.Category"
"FREQUENCY..Frequency"
"TIME_PERIOD..Time.Period"
...
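I want to get "SUPG", "SUPC", etc. as the column names, i.e. extract only the characters before the ".." and assign them as the column names.
When I try this:
test <- str_split(colnames(dtAU), "[..]")
I get: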
List of 11
$ : chr [1:3] "ï" "" "DATAFLOW"
$ : chr [1:5] "SUPG" "" "SU" "Product" ...
$ : chr [1:5] "SUPC" "" "SU" "Industry" ...
$ : chr [1:4] "SU_CAT" "" "SU" "Category"
$ : chr [1:3] "FREQUENCY" "" "Frequency"
$ : chr [1:4] "TIME_PERIOD" "" "Time" "Period"
$ : chr "OBS_VALUE"
$ : chr [1:4] "UNIT_MEASURE" "" "Observation" "Comment"
$ : chr [1:5] "UNIT_MULT" "" "Unit" "of" ...
$ : chr [1:4] "OBS_STATUS" "" "Observation" "Comment"
$ : chr [1:4] "OBS_COMMENT" "" "Observation" "Comment"
But I don't know how to retrieve the first piece of each string to use as the column name.
A possible solution:
library(tidyverse)
n <- c("SUPG..SU.Product.Group",
"SUPC..SU.Industry.Code",
"SU_CAT..SU.Category",
"FREQUENCY..Frequency",
"TIME_PERIOD..Time.Period")
n %>%
  str_remove("\\.\\..*")
#> [1] "SUPG" "SUPC" "SU_CAT" "FREQUENCY" "TIME_PERIOD"
Now, to assign the new column names to your data frame dtAU, just do the following:
names(dtAU) <- names(dtAU) %>% str_remove("\\.\\..*")
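To try the renaming step without the real data, here is a minimal sketch on a toy data frame. The data frame `df` and its values are made up for illustration, and base R's `sub()` stands in for `str_remove()` so that no packages are needed; the regular expression is the same.

```r
# Hypothetical toy data frame standing in for dtAU
df <- data.frame(a = 1:2, b = 3:4, c = 5:6)
names(df) <- c("SUPG..SU.Product.Group", "SU_CAT..SU.Category", "OBS_VALUE")

# Remove the first ".." and everything after it.
# Names that contain no ".." (such as OBS_VALUE) are left unchanged.
names(df) <- sub("\\.\\..*", "", names(df))

# The names are now SUPG, SU_CAT and OBS_VALUE
print(names(df))
```

Each dot is escaped with a double backslash because `.` on its own matches any character in a regular expression, and R string literals need the backslash itself escaped.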
You can do this:
gsub('[..].*', '', names(dtAU)) -> names(dtAU)
# or, equivalently, with an escaped dot:
gsub('\\..*', '', names(dtAU)) -> names(dtAU)
Or, if you want to use strsplit:
sapply(strsplit(names(dtAU), split = '\\.'), `[[`, 1) -> names(dtAU)
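The patterns in the answers above do not behave identically in every case: `[..]` is a character class that matches a single dot, so it and `\\.` cut at the first dot, while `\\.\\.` cuts only at the first "..". For the column names shown in the question the results are the same, but they would differ for a name with a single dot before the separator. A small sketch, using a hypothetical name that does not appear in the question:

```r
# Hypothetical name with a single "." before the ".." separator
x <- "UNIT.MULT..Unit.Multiplier"

# Cut at the first single dot
first_dot <- sub("\\..*", "", x)

# Cut only at the first ".."
double_dot <- sub("\\.\\..*", "", x)

# strsplit on a single dot, keeping the first piece
strsplit_first <- sapply(strsplit(x, split = "\\."), `[[`, 1)

print(c(first_dot, double_dot, strsplit_first))
```

Here `first_dot` and `strsplit_first` are both "UNIT", while `double_dot` is "UNIT.MULT", so the `\\.\\.` pattern may be the safer choice if any prefix can itself contain a dot.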