How to add a column to a data.frame with a name extracted from an existing column in R?
I have a data.frame DF. I want to add another column (i.e., call it station_no) that extracts the number after the underscore in the Variables column.
library(lubridate)
library(tidyverse)
set.seed(123)
DF <- data.frame(Date = seq(as.Date("1979-01-01"), to = as.Date("1979-12-31"), by = "day"),
                 Grid_2 = runif(365,1,10), Grid_20 = runif(365,5,15)) %>%
  pivot_longer(-Date, names_to = "Variables", values_to = "Values")
Desired output:
DF_out <- data.frame(Date = c("1979-01-01","1979-01-01"), Variables = c("Grid_2","Grid_20"),
                     Values = c(0.95,1.3), Station_no = c(2,20))
An easy option is parse_number, which returns the value converted to numeric:
library(dplyr)
DF %>%
  mutate(Station_no = readr::parse_number(Variables))
Or use str_extract (if we want to do it by pattern). Note that the backslash has to be doubled inside an R string:
library(stringr)
DF %>%
  mutate(Station_no = str_extract(Variables, "(?<=_)\\d+"))
Or using base R:
DF$Station_no <- trimws(DF$Variables, whitespace = '\\D+')
A base R solution would be:
#Code
DF$Station_no <- sub("^[^_]*_", "", DF$Variables)
Output (some rows):
# A tibble: 730 x 4
   Date       Variables Values Station_no
   <date>     <chr>      <dbl> <chr>
 1 1979-01-01 Grid_2      3.59 2
 2 1979-01-01 Grid_20    12.8  20
 3 1979-01-02 Grid_2      8.09 2
 4 1979-01-02 Grid_20     6.93 20
 5 1979-01-03 Grid_2      4.68 2
 6 1979-01-03 Grid_20     5.18 20
 7 1979-01-04 Grid_2      8.95 2
 8 1979-01-04 Grid_20     9.07 20
 9 1979-01-05 Grid_2      9.46 2
10 1979-01-05 Grid_20     9.83 20
# ... with 720 more rows
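Note that the output above shows Station_no as <chr>, while the desired output in the question has it as a number. If a numeric column is wanted, one possible tweak is to wrap the sub() call in as.numeric(). The sketch below uses a small made-up data frame (DF_small is a hypothetical name, not from the question) and assumes every value in Variables has the form prefix, underscore, digits:

```r
# Hypothetical miniature of the long-format data; the values are illustrative only.
DF_small <- data.frame(
  Date      = as.Date(c("1979-01-01", "1979-01-01", "1979-01-02", "1979-01-02")),
  Variables = c("Grid_2", "Grid_20", "Grid_2", "Grid_20"),
  Values    = c(3.59, 12.8, 8.09, 6.93)
)

# Strip everything up to and including the first underscore,
# then convert the remaining digits from character to numeric.
DF_small$Station_no <- as.numeric(sub("^[^_]*_", "", DF_small$Variables))

# Inspect the type of the new column.
str(DF_small$Station_no)
```

The same as.numeric() wrapper should work around the str_extract and trimws variants, since both of those also return character vectors. parse_number already returns a numeric value, so it needs no conversion.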