Recode a column of numerical values to a new column of text values in R
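In R, I have a data frame with a column of tree species code numbers, and I want to create a new column in the data frame containing the recoded species names as text, as shown below. I can create a matrix of tree name = code number, but how do I apply it to a long, mixed column that contains only the numeric values?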
> treeco <- c(4, 3, 4, 5, 3, 2, 2, 1, 4)
> spcode <- c("oak" = 1, "ash" = 2, "elm" = 3, "beech" = 4, "hazel" = 5)
> treesp <- data.frame(spcode)
> treesp
      spcode
oak        1
ash        2
elm        3
beech      4
hazel      5
This is the result I am looking for:
treeco spcode
1 4 beech
2 3 elm
3 4 beech
4 5 hazel
5 3 elm
6 2 ash
7 2 ash
8 1 oak
9 4 beech
Base R
data.frame(treeco, answer = names(spcode)[treeco])
# treeco answer
# 1 4 beech
# 2 3 elm
# 3 4 beech
# 4 5 hazel
# 5 3 elm
# 6 2 ash
# 7 2 ash
# 8 1 oak
# 9 4 beech
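Note that `names(spcode)[treeco]` indexes by position, so it only works because the codes here happen to be 1 through 5 in order. If the codes were something else, a lookup by value with `match()` would be safer. A minimal sketch, where `spcode2` and `treeco2` are hypothetical data I made up for illustration (they are not in the question):

```r
treeco <- c(4, 3, 4, 5, 3, 2, 2, 1, 4)
spcode <- c("oak" = 1, "ash" = 2, "elm" = 3, "beech" = 4, "hazel" = 5)

# Hypothetical variant (an assumption for illustration): codes that are not 1..n
spcode2 <- c("oak" = 10, "ash" = 20, "elm" = 30, "beech" = 40, "hazel" = 50)
treeco2 <- treeco * 10

# match() gives, for each code, its position in the lookup vector,
# so the lookup no longer depends on the codes being 1..n
by_value  <- names(spcode)[match(treeco, spcode)]
by_value2 <- names(spcode2)[match(treeco2, spcode2)]

# A code that is missing from the lookup comes back as NA instead of a wrong name
unknown <- names(spcode)[match(9, spcode)]
```

With the original data, `by_value` is the same as `names(spcode)[treeco]`.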
dplyr
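It can be a little confusing when the column name matches an object in the environment, so for demonstration I'll rename treeco in the tibble so that it is clear which one is being used.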
library(dplyr)
tibble(tc = treeco) %>%
mutate(answer = names(spcode)[tc])
# # A tibble: 9 x 2
# tc answer
# <dbl> <chr>
# 1 4 beech
# 2 3 elm
# 3 4 beech
# 4 5 hazel
# 5 3 elm
# 6 2 ash
# 7 2 ash
# 8 1 oak
# 9 4 beech
There is another approach that lets you bring in more than one extra column: join/merge.
treecodes <- data.frame(code = spcode, tree = names(spcode))
set.seed(42)
treecodes$rand <- sample(100, size = nrow(treecodes), replace = TRUE)
treecodes
# code tree rand
# oak 1 oak 49
# ash 2 ash 65
# elm 3 elm 25
# beech 4 beech 74
# hazel 5 hazel 100
trees <- data.frame(code = treeco)
trees
# code
# 1 4
# 2 3
# 3 4
# 4 5
# 5 3
# 6 2
# 7 2
# 8 1
# 9 4
trees %>%
left_join(treecodes, by = "code")
# code tree rand
# 1 4 beech 74
# 2 3 elm 25
# 3 4 beech 74
# 4 5 hazel 100
# 5 3 elm 25
# 6 2 ash 65
# 7 2 ash 65
# 8 1 oak 49
# 9 4 beech 74
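If you would rather not load dplyr, base R's `merge()` can do the same left join. One thing to watch: `merge()` does not promise to keep the original row order, so this sketch adds a helper column (`row`, my own name, not from the question) and sorts on it afterwards. I leave out the `rand` column to keep it short.

```r
treeco <- c(4, 3, 4, 5, 3, 2, 2, 1, 4)
spcode <- c("oak" = 1, "ash" = 2, "elm" = 3, "beech" = 4, "hazel" = 5)

# Lookup table: one row per species code
treecodes <- data.frame(code = spcode, tree = names(spcode),
                        stringsAsFactors = FALSE)

# Data to recode, plus a helper column remembering the original order
trees <- data.frame(code = treeco)
trees$row <- seq_len(nrow(trees))

# all.x = TRUE keeps every row of trees (a left join)
out <- merge(trees, treecodes, by = "code", all.x = TRUE)

# Restore the original row order, then drop the helper column
out <- out[order(out$row), c("code", "tree")]
rownames(out) <- NULL
out
```

The result should have the same `code` and `tree` columns as the `left_join()` output above.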
For more information on joins/merges, see How to join (merge) data frames (inner, outer, left, right) and What's the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN?.