Adding a new string column to a data frame based on an existing numeric column in R

Question

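I have a data frame of 400,000 trees belonging to 6 different species. Each species is assigned its own numeric species code. I want to add another column that lists the scientific name of each tree. The species codes are not consecutive, because the data were filtered by abundance from 490,000 trees of 163 species. Here is an example similar to the data I have: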

Index    Age    Species_code
0        45     14
1        47     32
2        14     62
3        78     126
4        40     14
5        38     17 
6        28     47

Here is an example of what I want to end up with:

Index    Age    Species_code    Species
0        45     14              Licania_heteromorpha
1        47     32              Pouteria_reticulata
2        14     62              Chrysophyllum_cuneifolium
3        78     126             Eperua_falcata
4        40     14              Licania_heteromorpha
5        38     17              Simaba_cedron
6        28     47              Sterculia_pruriens

I have been trying:

if (Species_code == 14)
{
}

However, this only gives me TRUE or FALSE in the output.

Answer 1

You may want to use the ifelse() function.
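As a rough sketch of the ifelse() approach, the calls can be nested so that each code is tested in turn. The data frame name `df` and the sample rows are taken from the question; only three of the six codes are filled in here, and the remaining ones fall through to NA.

```r
# Sample data copied from the question
df <- data.frame(
  Index = 0:6,
  Age = c(45, 47, 14, 78, 40, 38, 28),
  Species_code = c(14, 32, 62, 126, 14, 17, 47)
)

# ifelse() is vectorised: it tests every row and returns one value per row,
# unlike if (), which expects a single TRUE or FALSE
df$Species <- ifelse(df$Species_code == 14, "Licania_heteromorpha",
              ifelse(df$Species_code == 32, "Pouteria_reticulata",
              ifelse(df$Species_code == 62, "Chrysophyllum_cuneifolium",
                     NA_character_)))  # codes not listed yet become NA

print(df)
```

Nesting gets hard to read beyond a handful of codes, so this is probably only practical for a short list like the one in the question.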

Alternatively, you can build a lookup vector and index it by the species code:

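# Lookup vector: the position is the species code, the value is the name
my_names <- character()
my_names[14] <- "Licania_heteromorpha"
my_names[62] <- "Chrysophyllum_cuneifolium"
...
# Index the lookup vector by each row's code
df$Species <- my_names[df$Species_code]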
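A variation on the same idea, shown here only as a sketch, is to key the lookup vector by name rather than by position, which avoids allocating a vector as long as the largest code. The name `species_lookup` is made up for this example; the code-to-name pairs are the ones shown in the question's desired output.

```r
# Sample data copied from the question
df <- data.frame(
  Index = 0:6,
  Age = c(45, 47, 14, 78, 40, 38, 28),
  Species_code = c(14, 32, 62, 126, 14, 17, 47)
)

# Named character vector: names are the codes (as strings), values are the species
species_lookup <- c(
  "14"  = "Licania_heteromorpha",
  "32"  = "Pouteria_reticulata",
  "62"  = "Chrysophyllum_cuneifolium",
  "126" = "Eperua_falcata",
  "17"  = "Simaba_cedron",
  "47"  = "Sterculia_pruriens"
)

# Convert the codes to strings so they are matched by name, not by position;
# unname() drops the codes that would otherwise be carried along as element names
df$Species <- unname(species_lookup[as.character(df$Species_code)])

print(df)
```

Any code that is missing from `species_lookup` comes back as NA, which makes gaps in the mapping easy to spot.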

You could also look at the many functions in dplyr, such as case_when and recode. See: https://dplyr.tidyverse.org/reference.

Answer 2

Since your problem involves only 6 species, you can do this:

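# Start with an all-NA character column so that every row has a value
df$Species <- NA_character_

# Fill in the name for each code with a logical row index
df$Species[df$Species_code == 14] <- 'Licania_heteromorpha'
df$Species[df$Species_code == 32] <- 'Pouteria_reticulata'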
.....

Answer 3

One solution is to combine mutate with case_when, provided you know which codes correspond to which species. I have filled in a few of them in the code below:

library(tidyverse)
x <-"
  Index    Age    Species_code
0        45     14
1        47     32
2        14     62
3        78     126
4        40     14
5        38     17 
6        28     47"
y <- read.table(text = x, header = TRUE)
y <- y %>% 
  mutate(species = case_when(Species_code == 14 ~ "Licania_heteromorpha",
                             Species_code == 32 ~ "Pouteria_reticulata",
                             Species_code == 62 ~ "Chrysophyllum_cuneifolium"))   # etc...
y
#   Index Age Species_code                   species
# 1     0  45           14      Licania_heteromorpha
# 2     1  47           32       Pouteria_reticulata
# 3     2  14           62 Chrysophyllum_cuneifolium
# 4     3  78          126                      <NA>
# 5     4  40           14      Licania_heteromorpha
# 6     5  38           17                      <NA>
# 7     6  28           47                      <NA>

That said, if you have a separate dataset of species and codes, a merge would make more sense.
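A minimal sketch of the merge approach with base R's merge() might look like this. The table `species_codes` is a hypothetical stand-in for the separate species dataset, and its contents are taken from the question's desired output; `trees` holds the sample rows from the question.

```r
# Sample tree data copied from the question
trees <- data.frame(
  Index = 0:6,
  Age = c(45, 47, 14, 78, 40, 38, 28),
  Species_code = c(14, 32, 62, 126, 14, 17, 47)
)

# Hypothetical lookup table: one row per species code
species_codes <- data.frame(
  Species_code = c(14, 17, 32, 47, 62, 126),
  Species = c("Licania_heteromorpha", "Simaba_cedron", "Pouteria_reticulata",
              "Sterculia_pruriens", "Chrysophyllum_cuneifolium", "Eperua_falcata"),
  stringsAsFactors = FALSE
)

# Left join: all.x = TRUE keeps every tree even if its code has no match
merged <- merge(trees, species_codes, by = "Species_code", all.x = TRUE)

# merge() sorts by the key column, so restore the original row and column order
merged <- merged[order(merged$Index), c("Index", "Age", "Species_code", "Species")]

print(merged)
```

With a lookup table, the mapping lives in data rather than in code, so it should extend to all 163 species without writing one condition per code.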
