How to update dataframe column using information from another dataframe
I have 2 data frames:
df1 = data.frame(Bird_ID = c(1:6), Sex = c("Male","Female","Male","Male","Male","UNK"))
df2 = data.frame(Bird_ID = c(6), Seen_sex = c("Female"))
df1
# Bird_ID Sex
# 1 Male
# 2 Female
# 3 Male
# 4 Male
# 5 Male
# 6 UNK
df2
# Bird_ID Seen_sex
# 6 Female
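- My first data frame (df1) is my database, containing all my birds with their known sex.
- My second data frame (df2) is the "updater".
How can I use the information in df2 to update bird 6 in df1? The "UNK" in df1 should now become "Female", and all other birds should stay unchanged.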
Personally, I prefer to keep the existing data and merge the columns like this:
library(dplyr)
left_join(df1, df2, by= "Bird_ID") %>%
mutate(
Sex = coalesce(Seen_sex, Sex)
) %>%
select(-Seen_sex)
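To make the role of coalesce() in the pipeline above concrete, here is a minimal sketch on two plain vectors. The vectors `seen` and `known` are hypothetical stand-ins for the Seen_sex and Sex columns after the join, where NA marks a bird with no new observation.

```r
library(dplyr)

# Hypothetical stand-ins for the joined columns:
# NA in `seen` means df2 had no entry for that bird
seen  <- c(NA, "Female", NA)
known <- c("Male", "UNK", "Female")

# coalesce() takes, position by position, the first non-missing value,
# so a new observation wins and the old value is kept otherwise
updated <- coalesce(seen, known)
updated
```

Putting Seen_sex first is what makes the update take priority; swapping the arguments would keep "UNK".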
But you can also update a specific record by finding its row and overwriting it.
# match() finds the df1 row for each df2 row; columns are assigned by position
df1[match(df2$Bird_ID, df1$Bird_ID), ] <- df2
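If dplyr 1.0.0 or later is available, the same "find the row and overwrite it" idea can be sketched with rows_update(), which matches rows by key rather than by position. This is only one possible alternative, not something the answer above used. It assumes the update column is renamed to Sex so the names line up, and the key in df2 is written as an integer here to match df1.

```r
library(dplyr)

df1 <- data.frame(Bird_ID = 1:6,
                  Sex = c("Male", "Female", "Male", "Male", "Male", "UNK"),
                  stringsAsFactors = FALSE)
# Integer key (6L) chosen here so it has the same type as df1$Bird_ID
df2 <- data.frame(Bird_ID = 6L, Seen_sex = "Female",
                  stringsAsFactors = FALSE)

# rows_update() expects the columns of the second table to exist in the
# first, so Seen_sex is renamed to Sex before updating by Bird_ID
updated <- rows_update(df1, rename(df2, Sex = Seen_sex), by = "Bird_ID")
updated
```

By default rows_update() raises an error when df2 contains a Bird_ID that is missing from df1, which can be a useful safety check for a database-style table.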
Using dplyr:
library(dplyr)
df1 %>%
left_join(., df2) %>%
mutate(Sex = ifelse(!is.na(Seen_sex), Seen_sex, Sex)) %>%
select(-Seen_sex)
Joining, by = "Bird_ID"
Bird_ID Sex
1 1 Male
2 2 Female
3 3 Male
4 4 Male
5 5 Male
6 6 Female
In base R:
df1 <- merge(df1, df2, by = "Bird_ID", all = TRUE)
df1$Sex[!is.na(df1$Seen_sex)] <- df1$Seen_sex[!is.na(df1$Seen_sex)]
df1$Seen_sex <- NULL
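With dplyr >= 0.5 (needed for distinct(.keep_all = )). Note that merge() has no on argument, so it is dropped here; the merge is on both shared columns, Bird_ID and Sex, which leaves two rows for bird 6. distinct() then keeps the first one, and that is "Female" only because it sorts before "UNK", so this approach is fragile:
> merge(df1, setNames(df2, c('Bird_ID', 'Sex')), all=TRUE) %>% distinct(Bird_ID, .keep_all=TRUE)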
Bird_ID Sex
1 1 Male
2 2 Female
3 3 Male
4 4 Male
5 5 Male
6 6 Female
You can use match in base R:
df1$Sex[match(df2$Bird_ID, df1$Bird_ID)] <- df2$Seen_sex
df1
# Bird_ID Sex
#1 1 Male
#2 2 Female
#3 3 Male
#4 4 Male
#5 5 Male
#6 6 Female
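The match() one-liner assumes every Bird_ID in df2 also exists in df1. When that is not guaranteed, match() returns NA for the missing IDs, and an NA index can make the assignment fail. A minimal sketch of a guarded version follows; the extra bird 99 is a hypothetical ID added only to show the unmatched case.

```r
df1 <- data.frame(Bird_ID = 1:6,
                  Sex = c("Male", "Female", "Male", "Male", "Male", "UNK"),
                  stringsAsFactors = FALSE)
# Hypothetical updater: bird 6 exists in df1, bird 99 does not
df2 <- data.frame(Bird_ID = c(6, 99),
                  Seen_sex = c("Female", "Male"),
                  stringsAsFactors = FALSE)

# Row of df1 for each update, NA when the Bird_ID is absent from df1
idx <- match(df2$Bird_ID, df1$Bird_ID)
found <- !is.na(idx)

# Apply only the updates whose Bird_ID was found in df1
df1$Sex[idx[found]] <- df2$Seen_sex[found]
df1
```

The unmatched update for bird 99 is silently skipped here; `df2[!found, ]` lists such rows if they need to be reviewed or appended separately.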