left_join in a for loop with different column names
I have a data.frame named a whose structure looks like this:
a <- data.frame(X1=c("A", "B", "C", "A", "C", "D"),
X2=c("B", "C", "D", "A", "B", "A"),
X3=c("C", "D", "A", "B", "A", "B")
)
I also have a second data.frame, b:
b <- data.frame(Xn=c("A", "B", "C", "D"),
Feature=c("some", "more", "what", "why"))
I want to add the Feature values from b to a, so that X1, X2, and X3 each get a corresponding feature column in a. In other words, the columns of a should become:
colnames(a) <- c("X1", "X2", "X3", "Features1", "Features2", "Features3")
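How can I do this using left_join in a for loop?
In base R, we can unlist the data frame a and match the values against b$Xn to get the corresponding Feature values. We can then cbind the resulting data frame to the original one to get the final answer.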
temp <- a
temp[] <- b$Feature[match(unlist(temp), b$Xn)]
names(temp) <- paste0('Feature', seq_along(temp))
cbind(a, temp)
# X1 X2 X3 Feature1 Feature2 Feature3
#1 A B C some more what
#2 B C D more what why
#3 C D A what why some
#4 A A B some some more
#5 C B A what more some
#6 D A B why some more
In the tidyverse, we can reshape the data to long format, join it with b, and pivot it back to wide format.
library(dplyr)
library(tidyr)
a %>%
mutate(row = row_number()) %>%
pivot_longer(cols = -row) %>%
left_join(b, by = c('value' = 'Xn')) %>%
select(-value) %>%
pivot_wider(names_from = name, values_from = Feature) %>%
select(-row) %>%
rename_all(~paste0('Feature', seq_along(.))) %>%
bind_cols(a, .)
This can also be done by using mutate_all to recode all the columns in a:
library(tidyverse)
a %>%
mutate_all(funs(feat=recode(., !!!set_names(as.character(b$Feature), b$Xn))))
X1 X2 X3 X1_feat X2_feat X3_feat
1 A B C some more what
2 B C D more what why
3 C D A what why some
4 A A B some some more
5 C B A what more some
6 D A B why some more
You can add rename_at to get the desired names:
a %>%
mutate_all(funs(f=recode(., !!!set_names(as.character(b$Feature), b$Xn)))) %>%
rename_at(vars(matches("f")), ~gsub(".([0-9]).*", "Feature\\1", .))
X1 X2 X3 Feature1 Feature2 Feature3
1 A B C some more what
2 B C D more what why
3 C D A what why some
4 A A B some some more
5 C B A what more some
6 D A B why some more
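None of the answers above uses the for loop with left_join that the question literally asks for. A minimal sketch of that approach follows, assuming b has exactly one row per Xn value (otherwise each join would duplicate rows in a). The names result, key_cols, and b_i are illustrative and do not come from the original posts.

```r
library(dplyr)

a <- data.frame(X1 = c("A", "B", "C", "A", "C", "D"),
                X2 = c("B", "C", "D", "A", "B", "A"),
                X3 = c("C", "D", "A", "B", "A", "B"))
b <- data.frame(Xn = c("A", "B", "C", "D"),
                Feature = c("some", "more", "what", "why"))

result <- a
key_cols <- names(a)  # the columns of a to look up in b (assumed: all of them)

for (i in seq_along(key_cols)) {
  # Copy b and rename Feature so each join adds a distinctly named column
  b_i <- b
  names(b_i)[names(b_i) == "Feature"] <- paste0("Features", i)

  # by takes a named vector: name = column in result, value = column in b_i
  result <- left_join(result, b_i, by = setNames("Xn", key_cols[i]))
}

result
```

Each pass joins on a different column of a, which is why by is built with setNames instead of being hard-coded. The result should have the columns X1, X2, X3, Features1, Features2, Features3, with rows in the same order as a.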