Lookup table in R referencing row values and specific columns in a data frame
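I have a multi-part lookup table problem in R.
I have a data frame in which the number in each column stands for an item name. The item names can be found in a corresponding lookup table.
Data: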
> food.dat
Fruit Vegetable Meat Dairy
1 1 2 2 3
2 3 2 1 1
3 3 2 2 2
4 2 2 1 1
5 1 1 1 2
Lookup table:
> food.lookup
FoodItem Number FoodName
1 Fruit 1 Banana
2 Fruit 2 Apple
3 Fruit 3 Mango
4 Vegetable 1 Carrot
5 Vegetable 2 Broccoli
6 Meat 1 Chicken
7 Meat 2 Fish
8 Dairy 1 Cheese
9 Dairy 2 Yogurt
10 Dairy 3 IceCream
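Note that the numbers are not unique across food items. For example, 1 stands for one FoodName in the Fruit column (Banana) and a different FoodName in the Vegetable column (Carrot).
I want to recode the food.dat data frame so that it holds the FoodName values from the lookup table.
If possible, I would also like a simple function that takes a FoodName and returns a data frame of the rows of food.dat that contain that FoodName.
Thank you for your time and ideas :)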
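Split the named vector from 'food.lookup' (the 'FoodName' values, named by 'Number') by 'FoodItem' into a list. Then loop across the columns of 'food.dat', extract the list element for each column, and replace the values by matching on the names.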
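library(dplyr)
# One named vector per food type: names are the codes, values are the food names
lst1 <- with(food.lookup, split(setNames(FoodName, Number), FoodItem))
# For each column, look up its codes by name in the matching list element
food.dat %>%
  mutate(across(all_of(names(lst1)), ~ lst1[[cur_column()]][as.character(.)]))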
Output:
Fruit Vegetable Meat Dairy
1 Banana Broccoli Fish IceCream
2 Mango Broccoli Chicken Cheese
3 Mango Broccoli Fish Yogurt
4 Apple Broccoli Chicken Cheese
5 Banana Carrot Chicken Yogurt
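To make the lookup step above easier to follow, here is a minimal sketch of what happens for a single column. It rebuilds the lookup table so it runs on its own; `dairy_codes` and `dairy_names` are names introduced here for illustration only.

```r
# Lookup table rebuilt here so the snippet runs on its own.
food.lookup <- data.frame(
  FoodItem = c("Fruit", "Fruit", "Fruit", "Vegetable", "Vegetable",
               "Meat", "Meat", "Dairy", "Dairy", "Dairy"),
  Number   = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L),
  FoodName = c("Banana", "Apple", "Mango", "Carrot", "Broccoli",
               "Chicken", "Fish", "Cheese", "Yogurt", "IceCream"),
  stringsAsFactors = FALSE
)

# One named character vector per food type:
# the names are the codes, the values are the food names.
lst1 <- with(food.lookup, split(setNames(FoodName, Number), FoodItem))

# Indexing with the codes converted to character looks values up by NAME,
# not by position. (Illustrative codes: the Dairy column of food.dat.)
dairy_codes <- c(3L, 1L, 2L, 1L, 2L)
dairy_names <- unname(lst1[["Dairy"]][as.character(dairy_codes)])
dairy_names
```

Because the lookup goes through the names, it should keep working even if the codes for a food type are not the consecutive integers 1, 2, 3.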
Data:
food.dat <- structure(list(Fruit = c(1L, 3L, 3L, 2L, 1L), Vegetable = c(2L,
2L, 2L, 2L, 1L), Meat = c(2L, 1L, 2L, 1L, 1L), Dairy = c(3L,
1L, 2L, 1L, 2L)), class = "data.frame", row.names = c("1", "2",
"3", "4", "5"))
food.lookup <- structure(list(FoodItem = c("Fruit", "Fruit",
"Fruit", "Vegetable",
"Vegetable", "Meat", "Meat", "Dairy", "Dairy", "Dairy"), Number = c(1L,
2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L), FoodName = c("Banana", "Apple",
"Mango", "Carrot", "Broccoli", "Chicken", "Fish", "Cheese", "Yogurt",
"IceCream")), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10"))
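Similarly, you can make use of the "position" of the different names.
To do so, split the lookup table into the respective food types (or type the names in by hand). Then simply set the result by indexing.
An example is below. You can easily extend it to all columns.
I stored the result in Dairy2 so you can compare and see how the indexing works.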
dairy <- c("Cheese","Yogurt","IceCream")
food.dat <- data.frame(Dairy = c(3,1,2,1,2))
food.dat$Dairy2 = dairy[food.dat$Dairy]
food.dat
Dairy Dairy2
1 3 IceCream
2 1 Cheese
3 2 Yogurt
4 1 Cheese
5 2 Yogurt
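The extension to all columns mentioned above could look roughly like the sketch below. It swaps plain positional indexing for `match()` on the Number column, so it does not rely on the codes being 1, 2, 3 in row order; `lookup_by_item` and `recoded` are names made up for this example.

```r
# Example data rebuilt here so the snippet runs on its own.
food.dat <- data.frame(
  Fruit     = c(1L, 3L, 3L, 2L, 1L),
  Vegetable = c(2L, 2L, 2L, 2L, 1L),
  Meat      = c(2L, 1L, 2L, 1L, 1L),
  Dairy     = c(3L, 1L, 2L, 1L, 2L)
)
food.lookup <- data.frame(
  FoodItem = c("Fruit", "Fruit", "Fruit", "Vegetable", "Vegetable",
               "Meat", "Meat", "Dairy", "Dairy", "Dairy"),
  Number   = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L),
  FoodName = c("Banana", "Apple", "Mango", "Carrot", "Broccoli",
               "Chicken", "Fish", "Cheese", "Yogurt", "IceCream"),
  stringsAsFactors = FALSE
)

# Split the lookup table into one small data frame per food type.
lookup_by_item <- split(food.lookup, food.lookup$FoodItem)

# Recode a copy of the data, one column at a time.
# Assumes every column of food.dat has an entry in the lookup table.
recoded <- food.dat
for (item in names(food.dat)) {
  tbl <- lookup_by_item[[item]]
  # match() gives the row of tbl whose Number equals each code (NA if none)
  recoded[[item]] <- tbl$FoodName[match(food.dat[[item]], tbl$Number)]
}
recoded
```

Codes with no entry in the lookup table come out as NA rather than raising an error.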
We can reshape the data to long format, with one food item per row, join it to the lookup table, and then reshape it back to wide format.
library(tidyr)
library(dplyr)
food.dat %>%
  tibble::rowid_to_column() %>%
  pivot_longer(-rowid, names_to = "FoodItem",
               values_to = "Number") %>%
  # state the join keys explicitly instead of relying on the default guess
  left_join(food.lookup, by = c("FoodItem", "Number")) %>%
  pivot_wider(id_cols = rowid, names_from = FoodItem,
              values_from = FoodName)
#> # A tibble: 5 x 5
#> rowid Fruit Vegetable Meat Dairy
#> <int> <chr> <chr> <chr> <chr>
#> 1 1 Banana Broccoli Fish IceCream
#> 2 2 Mango Broccoli Chicken Cheese
#> 3 3 Mango Broccoli Fish Yogurt
#> 4 4 Apple Broccoli Chicken Cheese
#> 5 5 Banana Carrot Chicken Yogurt
Data:
food.dat <- read.table(text =
'Fruit Vegetable Meat Dairy
1 2 2 3
3 2 1 1
3 2 2 2
2 2 1 1
1 1 1 2', header = TRUE)
food.lookup <- read.table(text =
'FoodItem Number FoodName
Fruit 1 Banana
Fruit 2 Apple
Fruit 3 Mango
Vegetable 1 Carrot
Vegetable 2 Broccoli
Meat 1 Chicken
Meat 2 Fish
Dairy 1 Cheese
Dairy 2 Yogurt
Dairy 3 IceCream', header = TRUE)
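The second part of the question, a function that takes a FoodName and returns the matching rows of food.dat, is not covered by the code above. One possible base R sketch follows. The function name `rows_with_food` and its arguments are hypothetical, and it assumes every FoodItem in the lookup table is also a column of the data.

```r
# Example data rebuilt here so the snippet runs on its own.
food.dat <- data.frame(
  Fruit     = c(1L, 3L, 3L, 2L, 1L),
  Vegetable = c(2L, 2L, 2L, 2L, 1L),
  Meat      = c(2L, 1L, 2L, 1L, 1L),
  Dairy     = c(3L, 1L, 2L, 1L, 2L)
)
food.lookup <- data.frame(
  FoodItem = c("Fruit", "Fruit", "Fruit", "Vegetable", "Vegetable",
               "Meat", "Meat", "Dairy", "Dairy", "Dairy"),
  Number   = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L),
  FoodName = c("Banana", "Apple", "Mango", "Carrot", "Broccoli",
               "Chicken", "Fish", "Cheese", "Yogurt", "IceCream"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows of `dat` that contain `food_name`.
# Assumes each FoodItem in `lookup` names a column of `dat`.
rows_with_food <- function(food_name, dat = food.dat, lookup = food.lookup) {
  # Which (column, code) pairs stand for this food name?
  hits <- lookup[lookup$FoodName == food_name, , drop = FALSE]
  keep <- rep(FALSE, nrow(dat))
  for (i in seq_len(nrow(hits))) {
    # %in% treats NA codes as "no match" instead of propagating NA
    keep <- keep | (dat[[hits$FoodItem[i]]] %in% hits$Number[i])
  }
  dat[keep, , drop = FALSE]
}

rows_with_food("Mango")   # rows where Fruit is coded 3
rows_with_food("Cheese")  # rows where Dairy is coded 1
```

The function returns the rows with their original numeric codes; a name that is not in the lookup table gives a data frame with zero rows.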