Adding columns to a data frame based on the values in an existing column in R
I am working in RStudio and have a data frame similar to the following:
Favorite<-c("Apple","Lemon","Orange","Salat","Onion", "Apple","Strawberry","Celery","Blueberry","Sweetpotatoes","Strawberry",
"Oragne","Celery","Sweetpotatoes","Onion","Blueberry","Strawberry","Salad")
PersonID<-c(67,82,67,21,02,12,90,23,65,32,44,67,56,77,30,198,20,99)
all_Data<-data.frame(PersonID,Favorite)
> head(all_Data)
PersonID Favorite
1 67 Apple
2 82 Lemon
3 67 Orange
4 21 Salat
5 2 Onion
6 12 Apple
I want to add 3 more columns, which should contain the following:
If a row in all_Data$Favorite is Apple or Blueberry, then all_Data$Country = Ireland, all_Data$Continent = Europe and all_Data$city = Belfast
If a row in all_Data$Favorite is Strawberry, then all_Data$Country = Holland, all_Data$Continent = Europe and all_Data$city = Emmen
If a row in all_Data$Favorite is Lemon or Orange, then all_Data$Country = France, all_Data$Continent = Europe and all_Data$city = Menton
If a row in all_Data$Favorite is Salad or Onion, then all_Data$Country = Sweden, all_Data$Continent = Europe and all_Data$city = Malmoe
If a row in all_Data$Favorite is Sweetpotatoes, then all_Data$Country = USA, all_Data$Continent = America and all_Data$city = Verona
If a row in all_Data$Favorite is Celery, then all_Data$Country = Germany, all_Data$Continent = Europe and all_Data$city = Berlin
library(tidyverse)
all_Data |>
  mutate(ctry_cont = case_when(
    str_detect(Favorite, "Appl|Blueb") ~ "Ireland|Europe",
    str_detect(Favorite, "Straw") ~ "Brazillian|South's of America",
    str_detect(Favorite, "Lemon|Orang") ~ "France|Europe",
    str_detect(Favorite, "Salad|Onion") ~ "Sweden|Europe",
    str_detect(Favorite, "Sweetpot") ~ "United of state|America",
    str_detect(Favorite, "Celery") ~ "Germany|Europe",
    TRUE ~ "Other|Other"
  )) |>
  separate(ctry_cont, c("country", "continent"))
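After running the code above, I get the following warning and output, in which the multi-word values are cut in half (see rows 7, 10, 11, 14 and 17). I also added a word with an apostrophe, because my original data contains words with apostrophes, but it does not show up either: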
PersonID Favorite country continent
1 67 Apple Ireland Europe
2 82 Lemon France Europe
3 67 Orange France Europe
4 21 Salat Other Other
5 2 Onion Sweden Europe
6 12 Apple Ireland Europe
7 90 Strawberry Brazillian South
8 23 Celery Germany Europe
9 65 Blueberry Ireland Europe
10 32 Sweetpotatoes United of
11 44 Strawberry Brazillian South
12 67 Oragne Other Other
13 56 Celery Germany Europe
14 77 Sweetpotatoes United of
15 30 Onion Sweden Europe
16 198 Blueberry Ireland Europe
17 20 Strawberry Brazillian South
18 99 Salad Sweden Europe
Warning message:
Expected 2 pieces. Additional pieces discarded in 5 rows [7, 10, 11, 14, 17].
I also tried adding sep="" to the last step of the code, but that gives an error:
separate(ctry_cont, c("country", "continent"), sep="")
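For context, the truncated values above are consistent with how tidyr's separate() treats sep: its default is the regular expression "[^[:alnum:]]+", so spaces and apostrophes act as separators as well as the pipe. The following is only a minimal sketch on a hypothetical two-row data frame (df is made up here; its labels are copied from the question) showing the default split next to a split on an escaped pipe:

```r
library(tidyr)

# Hypothetical two-row sample; the labels are copied from the question
df <- data.frame(
  ctry_cont = c("Brazillian|South's of America", "United of state|America")
)

# Default sep is the regex "[^[:alnum:]]+": the pipe, the spaces and the
# apostrophe all split the string, and pieces beyond the second are dropped
# (with the "Additional pieces discarded" warning, silenced here).
default_split <- suppressWarnings(
  separate(df, ctry_cont, c("country", "continent"))
)

# sep is interpreted as a regex, where a bare "|" means alternation,
# so the pipe is escaped to make it the only separator.
pipe_split <- separate(df, ctry_cont, c("country", "continent"), sep = "\\|")

print(default_split)
print(pipe_split)
```

With the escaped pipe as separator, spaces and apostrophes inside a label are left alone, so the labels in case_when() would not need to change.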
You could do it like this...
Favorite <- c(
  "Apple",
  "Lemon",
  "Orange",
  "Salad",
  "Onion",
  "Apple",
  "Strawberry",
  "Celery",
  "Blueberry",
  "Sweetpotatoes",
  "Strawberry",
  "Orange",
  "Celery",
  "Sweetpotatoes",
  "Onion",
  "Blueberry",
  "Strawberry",
  "Salad"
)
PersonID <-
  c(67, 82, 67, 21, 02, 12, 90, 23, 65, 32, 44, 67, 56, 77, 30, 198, 20, 99)
all_Data <- data.frame(PersonID, Favorite)
library(tidyverse)
all_Data |>
  mutate(ctry_cont = case_when(
    str_detect(Favorite, "Appl|Blueb") ~ "Ireland, Europe",
    str_detect(Favorite, "Straw") ~ "Holland, Europe",
    str_detect(Favorite, "Lemon|Orang") ~ "France, Europe",
    str_detect(Favorite, "Salad|Onion") ~ "Sweden, Europe",
    str_detect(Favorite, "Sweetpot") ~ "United States, North America",
    str_detect(Favorite, "Celery") ~ "Germany, Europe",
    TRUE ~ "Other, Other"
  )) |>
  separate(ctry_cont, c("country", "continent"), sep = ", ")
#> PersonID Favorite country continent
#> 1 67 Apple Ireland Europe
#> 2 82 Lemon France Europe
#> 3 67 Orange France Europe
#> 4 21 Salad Sweden Europe
#> 5 2 Onion Sweden Europe
#> 6 12 Apple Ireland Europe
#> 7 90 Strawberry Holland Europe
#> 8 23 Celery Germany Europe
#> 9 65 Blueberry Ireland Europe
#> 10 32 Sweetpotatoes United States North America
#> 11 44 Strawberry Holland Europe
#> 12 67 Orange France Europe
#> 13 56 Celery Germany Europe
#> 14 77 Sweetpotatoes United States North America
#> 15 30 Onion Sweden Europe
#> 16 198 Blueberry Ireland Europe
#> 17 20 Strawberry Holland Europe
#> 18 99 Salad Sweden Europe
Created on 2022-04-22 by the reprex package (v2.0.1)
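The question also asks for a city column, which the code above does not produce. One possible alternative, sketched here under assumptions, is to put the rules into a small lookup table and join it with dplyr::left_join(). This is a different technique from the case_when() and separate() approach: it matches whole values exactly rather than by pattern. The lookup data frame, the shortened sample data and the result name are illustrative; the country, continent and city values are taken from the rules in the question.

```r
library(dplyr)

# Shortened sample of the question's data, including the misspelled "Oragne"
Favorite <- c("Apple", "Lemon", "Salad", "Sweetpotatoes", "Oragne")
PersonID <- c(67, 82, 99, 32, 67)
all_Data <- data.frame(PersonID, Favorite)

# Hypothetical lookup table: one row per Favorite value,
# filled in from the rules listed in the question
lookup <- data.frame(
  Favorite  = c("Apple", "Blueberry", "Strawberry", "Lemon", "Orange",
                "Salad", "Onion", "Sweetpotatoes", "Celery"),
  Country   = c("Ireland", "Ireland", "Holland", "France", "France",
                "Sweden", "Sweden", "USA", "Germany"),
  Continent = c("Europe", "Europe", "Europe", "Europe", "Europe",
                "Europe", "Europe", "America", "Europe"),
  city      = c("Belfast", "Belfast", "Emmen", "Menton", "Menton",
                "Malmoe", "Malmoe", "Verona", "Berlin")
)

# left_join() keeps every row of all_Data; values without an exact match
# in the lookup table (here "Oragne") get NA in the three new columns.
result <- left_join(all_Data, lookup, by = "Favorite")
print(result)
```

Because the join is exact, misspellings end up as NA instead of being caught by a partial pattern, which makes them easy to spot with is.na(result$Country).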