Create a new variable based on two variables in my dataset in R
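I want to create a column in my dataset that is the difference between the positive and negative sentiment values in my total column.
So, for user Alex, the positive sentiment total is 80 and the negative sentiment total is 13, so the subtracted score would be 67.
The problem I am having is grouping the sentiment column in a way that lets me perform this operation.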
library(tidyverse)
# create mock dataframe
users <- c("Alex", "Alice", "Alexandra", "Andrew", "Alicia", "Alex", "Alice", "Alexandra", "Andrew", "Alicia")
sentiment <- c("positive", "negative", "positive","negative", "positive", "negative", "positive", "negative","positive", "negative")
total <- c(80, 70, 24, 74, 66, 13, 35, 94, 27, 94)
mockdataframe <- cbind(users,sentiment, total) %>% as_tibble()
mockdataframe$sentiment <- as.factor(mockdataframe$sentiment)
mockdataframe$total <- as.numeric(mockdataframe$total)
# using case_when() this way does not work
mockdataframe %>%
  mutate(Subtraction = case_when(
    sentiment == "positive" ~ (sentiment == "negative") / mockdataframe$total))
I am really struggling with this. Any help would be much appreciated.
Using tidyr::pivot_wider, you could do:
library(tidyverse)
mockdataframe %>%
  pivot_wider(names_from = sentiment, values_from = total) %>%
  mutate(Subtraction = positive - negative)
#> # A tibble: 5 × 4
#>   users     positive negative Subtraction
#>   <chr>        <dbl>    <dbl>       <dbl>
#> 1 Alex            80       13          67
#> 2 Alice           35       70         -35
#> 3 Alexandra       24       94         -70
#> 4 Andrew          27       74         -47
#> 5 Alicia          66       94         -28
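One thing worth checking with the pivot_wider approach is what happens when a user has only one kind of sentiment row: the missing column becomes NA, and so does the difference. The sketch below is only an illustration under assumptions: the data and the user "Bob" are hypothetical and not part of the question. It uses the `values_fill` argument of pivot_wider to treat a missing sentiment as 0.

```r
library(dplyr)
library(tidyr)

# Hypothetical data (not from the question): "Bob" has no negative row
scores <- tibble(
  users     = c("Alex", "Alex", "Bob"),
  sentiment = c("positive", "negative", "positive"),
  total     = c(80, 13, 40)
)

wide <- scores %>%
  # values_fill = 0 replaces the missing positive/negative cells with 0
  pivot_wider(names_from = sentiment, values_from = total, values_fill = 0) %>%
  mutate(Subtraction = positive - negative)

print(wide)
```

Whether 0 is the right stand-in for a missing sentiment depends on your data; leaving it as NA may be the safer choice if a missing row means "unknown" rather than "none".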
Or using group_by:
mockdataframe %>%
  group_by(users) %>%
  mutate(Subtraction = total[sentiment == "positive"] - total[sentiment == "negative"]) %>%
  ungroup()
#> # A tibble: 10 × 4
#>    users     sentiment total Subtraction
#>    <chr>     <fct>     <dbl>       <dbl>
#>  1 Alex      positive     80          67
#>  2 Alice     negative     70         -35
#>  3 Alexandra positive     24         -70
#>  4 Andrew    negative     74         -47
#>  5 Alicia    positive     66         -28
#>  6 Alex      negative     13          67
#>  7 Alice     positive     35         -35
#>  8 Alexandra negative     94         -70
#>  9 Andrew    positive     27         -47
#> 10 Alicia    negative     94         -28
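The group_by version works because, within each user's group, `total[sentiment == "positive"]` and `total[sentiment == "negative"]` each pick out exactly one value, which mutate then recycles across that user's rows. If you would rather have one row per user, or a user might have several rows (or none) for a sentiment, a variant with summarise and sum may be more forgiving. This is a sketch, not the only way to do it; the mock data is rebuilt here with tibble() (an assumption on my part, so that the block is self-contained and total stays numeric without the as.numeric() step).

```r
library(dplyr)

# Same values as the mock data in the question, built directly as a tibble
mockdataframe <- tibble(
  users     = rep(c("Alex", "Alice", "Alexandra", "Andrew", "Alicia"), times = 2),
  sentiment = rep(c("positive", "negative"), times = 5),
  total     = c(80, 70, 24, 74, 66, 13, 35, 94, 27, 94)
)

result <- mockdataframe %>%
  group_by(users) %>%
  summarise(
    # sum() returns 0 for an empty selection, so a missing sentiment does not error
    Subtraction = sum(total[sentiment == "positive"]) -
      sum(total[sentiment == "negative"]),
    .groups = "drop"
  )

print(result)
```

Note that summarise returns the users sorted by the grouping column rather than in their original order.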