How to iterate rows in R and apply if conditions
I have a tidy dataset in which some of the columns look like this:
  my_col_a my_col_b my_col_c
     (chr)    (chr)    (chr)
1    happy      sad        -
2      sad        -  defiant
3    happy        -        -
4        -        -        -
5  excited        -        -
How can I count, for each row, how many words differ from "-" in R? The desired output would be:
  my_col_a my_col_b my_col_c nmr_moods
     (chr)    (chr)    (chr)  (double)
1    happy      sad        -         2
2      sad        -  defiant         2
3    happy        -        -         1
4        -        -        -         0
5  excited        -        -         1
Base R
df <- structure(list(
  my_col_a = c("happy", "sad", "happy", "-", "excited"),
  my_col_b = c("sad", "-", "-", "-", "-"),
  my_col_c = c("-", "defiant", "-", "-", "-")
), class = "data.frame", row.names = c(NA, -5L))

df$nmr_moods <- apply(df, 1, function(x) sum(!grepl("^-$", x)))
df
#>   my_col_a my_col_b my_col_c nmr_moods
#> 1    happy      sad        -         2
#> 2      sad        -  defiant         2
#> 3    happy        -        -         1
#> 4        -        -        -         0
#> 5  excited        -        -         1
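Since the title asks about iterating over rows with an if condition, the same count can also be written as an explicit loop. This is only an illustrative sketch: the names mood_cols and n_moods are made up here, and the vectorised versions in this thread are usually the more idiomatic choice in R.

```r
# Rebuild the example data (same values as in the question).
df <- data.frame(
  my_col_a = c("happy", "sad", "happy", "-", "excited"),
  my_col_b = c("sad", "-", "-", "-", "-"),
  my_col_c = c("-", "defiant", "-", "-", "-"),
  stringsAsFactors = FALSE
)

mood_cols <- c("my_col_a", "my_col_b", "my_col_c")  # hypothetical helper name
n_moods <- integer(nrow(df))                        # one counter per row

for (i in seq_len(nrow(df))) {
  for (col in mood_cols) {
    # The "if condition": count the cell only when it is not the placeholder.
    if (df[i, col] != "-") {
      n_moods[i] <- n_moods[i] + 1L
    }
  }
}

df$nmr_moods <- n_moods
```

The loop visits every cell once, which makes the logic easy to follow but is slower than apply() or rowSums() on large data.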
Or, starting from the original three-column df:
df$nmr_moods <- rowSums(df != "-")
  my_col_a my_col_b my_col_c nmr_moods
1    happy      sad        -         2
2      sad        -  defiant         2
3    happy        -        -         1
4        -        -        -         0
5  excited        -        -         1
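One caveat when comparing the whole data frame with rowSums(df != "-"): if df already contains nmr_moods (or any other non-mood column), that column is compared with "-" too and inflates the count. A minimal sketch that restricts the comparison to the mood columns and skips NA cells; the NA value and the name mood_cols are assumptions added for illustration, not part of the original data.

```r
df <- data.frame(
  my_col_a = c("happy", "sad", "happy", "-", "excited"),
  my_col_b = c("sad", "-", "-", "-", NA),   # NA added here only to show na.rm
  my_col_c = c("-", "defiant", "-", "-", "-"),
  stringsAsFactors = FALSE
)

mood_cols <- c("my_col_a", "my_col_b", "my_col_c")  # hypothetical helper name

# Compare only the mood columns; NA cells are dropped from the sum.
df$nmr_moods <- rowSums(df[mood_cols] != "-", na.rm = TRUE)

# Re-running the line is safe because nmr_moods itself is never compared.
df$nmr_moods <- rowSums(df[mood_cols] != "-", na.rm = TRUE)
```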
tidyverse
Created on 2021-04-27 by the reprex package (v2.0.0)
library(tidyverse)
df %>%
  mutate(nmr_moods = rowSums(across(everything(), ~ !grepl("^-$", .x))))
#>   my_col_a my_col_b my_col_c nmr_moods
#> 1    happy      sad        -         2
#> 2      sad        -  defiant         2
#> 3    happy        -        -         1
#> 4        -        -        -         0
#> 5  excited        -        -         1
Using dplyr:
library(dplyr)
df <- data.frame(
  my_col_a = c("happy", "sad", "happy", "-", "excited"),
  my_col_b = c("sad", rep("-", 4)),
  my_col_c = c("-", "defiant", rep("-", 3)))

df %>%
  rowwise() %>%
  mutate(my_col_d = sum(c_across(my_col_a:my_col_c) != "-")) %>%
  ungroup()
# A tibble: 5 x 4
  my_col_a my_col_b my_col_c my_col_d
  <chr>    <chr>    <chr>       <int>
1 happy    sad      -               2
2 sad      -        defiant         2
3 happy    -        -               1
4 -        -        -               0
5 excited  -        -               1
Here is another tidyverse solution for your purpose:
library(stringr)
library(purrr)
df %>%
  mutate(my_count = pmap_dbl(.,
                             ~ sum(str_detect(c(...), "-", negate = TRUE))))
  my_col_a my_col_b my_col_c my_count
1    happy      sad        -        2
2      sad        -  defiant        2
3    happy        -        -        1
4        -        -        -        0
5  excited        -        -        1
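A note on the pattern: str_detect() treats "-" as a regular expression that can match anywhere in a string, so a hyphenated word would be counted as missing. The sketch below uses a made-up value, "well-being", which does not appear in the question's data, to show how an anchored pattern differs.

```r
library(stringr)

moods <- c("happy", "well-being", "-")  # "well-being" is a hypothetical value

# Unanchored: every string that contains a hyphen is excluded.
unanchored <- sum(str_detect(moods, "-", negate = TRUE))

# Anchored: only a cell that is exactly "-" is excluded.
anchored <- sum(str_detect(moods, "^-$", negate = TRUE))

unanchored
#> [1] 1
anchored
#> [1] 2
```

For the data in the question both patterns give the same counts, since no mood contains a hyphen.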
A one-line solution with apply and str_count:
library(stringr)
df$count <- apply(df, 1, function(x) sum(str_count(x, "^[A-Za-z]+$")))
Result:
df
  my_col_a my_col_b my_col_c count
1    happy      sad        -     2
2      sad        -  defiant     2
3    happy        -        -     1
4        -        -        -     0
5  excited        -        -     1
Data: from @Claudio