How to iterate over rows in R and apply an if condition

Question

I have a tidy dataset in which some of the columns look like this:

  my_col_a      my_col_b     my_col_c 
    (chr)        (chr)       (chr) 
1   happy         sad          -     
2    sad           -        defiant    
3   happy          -           -     
4     -            -           -     
5   excited        -           -

How can I count, in R, how many words in each row are different from "-"? The desired output would be:

 my_col_a      my_col_b     my_col_c  nmr_moods
    (chr)        (chr)       (chr)    (double)
1   happy         sad          -         2
2    sad           -        defiant      2
3   happy          -           -         1
4     -            -           -         0
5   excited        -           -         1

Answer 1

Base R

df <- structure(list(
  my_col_a = c("happy", "sad", "happy", "-", "excited"),
  my_col_b = c("sad", "-", "-", "-", "-"),
  my_col_c = c("-", "defiant", "-", "-", "-")
), class = "data.frame", row.names = c(NA, -5L))

df$nmr_moods <- apply(df, 1, function(x) sum(!grepl("^-$", x)))
df
#>   my_col_a my_col_b my_col_c nmr_moods
#> 1    happy      sad        -         2
#> 2      sad        -  defiant         2
#> 3    happy        -        -         1
#> 4        -        -        -         0
#> 5  excited        -        -         1

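Or, vectorized with rowSums, restricting the comparison to the three mood columns so that an already existing nmr_moods column is not counted:

df$nmr_moods <- rowSums(df[c("my_col_a", "my_col_b", "my_col_c")] != "-")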

  my_col_a my_col_b my_col_c nmr_moods
1    happy      sad        -         2
2      sad        -  defiant         2
3    happy        -        -         1
4        -        -        -         0
5  excited        -        -         1
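
For comparison with the title of the question, the same count can be written as a literal loop with an if condition. This is only a minimal sketch: the names moods and n_moods are hypothetical, and the data frame is a standalone copy of the example data. The vectorized versions above are usually preferable in R.

```r
# Hypothetical standalone copy of the example data (same values as above).
moods <- data.frame(
  my_col_a = c("happy", "sad", "happy", "-", "excited"),
  my_col_b = c("sad", "-", "-", "-", "-"),
  my_col_c = c("-", "defiant", "-", "-", "-"),
  stringsAsFactors = FALSE
)

# Visit every row, test every cell with `if`, and tally the cells that are not "-".
n_moods <- numeric(nrow(moods))
for (i in seq_len(nrow(moods))) {
  for (j in seq_len(ncol(moods))) {
    if (moods[i, j] != "-") {
      n_moods[i] <- n_moods[i] + 1
    }
  }
}
n_moods
#> [1] 2 2 1 0 1
```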

Created on 2021-04-27 by the reprex package (v2.0.0)

tidyverse

library(tidyverse)
# Select only the mood columns; everything() would also pick up an existing nmr_moods column
df %>% 
  mutate(nmr_moods = rowSums(across(my_col_a:my_col_c, ~ !grepl("^-$", .x))))
#>   my_col_a my_col_b my_col_c nmr_moods
#> 1    happy      sad        -         2
#> 2      sad        -  defiant         2
#> 3    happy        -        -         1
#> 4        -        -        -         0
#> 5  excited        -        -         1


Answer 2

Using dplyr:

library(dplyr)

df <- data.frame(
  my_col_a = c("happy", "sad", "happy", "-", "excited"),
  my_col_b = c("sad", rep("-", 4)),
  my_col_c = c("-", "defiant", rep("-", 3)))

df %>%
  rowwise() %>%
  mutate(my_col_d = sum(c_across(my_col_a:my_col_c) != "-")) %>%
  ungroup()

# A tibble: 5 x 4
  my_col_a my_col_b my_col_c my_col_d
  <chr>    <chr>    <chr>       <int>
1 happy    sad      -               2
2 sad      -        defiant         2
3 happy    -        -               1
4 -        -        -               0
5 excited  -        -               1

Answer 3

Here is another tidyverse solution for your purpose:

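library(dplyr)    # provides %>% and mutate()
library(stringr)
library(purrr)

# "^-$" matches only cells that are exactly "-", so hyphenated words still count
df %>%
  mutate(my_count = pmap_dbl(., 
                             ~ sum(str_detect(c(...), "^-$", negate = TRUE))))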


  my_col_a my_col_b my_col_c my_count
1    happy      sad        -        2
2      sad        -  defiant        2
3    happy        -        -        1
4        -        -        -        0
5  excited        -        -        1
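
The choice of pattern matters once a value contains a hyphen of its own. The sketch below uses base R's grepl to show the difference between matching a hyphen anywhere in the string and matching a cell that is exactly "-". The value "well-being" is a hypothetical example and is not part of the original data.

```r
# "well-being" is a hypothetical value, not part of the original data.
x <- c("happy", "-", "well-being")

# Hyphen anywhere in the string: "well-being" is excluded as well.
anywhere <- !grepl("-", x, fixed = TRUE)
anywhere
#> [1]  TRUE FALSE FALSE

# Anchored pattern: only the exact placeholder "-" is excluded.
anchored <- !grepl("^-$", x)
anchored
#> [1]  TRUE FALSE  TRUE

# A plain comparison gives the same result as the anchored regex.
compared <- x != "-"
compared
#> [1]  TRUE FALSE  TRUE
```

For the example data in the question both patterns give the same counts, because no mood contains a hyphen.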

Answer 4

A one-line solution with apply and str_count:

library(stringr)
df$count <- apply(df, 1, function(x) sum(str_count(x, "^[A-Za-z]+$")))

Result:

df
  my_col_a my_col_b my_col_c count
1    happy      sad        -     2
2      sad        -  defiant     2
3    happy        -        -     1
4        -        -        -     0
5  excited        -        -     1

Data: taken from @Claudio's answer
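
As a quick sanity check, the base R variants from the answers can be compared on the same data. This is a sketch under stated assumptions: moods is a hypothetical standalone copy of the example data, and the letters-only count uses base R's grepl as a stand-in for stringr's str_count so that the snippet needs no packages.

```r
# Hypothetical standalone copy of the example data.
moods <- data.frame(
  my_col_a = c("happy", "sad", "happy", "-", "excited"),
  my_col_b = c("sad", "-", "-", "-", "-"),
  my_col_c = c("-", "defiant", "-", "-", "-"),
  stringsAsFactors = FALSE
)

# Row-wise apply with an anchored regex.
by_apply <- apply(moods, 1, function(x) sum(!grepl("^-$", x)))

# Vectorized comparison of the whole data frame.
by_rowsums <- rowSums(moods != "-")

# Count cells made up of letters only (base R stand-in for the str_count version).
by_letters <- apply(moods, 1, function(x) sum(grepl("^[A-Za-z]+$", x)))

# All three approaches agree on this data.
all(by_apply == by_rowsums, by_apply == by_letters)
#> [1] TRUE
```

The letters-only pattern would diverge from the other two on values containing digits, spaces, or hyphens, so the agreement shown here holds for this data only.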

