Create new rows that are conditional on earlier rows

I want to create output_df, which adds two rows at the bottom of original_df.

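The reason for the new rows:

In original_df, the percentages within a group do not sum to 1 (e.g., sum(original_df$percentage[original_df$x == "Y" & original_df$y == "A"]) != 1), which is what I want to fix.

Adding an extra row per group (e.g., output_df[5,]) brings the group total to 1 (e.g., sum(output_df$percentage[output_df$x == "Y" & output_df$y == "A"]) == 1).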

original_df <- data.frame(x = c("Y","N","Y","N"),
                          y = c("A","B","A","B"),
                          z = c("a","b","c","d"),
                          percentage = c(0.1, 0.2, 0.5, 0.5)) 

output_df <- data.frame(x = c("Y","N","Y","N","Y","N"),
                        y = c("A","B","A","B","A","B"),
                        z = c("a","b","c","d",NA , NA),
                        percentage = c(0.1, 0.2, 0.5, 0.5, 0.4, 0.3)) 
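
To make the gap concrete, the per-group totals can be checked directly. This is a minimal base R sketch, assuming the groups are defined by the x and y columns; group_totals is a hypothetical helper name, not something from the question.

```r
original_df <- data.frame(x = c("Y","N","Y","N"),
                          y = c("A","B","A","B"),
                          z = c("a","b","c","d"),
                          percentage = c(0.1, 0.2, 0.5, 0.5))

output_df <- data.frame(x = c("Y","N","Y","N","Y","N"),
                        y = c("A","B","A","B","A","B"),
                        z = c("a","b","c","d",NA , NA),
                        percentage = c(0.1, 0.2, 0.5, 0.5, 0.4, 0.3))

# Hypothetical helper: total percentage per (x, y) group
group_totals <- function(df) {
  aggregate(percentage ~ x + y, data = df, FUN = sum)
}

group_totals(original_df)  # both group totals fall short of 1
group_totals(output_df)    # both group totals are 1, up to floating-point rounding
```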

For context, my intention is to use output_df as input to the sample function, with replace = TRUE.
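
The question does not show the sample call itself, so the following is only a guess at what it might look like: drawing z values within a single (x, y) group, weighted by percentage. The group choice, the draw size of 10, and the seed are all assumptions made for illustration.

```r
output_df <- data.frame(x = c("Y","N","Y","N","Y","N"),
                        y = c("A","B","A","B","A","B"),
                        z = c("a","b","c","d",NA , NA),
                        percentage = c(0.1, 0.2, 0.5, 0.5, 0.4, 0.3))

# Assumption: sampling happens within one (x, y) group at a time
grp <- output_df[output_df$x == "Y" & output_df$y == "A", ]

set.seed(1)  # only so the draw is reproducible
# NA stands for the added "remainder" row and is drawn with its own weight
draws <- sample(grp$z, size = 10, replace = TRUE, prob = grp$percentage)
draws
```

sample rescales prob internally, so the call also runs on original_df, but there the weights would no longer mean what the percentage column says. That is presumably why the remainder rows matter.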

Here is one way to do it with dplyr:

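library(dplyr)

original_df %>%
  group_by(x, y) %>%
  # append one empty row per group, carrying over the group's x and y
  do(add_row(., x = .$x[1], y = .$y[1])) %>%
  # fill the new row's NA percentage with the remainder needed to reach 1
  mutate(across(percentage, ~ifelse(is.na(.), 1 - sum(., na.rm = TRUE), .))) %>%
  ungroup()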
  
#> # A tibble: 6 × 4
#>   x     y     z     percentage
#>   <chr> <chr> <chr>      <dbl>
#> 1 N     B     b            0.2
#> 2 N     B     d            0.5
#> 3 N     B     <NA>         0.3
#> 4 Y     A     a            0.1
#> 5 Y     A     c            0.5
#> 6 Y     A     <NA>         0.4
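
The dplyr result above interleaves the new rows with their groups. If the new rows should sit at the bottom, as in the question's output_df, a base R variant might look like the sketch below. It is an alternative and not part of the answer above; remainder and appended_df are illustrative names.

```r
original_df <- data.frame(x = c("Y","N","Y","N"),
                          y = c("A","B","A","B"),
                          z = c("a","b","c","d"),
                          percentage = c(0.1, 0.2, 0.5, 0.5))

# Remaining share per (x, y) group: 1 minus the group's current total
remainder <- aggregate(percentage ~ x + y, data = original_df,
                       FUN = function(p) 1 - sum(p))
remainder$z <- NA_character_  # the new rows have no z value

# Append the remainder rows below the original rows, columns in the same order
appended_df <- rbind(original_df, remainder[names(original_df)])
appended_df
```

The appended rows follow the group order produced by aggregate, which may differ from the row order in output_df.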