Change value in a column if a criterion is met and use value from another column

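If the value in the Expenses column is zero or negative, I want to replace it with the value from the Average Expenses column.

Here is my data: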

structure(list(Product = c("A", "A", "A", "B", "B", "B", "C", 
"C", "C"), Date = c("09.12.2019", "10.12.2019", "11.12.2019", 
"09.12.2019", "10.12.2019", "11.12.2019", "09.12.2019", "10.12.2019", 
"11.12.2019"), Expenses = c(0.2, 0.2, 0.3, -0.03, 0, 0.3, 0, 
0.1, 0.1), `Average Expenses` = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 
0.2, 0.2, 0.2)), class = "data.frame", row.names = c(NA, -9L))

I tried the following:

df <- Data %>%
  mutate(Expenses = replace(Expenses,
                            Expenses<=0, `Average Expenses` 
  ))

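It works, but I get a warning message: "Warning message: In x[list] <- values : number of items to replace is not a multiple of replacement length".

Can I ignore it? Or why is R warning me? I cannot see anything wrong in the output.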

Data %>%
  mutate(Expenses = ifelse(Expenses <= 0, `Average Expenses`, Expenses))
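
As a rough illustration of why the ifelse() version raises no warning: ifelse() chooses position by position between its second and third arguments, so every row with Expenses <= 0 gets the Average Expenses value from the same row. The sketch below uses plain vectors copied from the question's data; the names `expenses`, `average_expenses`, `via_ifelse`, `via_index` and `needs_fix` are made up for this example, and the logical-index form is only one possible base R equivalent.

```r
# Vectors copied from the question's data (variable names are illustrative)
expenses <- c(0.2, 0.2, 0.3, -0.03, 0, 0.3, 0, 0.1, 0.1)
average_expenses <- rep(0.2, 9)

# ifelse() picks, position by position, from the second or third argument
via_ifelse <- ifelse(expenses <= 0, average_expenses, expenses)

# Base R alternative: assign only into the positions that meet the condition,
# taking the replacement values from those same positions
via_index <- expenses
needs_fix <- expenses <= 0
via_index[needs_fix] <- average_expenses[needs_fix]

print(via_ifelse)
print(isTRUE(all.equal(via_ifelse, via_index)))
```

Both forms keep the replacement values aligned with the rows they belong to, which is the property the unindexed replace() call lacks.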

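You need to index the replacement values inside the replace() call, so that only the values from the rows where the condition is met are supplied. If you leave the index off, you get the warning because replace() is handed one replacement value for every row, rather than only for the rows that meet the Expenses <= 0 criterion.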

replace(Expenses, Expenses<=0, `Average Expenses`[Expenses <= 0])
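
On whether the warning can be ignored: in the question's data it happens to be harmless, because every Average Expenses value is 0.2. In general it probably should not be ignored, since the unindexed call fills the matching rows with the first few replacement values rather than the values from the same rows. A small sketch with hypothetical data (the differing averages and all variable names are assumptions made for illustration):

```r
# Hypothetical data: the averages differ per row, so misalignment becomes visible
expenses <- c(0.2, -0.03, 0, 0.3)
average_expenses <- c(0.1, 0.2, 0.3, 0.4)
needs_fix <- expenses <= 0   # TRUE for rows 2 and 3

# Unindexed: 4 replacement values for 2 slots. R uses the first two (0.1, 0.2)
# and emits the "not a multiple of replacement length" warning
# (suppressed here only to keep the script quiet).
misaligned <- suppressWarnings(replace(expenses, needs_fix, average_expenses))

# Indexed: exactly one replacement value per slot, taken from the same rows
aligned <- replace(expenses, needs_fix, average_expenses[needs_fix])

print(misaligned)  # 0.2 0.1 0.2 0.3
print(aligned)     # 0.2 0.2 0.3 0.3
```

Row 2 should receive 0.2 and row 3 should receive 0.3; only the indexed call does that.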

A case_when option:

library(dplyr)
# Use <= so that zeros are replaced as well as negative values
df %>%
  mutate(Expenses = case_when(Expenses <= 0 ~ `Average Expenses`,
                              TRUE ~ Expenses))

Output:

  Product       Date Expenses Average Expenses
1       A 09.12.2019      0.2              0.2
2       A 10.12.2019      0.2              0.2
3       A 11.12.2019      0.3              0.2
4       B 09.12.2019      0.2              0.2
5       B 10.12.2019      0.2              0.2
6       B 11.12.2019      0.3              0.2
7       C 09.12.2019      0.2              0.2
8       C 10.12.2019      0.1              0.2
9       C 11.12.2019      0.1              0.2

In data.table:

library(data.table)
setDT(df)
# Use <= so that zeros are replaced as well as negative values
df[, Expenses := ifelse(Expenses <= 0, `Average Expenses`, Expenses)]

Output:

   Product       Date Expenses Average Expenses
1:       A 09.12.2019      0.2              0.2
2:       A 10.12.2019      0.2              0.2
3:       A 11.12.2019      0.3              0.2
4:       B 09.12.2019      0.2              0.2
5:       B 10.12.2019      0.2              0.2
6:       B 11.12.2019      0.3              0.2
7:       C 09.12.2019      0.2              0.2
8:       C 10.12.2019      0.1              0.2
9:       C 11.12.2019      0.1              0.2

Another data.table option, which subsets the rows to change and assigns by reference:

df <- structure(list(
  Product = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Date = c("09.12.2019", "10.12.2019", "11.12.2019",
           "09.12.2019", "10.12.2019", "11.12.2019",
           "09.12.2019", "10.12.2019", "11.12.2019"),
  Expenses = c(0.2, 0.2, 0.3, -0.03, 0, 0.3, 0, 0.1, 0.1),
  `Average Expenses` = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2)
), class = "data.frame", row.names = c(NA, -9L))

library(data.table)
setDT(df)[Expenses <= 0, Expenses := `Average Expenses`][]
#>    Product       Date Expenses Average Expenses
#> 1:       A 09.12.2019      0.2              0.2
#> 2:       A 10.12.2019      0.2              0.2
#> 3:       A 11.12.2019      0.3              0.2
#> 4:       B 09.12.2019      0.2              0.2
#> 5:       B 10.12.2019      0.2              0.2
#> 6:       B 11.12.2019      0.3              0.2
#> 7:       C 09.12.2019      0.2              0.2
#> 8:       C 10.12.2019      0.1              0.2
#> 9:       C 11.12.2019      0.1              0.2
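
One thing to keep in mind with this approach: setDT() converts df in place and := modifies it by reference, so the original values are overwritten. If the unmodified data frame is still needed, a possible variation is to work on a copy made with as.data.table(). The sketch below assumes the data.table package is installed and uses a shortened, illustrative version of the data; the name `dt` is made up for this example.

```r
library(data.table)

# Shortened illustrative data; check.names = FALSE keeps the space in the column name
df <- data.frame(Expenses = c(0.2, -0.03, 0),
                 `Average Expenses` = c(0.2, 0.2, 0.2),
                 check.names = FALSE)

# as.data.table() returns a copy, so df itself is left untouched
dt <- as.data.table(df)

# Filter rows in i, then assign by reference in j (only dt is changed)
dt[Expenses <= 0, Expenses := `Average Expenses`]

print(df$Expenses)  # original values, including -0.03 and 0
print(dt$Expenses)  # zero and negative values replaced by the average
```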

Created on 2022-05-27 by the reprex package (v2.0.1)