Change the value in a column if a criterion is met, using the value from another column
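If the value in the Expenses column is zero or negative, I want to replace it with the value from the Average Expenses column.
Here is my data: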
structure(list(Product = c("A", "A", "A", "B", "B", "B", "C",
"C", "C"), Date = c("09.12.2019", "10.12.2019", "11.12.2019",
"09.12.2019", "10.12.2019", "11.12.2019", "09.12.2019", "10.12.2019",
"11.12.2019"), Expenses = c(0.2, 0.2, 0.3, -0.03, 0, 0.3, 0,
0.1, 0.1), `Average Expenses` = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2)), class = "data.frame", row.names = c(NA, -9L))
I tried the following:
df <- Data %>%
mutate(Expenses = replace(Expenses,
Expenses<=0, `Average Expenses`
))
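It works, but I get a warning message:
Warning message:
In x[list] <- values :
  number of items to replace is not a multiple of replacement length
Can I ignore it? If not, why is R warning me? I can't see anything wrong in the output.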
Data %>%
mutate(Expenses = ifelse(Expenses <= 0, `Average Expenses`, Expenses))
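You need to index the replacement values in the replace call, so that it only receives the values for the rows where the condition is met. If you leave the index out, replace is handed all nine values of the Average Expenses column for only the three entries that satisfy Expenses <= 0, and R warns about the length mismatch.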
replace(Expenses, Expenses<=0, `Average Expenses`[Expenses <= 0])
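The warning is harmless in the question only because every value of Average Expenses is the same. A minimal sketch with toy vectors (the names expenses and average and their values are made up for illustration, not taken from the question) suggests what can go wrong when the replacement column varies:

```r
# Hypothetical toy vectors, not the question's data
expenses <- c(1, -1, 0, 2)
average  <- c(10, 20, 30, 40)

low <- expenses <= 0   # FALSE TRUE TRUE FALSE

# Unindexed: two slots to fill but four replacement values. R warns and
# uses the FIRST two values (10, 20), not the ones from the matching rows.
unindexed <- suppressWarnings(replace(expenses, low, average))

# Indexed: the replacement is subset with the same condition, so each
# slot receives the value from its own row (20, 30) and no warning occurs.
indexed <- replace(expenses, low, average[low])

print(unindexed)  # 1 10 20 2
print(indexed)    # 1 20 30 2
```

So the warning should not be ignored in general: the unindexed call silently takes values from the wrong rows whenever the replacement column is not constant.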
A case_when option:
library(dplyr)
df %>%
  # <= 0 so that zero values are replaced as well as negative ones
  mutate(Expenses = case_when(Expenses <= 0 ~ `Average Expenses`,
                              TRUE ~ Expenses))
Output:
  Product       Date Expenses Average Expenses
1       A 09.12.2019      0.2              0.2
2       A 10.12.2019      0.2              0.2
3       A 11.12.2019      0.3              0.2
4       B 09.12.2019      0.2              0.2
5       B 10.12.2019      0.2              0.2
6       B 11.12.2019      0.3              0.2
7       C 09.12.2019      0.2              0.2
8       C 10.12.2019      0.1              0.2
9       C 11.12.2019      0.1              0.2
In data.table:
library(data.table)
setDT(df)  # convert df to a data.table by reference
# <= 0 so that zero values are replaced as well as negative ones
df[, Expenses := ifelse(Expenses <= 0, `Average Expenses`, Expenses)]
df
Output:
   Product       Date Expenses Average Expenses
1:       A 09.12.2019      0.2              0.2
2:       A 10.12.2019      0.2              0.2
3:       A 11.12.2019      0.3              0.2
4:       B 09.12.2019      0.2              0.2
5:       B 10.12.2019      0.2              0.2
6:       B 11.12.2019      0.3              0.2
7:       C 09.12.2019      0.2              0.2
8:       C 10.12.2019      0.1              0.2
9:       C 11.12.2019      0.1              0.2
df <- structure(list(Product = c("A", "A", "A", "B", "B", "B", "C",
"C", "C"), Date = c("09.12.2019", "10.12.2019", "11.12.2019",
"09.12.2019", "10.12.2019", "11.12.2019", "09.12.2019", "10.12.2019",
"11.12.2019"), Expenses = c(0.2, 0.2, 0.3, -0.03, 0, 0.3, 0,
0.1, 0.1), `Average Expenses` = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2)), class = "data.frame", row.names = c(NA, -9L))
library(data.table)
setDT(df)[Expenses <= 0, Expenses := `Average Expenses`][]
#>    Product       Date Expenses Average Expenses
#> 1:       A 09.12.2019      0.2              0.2
#> 2:       A 10.12.2019      0.2              0.2
#> 3:       A 11.12.2019      0.3              0.2
#> 4:       B 09.12.2019      0.2              0.2
#> 5:       B 10.12.2019      0.2              0.2
#> 6:       B 11.12.2019      0.3              0.2
#> 7:       C 09.12.2019      0.2              0.2
#> 8:       C 10.12.2019      0.1              0.2
#> 9:       C 11.12.2019      0.1              0.2
Created on 2022-05-27 by the reprex package (v2.0.1)
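For comparison, the same replacement can be sketched in base R without any packages. The snippet below rebuilds only the Product, Expenses and Average Expenses columns from the question (the Date column is omitted for brevity), and the names low, via_index and via_ifelse are introduced here purely for illustration. It checks that plain logical indexing and ifelse give the same result:

```r
# Expenses data from the question (Date column omitted)
df <- data.frame(
  Product = rep(c("A", "B", "C"), each = 3),
  Expenses = c(0.2, 0.2, 0.3, -0.03, 0, 0.3, 0, 0.1, 0.1),
  `Average Expenses` = 0.2,
  check.names = FALSE  # keep the space in the column name
)

low <- df$Expenses <= 0  # rows whose Expenses must be replaced

# Base R logical indexing: both sides are subset with the same condition
via_index <- df$Expenses
via_index[low] <- df$`Average Expenses`[low]

# Element-wise alternative, as in the ifelse answer
via_ifelse <- ifelse(low, df$`Average Expenses`, df$Expenses)

print(via_index)
print(identical(via_index, via_ifelse))
```

Subsetting both sides with the same condition is what keeps the lengths equal, which is why neither form triggers the warning from the question.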