Looping through a vector, creating a new variable which is the first vector minus 2 other vectors when none of them are NA

Question

Assume the following dataset:

Company  Sales  COGS  Staff
A        100      50     25
B        200      NA    100
C         NA      50     25
D         75      50     25
E        125     100     NA

I want to create a new variable called Profit, computed as Sales - COGS - Staff, provided none of these variables is NA. The desired output is:

Company  Sales  COGS  Staff  Profit
A        100      50     25      25
B        200      NA    100      NA
C         NA      50     25      NA
D         75      50     25       0
E        125     100     NA      NA

I started with something like this:

# Creating the profit column (should be unnecessary right?)
df$Profit <- NA
# For each row in the sales column/vector
for(i in df$Sales){
  # If all are not NA
  if(!is.na(df$Sales) & !is.na(df$COGS) & !is.na(df$Staff)){
    # Do calculation for profit
    df$Profit <- df$Sales - (df$COGS + df$Staff)
  # If calculation not possible
  } else {
    df$Profit <- NA
  }
}

This doesn't give an error, but it makes R go a bit haywire. Is there a more efficient way to do this?

Answer 1

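We use rowSums to create a logical index that checks whether a row has any NA in the selected columns of the dataset. For rows with no NA, we subtract the columns and assign the result to 'Profit'.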

i1 <- !rowSums(is.na(df1[-1]))
df1$Profit[i1] <- with(df1, (Sales-COGS-Staff)[i1])
df1
#  Company Sales COGS Staff Profit
#1       A   100   50    25     25
#2       B   200   NA   100     NA
#3       C    NA   50    25     NA
#4       D    75   50    25      0
#5       E   125  100    NA     NA

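Note: this is a general approach for excluding rows with NA, so the calculation runs only on a subset of the rows rather than on the whole dataset.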
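As a hedged sketch of the same "subset the complete rows" idea, the rowSums index can be compared with base R's complete.cases(), which is a swapped-in alternative and not part of the answer above. The data frame is rebuilt from the question's table, and the name cols is only illustrative.

```r
# Rebuild the example data from the question
df1 <- data.frame(
  Company = c("A", "B", "C", "D", "E"),
  Sales   = c(100, 200, NA, 75, 125),
  COGS    = c(50, NA, 50, 50, 100),
  Staff   = c(25, 100, 25, 25, NA)
)

# Columns that take part in the calculation (illustrative name)
cols <- c("Sales", "COGS", "Staff")

# Index from the answer: TRUE where a row has no NA in the chosen columns
i1 <- !rowSums(is.na(df1[cols]))

# Assumed alternative: complete.cases() gives the same logical index
ok <- complete.cases(df1[cols])
same_index <- identical(unname(i1), ok)

# Fill Profit only for the complete rows; the rest stay NA
df1$Profit <- NA_real_
df1$Profit[ok] <- with(df1[ok, ], Sales - COGS - Staff)
df1
```

Selecting columns by name rather than with df1[-1] keeps the index valid even after the Profit column has been added.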

However, any value minus NA returns NA, so using

df1$Profit <- with(df1, (Sales - COGS - Staff))

should work as well.

Or, if there are many columns, another option is

rowSums(df1[-1] * c(1, -1, -1)[col(df1[-1])])
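To make the sign-vector trick easier to follow, here is a minimal sketch that applies it to a numeric matrix and checks it against the direct subtraction. The names m, signs, via_rowsums and direct are illustrative, and converting to a matrix first is an assumption made here for clarity rather than something the answer requires.

```r
# Rebuild the example data from the question
df1 <- data.frame(
  Company = c("A", "B", "C", "D", "E"),
  Sales   = c(100, 200, NA, 75, 125),
  COGS    = c(50, NA, 50, 50, 100),
  Staff   = c(25, 100, 25, 25, NA)
)

# Numeric columns as a matrix; one sign per column (+Sales, -COGS, -Staff)
m <- as.matrix(df1[c("Sales", "COGS", "Staff")])
signs <- c(1, -1, -1)

# col(m) holds the column number of every cell, so signs[col(m)]
# expands the sign vector to one multiplier per cell
via_rowsums <- rowSums(m * signs[col(m)])

# Direct subtraction for comparison
direct <- with(df1, Sales - COGS - Staff)

all.equal(unname(via_rowsums), direct)
```

Because rowSums() keeps na.rm = FALSE by default, any row containing an NA still comes out as NA, which is what the question asks for.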

Answer 2

It's as simple as it looks...

df$Sales-df$COGS-df$Staff
[1] 25 NA NA  0 NA

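If there is any NA in Sales, COGS or Staff, the result becomes NA. It is like calling sum, which has an na.rm argument: the simple arithmetic operators behave as if na.rm = FALSE.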
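The comparison with sum can be sketched on row E of the example (Sales 125, COGS 100, Staff NA). The vector name x is illustrative.

```r
# Values from row E of the example: the last one is missing
x <- c(125, 100, NA)

# sum() propagates NA by default (na.rm = FALSE)
total_default <- sum(x)

# With na.rm = TRUE the NA is dropped before summing
total_narm <- sum(x, na.rm = TRUE)

# Arithmetic operators have no na.rm switch: NA always propagates
diff_e <- 125 - 100 - NA

total_default  # NA
total_narm     # 225
diff_e         # NA
```

For this question the propagating behaviour is the desired one: dropping the NA would report a profit for a row whose staff cost is unknown.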

Answer 3

This seems like a job for within.

df <- within(df, Profit <- Sales - COGS - Staff)

df
#  Company Sales COGS Staff Profit
#1       A   100   50    25     25
#2       B   200   NA   100     NA
#3       C    NA   50    25     NA
#4       D    75   50    25      0
#5       E   125  100    NA     NA
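For comparison with the loop in the question, here is a hedged sketch of how a row-by-row version could be written. It assumes the intent was to test and fill one row at a time: the loop iterates over row indices instead of the values of Sales, and each test uses a single element, since if() expects a condition of length one (recent versions of R raise an error otherwise). The name vectorised is illustrative.

```r
# Rebuild the example data from the question
df <- read.table(text = "
Company  Sales  COGS  Staff
A        100      50     25
B        200      NA    100
C         NA      50     25
D         75      50     25
E        125     100     NA
", header = TRUE)

# Start with a numeric NA column so incomplete rows stay NA
df$Profit <- NA_real_

# Loop over row indices, testing and assigning one element at a time
for (i in seq_len(nrow(df))) {
  if (!is.na(df$Sales[i]) && !is.na(df$COGS[i]) && !is.na(df$Staff[i])) {
    df$Profit[i] <- df$Sales[i] - (df$COGS[i] + df$Staff[i])
  }
}

# The vectorised one-liner gives the same column without a loop
vectorised <- df$Sales - df$COGS - df$Staff
identical(df$Profit, vectorised)
```

The loop is shown only to illustrate the indexing; the vectorised forms in the answers are shorter and avoid the per-row overhead.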

Data.

df <- read.table(text = "
Company  Sales  COGS  Staff
A        100      50     25
B        200      NA    100
C         NA      50     25
D         75      50     25
E        125     100     NA
", header = TRUE)

