What's wrong with my function? Calculating a percentage with matching IDs

Question

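This is the function I am trying to build. It should take a column from one data frame and express it as a percentage of a column in another data frame, matching rows by the same ID.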

# With dummy data 

df1 = data.frame(State = c('Arizona AZ','Georgia GG', 'Newyork NY','Indiana IN','Florida FL'), Score=c(62,47,55,74,31), id=c(1,2,3,4,5))
df1

> df1
       State Score id
1 Arizona AZ    62  1
2 Georgia GG    47  2
3 Newyork NY    55  3
4 Indiana IN    74  4
5 Florida FL    31  5

df2 = data.frame(State = c('Arizona AZ','Georgia GG', 'Newyork NY','Indiana IN'), Score2=c(10,7,5,4), id=c(1,2,3,4))
df2

> df2
       State Score2 id
1 Arizona AZ     10  1
2 Georgia GG      7  2
3 Newyork NY      5  3
4 Indiana IN      4  4

CalcPerc <- function(x, ins) {
  
  # 1) Subset + cbind
  y  <- subset(ins, id %in% x$id)
  y  <- cbind(y, x$Score)
  
  # Percentage
  x1 <- 100*(y$Score2/y$Score)
  
  print(x1)
}

CalcPerc(x= df2, ins = df1)

[1] 4
numeric(0)

Why do I get numeric(0)?

How can I fix my function?

If I run the same steps outside the function, it works fine.

Thanks for your help!

Answer 1

Try adding a browser() statement just before print(x1) and then running CalcPerc(x= df2, ins = df1). You will see that y is

       State Score id x$Score
1 Arizona AZ    62  1      10
2 Georgia GG    47  2       7
3 Newyork NY    55  3       5
4 Indiana IN    74  4       4

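That is why referencing y$Score2 gives an empty vector: there is no such column in y. (x$Score only returned values because the $ operator partially matched it to Score2 in df2, which is also why the new column ended up named x$Score.) I suspect what you really want is to merge the two data frames. In base R: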

CalcPerc <- function(x, ins) {
    
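    # 1) Keep only the rows of ins whose id appears in x
    y  <- subset(ins, id %in% x$id)
    
    # 2) Join on the shared key columns so Score and Score2 line up row by row
    z <- merge(x, y, by = c('State', 'id'))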
    
    x1 <- 100*(z$Score2/z$Score)
    
    print(x1)
}
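
A small sketch of why the original function ends up with numeric(0), using the df2 dummy data from the question. The variable names partial, exact and res are only for illustration. It shows that $ falls back to partial matching on column names, that [[ does not by default, and that arithmetic on a missing (NULL) column silently produces a zero-length vector.

```r
# Dummy data from the question
df2 <- data.frame(State = c('Arizona AZ', 'Georgia GG', 'Newyork NY', 'Indiana IN'),
                  Score2 = c(10, 7, 5, 4),
                  id = c(1, 2, 3, 4))

# df2 has no column called Score, but `$` partially matches it to Score2
partial <- df2$Score
print(partial)

# `[[` matches names exactly by default, so the missing column shows up as NULL
exact <- df2[["Score"]]
print(is.null(exact))

# Arithmetic on NULL yields a zero-length numeric vector, as in the question
res <- 100 * (NULL / c(62, 47, 55, 74))
print(res)
```

If you want R to flag this kind of accidental match, setting options(warnPartialMatchDollar = TRUE) makes $ emit a warning whenever it resolves a name by partial matching.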

Answer 2

Try this:

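CalcPerc <- function(x, ins) {
  # 1) Keep only the rows of ins whose id appears in x
  y <- subset(ins, id %in% x$id)
  # 2) Copy Score2 over by position (assumes x and y list the ids in the same order)
  y$Score2 <- x$Score2
  x1 <- 100*(y$Score2/y$Score)
  print(x1)
}
> CalcPerc(x= df2, ins = df1)
[1] 16.129032 14.893617  9.090909  5.405405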

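The results come out in the correct order, as long as the rows of x and the matching rows of ins list the ids in the same order.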
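
If the two data frames might not be sorted the same way, one option is to look up each id explicitly with match() instead of relying on row position. This is only a sketch under that assumption; CalcPercById and shuffled are hypothetical names, not part of the question or the answers above.

```r
# Dummy data from the question
df1 <- data.frame(State = c('Arizona AZ', 'Georgia GG', 'Newyork NY', 'Indiana IN', 'Florida FL'),
                  Score = c(62, 47, 55, 74, 31),
                  id = c(1, 2, 3, 4, 5))
df2 <- data.frame(State = c('Arizona AZ', 'Georgia GG', 'Newyork NY', 'Indiana IN'),
                  Score2 = c(10, 7, 5, 4),
                  id = c(1, 2, 3, 4))

# Hypothetical helper: align the rows by id rather than by position
CalcPercById <- function(x, ins) {
  # Position of each x$id within ins$id (NA if an id is missing from ins)
  idx <- match(x$id, ins$id)
  100 * x$Score2 / ins$Score[idx]
}

# Shuffle df2 to show that the row order no longer matters
shuffled <- df2[c(3, 1, 4, 2), ]
perc <- CalcPercById(x = shuffled, ins = df1)
print(data.frame(id = shuffled$id, Perc = perc))
```

The function returns the vector instead of printing it, so the result can be stored or attached to a data frame alongside the ids it belongs to.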

Answer 3

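@robertdj and @Necklondon have already fixed your error. If you want a dplyr option, you can join your data by id and State and then mutate a column that computes the percentage, so you can immediately see which state each percentage belongs to: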

library(dplyr)
df1 %>%
  left_join(df2, by = c("id", "State")) %>%
  mutate(Perc = 100*(Score2/Score))

Output:

       State Score id Score2      Perc
1 Arizona AZ    62  1     10 16.129032
2 Georgia GG    47  2      7 14.893617
3 Newyork NY    55  3      5  9.090909
4 Indiana IN    74  4      4  5.405405
5 Florida FL    31  5     NA        NA
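
For comparison, roughly the same left join can be sketched in base R with merge() and all.x = TRUE, which keeps the rows of df1 that have no match in df2 and fills them with NA. The names merged and Perc are illustrative, and the column and row order of the result may differ from the dplyr output above.

```r
# Dummy data from the question
df1 <- data.frame(State = c('Arizona AZ', 'Georgia GG', 'Newyork NY', 'Indiana IN', 'Florida FL'),
                  Score = c(62, 47, 55, 74, 31),
                  id = c(1, 2, 3, 4, 5))
df2 <- data.frame(State = c('Arizona AZ', 'Georgia GG', 'Newyork NY', 'Indiana IN'),
                  Score2 = c(10, 7, 5, 4),
                  id = c(1, 2, 3, 4))

# all.x = TRUE keeps every row of df1, like a left join
merged <- merge(df1, df2, by = c("id", "State"), all.x = TRUE)

# Percentage of Score2 relative to Score; NA where df2 had no matching row
merged$Perc <- 100 * merged$Score2 / merged$Score
print(merged)
```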
