Getting a factor error when aggregating a table but no columns are factors

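My data is loaded from a CSV file. I tried reading it with stringsAsFactors = FALSE, but I still get the error. The first 13 columns are strings and the remaining columns (from column 14 onward) are all numeric. Here is the core code: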

library("readxl")

# Read the data without converting strings to factors
data <- read.csv("PFR csvData.csv", stringsAsFactors = FALSE)

# Convert all numeric columns (14 onward) to numeric
data[, 14:length(colnames(data))] <- as.numeric(as.character(unlist(data[, 14:length(colnames(data))])))

# Convert all string columns (1 to 13) to character
data[, 1:13] <- as.character(unlist(data[, 1:13]))

When I check the class of each column with sapply(data, class), I get:

           Rk           Player              Pos              Age             Date               Lg               Tm 
     "character"      "character"      "character"      "character"      "character"      "character"      "character" 
             H.A              Opp           Result               G.             Week              Day    Receiving_Tgt 
     "character"      "character"      "character"      "character"      "character"      "character"        "numeric" 
   Receiving_Rec    Receiving_Yds    Receiving_Y.R     Receiving_TD  Receiving_Ctch.  Receiving_Y.Tgt    Receiving_PPR 
       "numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric" 
     Passing_Cmp      Passing_Att     Passing_Cmp.      Passing_Yds       Passing_TD      Passing_Int     Passing_Rate 
       "numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric" 
      Passing_Sk   Passing_Sk_Yds      Passing_Y.A     Passing_AY.A      Passing_PPR      Rushing_Att      Rushing_Yds 
       "numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric"        "numeric" 
     Rushing_Y.A       Rushing_TD Rushing_Half_PPR   Total_Half_PPR 
       "numeric"        "numeric"        "numeric"        "numeric" 

I also checked for NAs with apply(data, 2, function(x) any(is.na(x))) and got:

              Rk           Player              Pos              Age             Date               Lg               Tm 
           FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE 
             H.A              Opp           Result               G.             Week              Day    Receiving_Tgt 
           FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE 
   Receiving_Rec    Receiving_Yds    Receiving_Y.R     Receiving_TD  Receiving_Ctch.  Receiving_Y.Tgt    Receiving_PPR 
           FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE 
     Passing_Cmp      Passing_Att     Passing_Cmp.      Passing_Yds       Passing_TD      Passing_Int     Passing_Rate 
           FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE 
      Passing_Sk   Passing_Sk_Yds      Passing_Y.A     Passing_AY.A      Passing_PPR      Rushing_Att      Rushing_Yds 
           FALSE            FALSE            FALSE            FALSE            FALSE            FALSE            FALSE 
     Rushing_Y.A       Rushing_TD Rushing_Half_PPR   Total_Half_PPR 
           FALSE            FALSE            FALSE            FALSE 

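So at this point, I believe I loaded the data without factors, made sure no column is a factor by coercing the types, and double-checked by inspecting the class of each column. I also confirmed there are no NAs.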

However, when I call aggregate, I get a factor-related error:

aggregate(data$Player, by = list(data$Total_Half_PPR), FUN = sum)
Error in Summary.factor(291L, na.rm = FALSE) : 
  ‘sum’ not meaningful for factors

I don't know what else to try. Thanks for your help!

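'Player' is being treated as a factor, and sum is not defined for factors. Even though sapply(data, class) reports it as character, aggregate passes its first argument through as.data.frame, which in R versions before 4.0.0 converts character vectors to factors by default. If 'Player' actually holds numbers stored as text, convert it to numeric first: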

data$Player <- as.numeric(as.character(data$Player))
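
As a rough illustration of when that conversion is appropriate, here is a minimal sketch on a made-up data frame. The `toy` object and its values are hypothetical and only stand in for the asker's CSV. Player names cannot be parsed as numbers, so they come back as NA, whereas text that really holds numbers converts cleanly:

```r
# Hypothetical toy data standing in for the real CSV (values are invented)
toy <- data.frame(
  Player = c("A. Smith", "B. Jones", "A. Smith"),
  Total_Half_PPR = c(10.5, 7.0, 4.5),
  stringsAsFactors = FALSE
)

# Names are not numeric text, so the conversion yields NA (normally with a warning)
player_num <- suppressWarnings(as.numeric(as.character(toy$Player)))
print(player_num)

# Text that does hold numbers converts as expected
rk_num <- as.numeric(as.character(c("1", "2", "3")))
print(rk_num)
```

If the column holds names, as it appears to in the question, summing it is not meaningful regardless of its class.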

If the goal is instead the sum of 'Total_Half_PPR' for each player, swap the two arguments:

aggregate(data$Total_Half_PPR, by = list(data$Player), FUN = sum)

Or use the formula method:

aggregate(Total_Half_PPR ~ Player, data, FUN = sum)
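
To check that the two calls give the same totals, here is a small self-contained sketch. The `toy` data frame and its values are invented for illustration; only the column names come from the question.

```r
# Hypothetical toy data (values are invented; column names follow the question)
toy <- data.frame(
  Player = c("A. Smith", "B. Jones", "A. Smith"),
  Total_Half_PPR = c(10.5, 7.0, 4.5),
  stringsAsFactors = FALSE
)

# by-list interface: the result columns are named Group.1 and x
res_by <- aggregate(toy$Total_Half_PPR, by = list(toy$Player), FUN = sum)

# Formula interface: the result columns keep the original names
res_formula <- aggregate(Total_Half_PPR ~ Player, data = toy, FUN = sum)

print(res_by)
print(res_formula)
```

The formula interface is often easier to read because the output keeps the names Player and Total_Half_PPR instead of Group.1 and x.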