Getting a factor error when aggregating a table, but no columns are factors
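My data is loaded from a CSV file. I tried reading it with stringsAsFactors = FALSE,
but I still get an error. The first 13 columns are strings and the remaining columns (from column 14 onward) are all numeric. Here is the core code: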
library("readxl")
# Read the data with stringsAsFactors = FALSE
data <- read.csv("PFR csvData.csv", stringsAsFactors = FALSE)
# Convert the numeric columns (14 onward) to numeric
data[, 14:length(colnames(data))] <- as.numeric(as.character(unlist(data[, 14:length(colnames(data))])))
# Convert the string columns (1 to 13) to character
data[, 1:13] <- as.character(unlist(data[, 1:13]))
When I check the class of each column with sapply(data, class),
I get:
Rk Player Pos Age Date Lg Tm
"character" "character" "character" "character" "character" "character" "character"
H.A Opp Result G. Week Day Receiving_Tgt
"character" "character" "character" "character" "character" "character" "numeric"
Receiving_Rec Receiving_Yds Receiving_Y.R Receiving_TD Receiving_Ctch. Receiving_Y.Tgt Receiving_PPR
"numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
Passing_Cmp Passing_Att Passing_Cmp. Passing_Yds Passing_TD Passing_Int Passing_Rate
"numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
Passing_Sk Passing_Sk_Yds Passing_Y.A Passing_AY.A Passing_PPR Rushing_Att Rushing_Yds
"numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
Rushing_Y.A Rushing_TD Rushing_Half_PPR Total_Half_PPR
"numeric" "numeric" "numeric" "numeric"
I also checked for NAs with apply(data, 2, function(x) any(is.na(x)))
and got:
Rk Player Pos Age Date Lg Tm
FALSE FALSE FALSE FALSE FALSE FALSE FALSE
H.A Opp Result G. Week Day Receiving_Tgt
FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Receiving_Rec Receiving_Yds Receiving_Y.R Receiving_TD Receiving_Ctch. Receiving_Y.Tgt Receiving_PPR
FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Passing_Cmp Passing_Att Passing_Cmp. Passing_Yds Passing_TD Passing_Int Passing_Rate
FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Passing_Sk Passing_Sk_Yds Passing_Y.A Passing_AY.A Passing_PPR Rushing_Att Rushing_Yds
FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Rushing_Y.A Rushing_TD Rushing_Half_PPR Total_Half_PPR
FALSE FALSE FALSE FALSE
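So at this point I believe I loaded the data without factors, made sure no column is a factor by coercing the types, and double-checked by looking at the class of each column. I also made sure there are no NAs.
However, when I call aggregate, I get a factor-related error: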
aggregate(data$Player, by = list(data$Total_Half_PPR), FUN = sum)
Error in Summary.factor(291L, na.rm = FALSE) :
‘sum’ not meaningful for factors
I don't know what else to do. Thanks for your help!
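The error says 'Player' is a factor at the point where aggregate is called, and sum is not defined for factors.
To sum that column, it first has to be converted to numeric: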
data$Player <- as.numeric(as.character(data$Player))
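As an aside on why the conversion goes through as.character() first: calling as.numeric() directly on a factor returns the internal level codes, not the labels. A minimal sketch with made-up values (the vector f is hypothetical and not taken from the question's data):

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("10", "20", "10"))

# Direct conversion returns the internal level codes
codes <- as.numeric(f)

# Going through character recovers the labelled values
values <- as.numeric(as.character(f))

print(codes)   # 1 2 1
print(values)  # 10 20 10
```

If the labels are not numbers at all (player names, for example), the conversion yields NA with a coercion warning, so summing that column would not be meaningful either.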
If what we need is the sum of 'Total_Half_PPR' instead,
do it the other way round:
aggregate(data$Total_Half_PPR, by = list(data$Player), FUN = sum)
or use the formula method:
aggregate(Total_Half_PPR ~ Player, data, FUN = sum)
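To make the difference concrete, here is a small self-contained sketch. The data frame toy and its values are invented for illustration; only the column names Player and Total_Half_PPR come from the question. It assumes Player is a factor, as the error message suggests, reproduces the failure, and then aggregates in the intended direction.

```r
# Toy data: invented values, column names borrowed from the question
toy <- data.frame(
  Player = factor(c("A", "B", "A", "B")),
  Total_Half_PPR = c(10.5, 3.0, 4.5, 7.0)
)

# Summing the factor column fails, as in the question;
# tryCatch captures the error message instead of stopping
err <- tryCatch(
  aggregate(toy$Player, by = list(toy$Total_Half_PPR), FUN = sum),
  error = function(e) conditionMessage(e)
)
print(err)

# Summing the numeric column per player works
totals <- aggregate(Total_Half_PPR ~ Player, data = toy, FUN = sum)
print(totals)  # A: 15, B: 10
```

Running class(data$Player) immediately before the aggregate call is a quick way to confirm what type the column has at that moment, in case the object was changed after the sapply check.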