How to extract specific residual data from a linear model in R
How can I extract the residuals for a specific baseball team from the linear model below? For example, how would I extract the residuals for "CLE"?
library(Lahman)
library(dplyr)
library(broom)
# create baseball team data
data(Teams)
teams <- Teams
teams <- teams %>% mutate(win_percentage = (W / (W + L)) * 100)
# summarize baseball team salary by year
salaries <- Salaries
salaries <- salaries %>%
  group_by(teamID, yearID, lgID) %>%
  summarise(payroll_M = sum(as.numeric(salary)) / 10^6) %>%
  ungroup()
# add winning percentage to the salary table
salaries <- teams %>%
  select(yearID, teamID, win_percentage) %>%
  right_join(salaries, by = c("yearID", "teamID"))
# compute linear model of winning vs team salary
model <- salaries %>%
  group_by(yearID) %>%
  do(fit = augment(lm(win_percentage ~ payroll_M, data = .)))
# extract residuals for Cleveland ??????
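You are close, but you need to make two changes to the augment line.
First, you are saving the resulting (augmented) data frame into a column named fit. Instead, pass it directly to do (remove fit =).
Second, augment needs to keep the teamID column as part of the resulting data, even though it is not in the model. augment takes a second argument, data, for exactly this purpose (see help(augment.lm) for more information).
So the new line looks like:
do(augment(lm(win_percentage ~ payroll_M, data = .), data = .))
The resulting data frame has one row per original observation and includes teamID along with the residuals and fitted values, which lets you filter for CLE.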
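To make that last filtering step concrete, here is a minimal sketch. It assumes the salaries data frame from the question (columns yearID, teamID, payroll_M, win_percentage) and relies on broom naming the residual and fitted-value columns .resid and .fitted. The helper name team_residuals is made up for illustration and is not part of the original answer. Dropping incomplete rows before fitting is an extra precaution, so that lm() and augment() see the same rows.

```r
library(dplyr)
library(broom)

# Hypothetical helper (not from the original answer): fit one model per year,
# then keep only the augmented rows for a single team.
team_residuals <- function(salaries, team_id) {
  salaries %>%
    # Precaution: remove rows with missing model variables before fitting
    filter(!is.na(win_percentage), !is.na(payroll_M)) %>%
    group_by(yearID) %>%
    # Passing data = . keeps teamID alongside the fitted values and residuals
    do(augment(lm(win_percentage ~ payroll_M, data = .), data = .)) %>%
    ungroup() %>%
    filter(teamID == team_id) %>%
    select(yearID, teamID, payroll_M, win_percentage, .fitted, .resid)
}

# Usage, with the salaries data frame built in the question:
# cle_residuals <- team_residuals(salaries, "CLE")
```

The result has one row per season for the chosen team, with that season's residual in the .resid column.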