How to extract specific residual data from a linear model in R

Question

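How can I extract the residuals for a specific baseball team from the linear model below? For example, how would I extract the residuals for "CLE"?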

library(Lahman)
library(dplyr)
library(broom)

# create baseball team data
data(Teams)
teams <- Teams
teams <- teams %>% mutate(win_percentage = (W / (W + L)) * 100)

# summarize baseball team salary by year
salaries <- Salaries
salaries <- salaries %>% 
  group_by(teamID, yearID, lgID) %>%
  summarise(payroll_M = sum(as.numeric(salary)) / 10^6) %>% 
  ungroup()

# add winning percentage to the salary table
salaries <- teams %>% 
  select(yearID, teamID, win_percentage) %>% 
  right_join(salaries, by = c("yearID", "teamID"))

# compute linear model of winning vs team salary
model <- salaries %>% 
  group_by(yearID) %>%
  do(fit = augment(lm(win_percentage ~ payroll_M, data = .)))

# extract residuals for Cleveland ??????

Answer 1

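You are close, but the augment line needs two changes.

You are saving the resulting (augmented) data frame into a column named fit, so do stores each year's result as an element of a list-column. Instead, pass the augment call directly to do (remove the fit =).
The augment function needs to keep the teamID column as part of the resulting data, even though it is not in the model. Note that augment takes a second argument, data, for exactly this purpose (see help(augment.lm) for more information).

So the new line looks like: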

do(augment(lm(win_percentage ~ payroll_M, data = .), data = .))

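The resulting data frame has one row per original observation and includes teamID alongside the residuals (.resid) and fitted values (.fitted), which lets you filter for CLE.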
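
As a minimal sketch of that last filtering step: the example below uses a small made-up data frame (toy, with hypothetical payroll and win-percentage values, not taken from Lahman) in place of the salaries table so that it runs on its own. The names toy, fits and cle_resid are illustrative only; the same do(augment(...)) call followed by filter(teamID == "CLE") should apply to the real salaries table from the question.

```r
library(dplyr)
library(broom)

# Hypothetical toy data standing in for the `salaries` table in the question.
# The numbers are invented for illustration only.
toy <- data.frame(
  yearID         = rep(c(2000, 2001), each = 4),
  teamID         = rep(c("BOS", "CLE", "NYA", "SEA"), times = 2),
  payroll_M      = c(80, 75, 95, 60, 110, 90, 112, 75),
  win_percentage = c(52, 56, 54, 56, 51, 56, 59, 72)
)

# Fit one model per year; `data = .` keeps teamID in the augmented output.
fits <- toy %>%
  group_by(yearID) %>%
  do(augment(lm(win_percentage ~ payroll_M, data = .), data = .)) %>%
  ungroup()

# Keep only Cleveland's rows, with the fitted value and residual for each year.
cle_resid <- fits %>%
  filter(teamID == "CLE") %>%
  select(yearID, teamID, .fitted, .resid)

print(cle_resid)
```

Each row of cle_resid is one season, so cle_resid$.resid gives Cleveland's residuals as a plain numeric vector if that is more convenient.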

r

dplyr

broom