Regression in R on panel data

I want to run linear regressions on panel data. My code so far is below, but I don't understand why `fit` and `rsq` are not returned. Any suggestions?

Sample code:

for(i in names(df))
{ 
  if(is.numeric(df[3,i]))  ##if row 3 is numeric, the entire column is 
  {       
    fit <- lm(df[3,i] ~ Gender, data=df) #does a regression for each column in my csv file against my independent variable 'etch'
    rsq <- summary(fit)$r.squared
  }
}

Data structure

Sample data:

    df<-structure(list(id = c(1, 1, 2, 2, 2), id1 = c(1, 2, 1, 2, 3), 
    a1 = c(5, 8, 7, 6, 3), a2 = c(1, 4, 3, 10, 5), a3 = c(2, 
    34, 3, 12, 6), a4 = c(9, 2, 3, 12, 7), a5 = c(0, 0, 0, 7, 
    8), a6 = c(7, 7, 0, 0, 9), a7 = c(5, 8, 7, 6, 0), a8 = c(1, 
    4, 3, 10, 3), a9 = c(2, 34, 3, 12, 3), a10 = c(9, 2, 3, 12, 
    3), Gender = c(1, 2, 1, 1, 2)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), spec = structure(list(
    cols = list(id = structure(list(), class = c("collector_double", 
    "collector")), id1 = structure(list(), class = c("collector_double", 
    "collector")), a1 = structure(list(), class = c("collector_double", 
    "collector")), a2 = structure(list(), class = c("collector_double", 
    "collector")), a3 = structure(list(), class = c("collector_double", 
    "collector")), a4 = structure(list(), class = c("collector_double", 
    "collector")), a5 = structure(list(), class = c("collector_double", 
    "collector")), a6 = structure(list(), class = c("collector_double", 
    "collector")), a7 = structure(list(), class = c("collector_double", 
    "collector")), a8 = structure(list(), class = c("collector_double", 
    "collector")), a9 = structure(list(), class = c("collector_double", 
    "collector")), a10 = structure(list(), class = c("collector_double", 
    "collector")), Gender = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1L), class = "col_spec"))

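To fit a linear regression on each column, you can use `lapply`. We use `reformulate` to build a formula object from each column name and use it inside `lapply`. The R-squared value can then be extracted from the `summary` of each model.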

cols <- grep('a\\d+', names(df), value = TRUE)
cols
#[1] "a1"  "a2"  "a3"  "a4"  "a5"  "a6"  "a7"  "a8"  "a9"  "a10"

lapply(cols, function(x) {
  lm(reformulate('Gender', x), df)
}) -> fit

r.squared <- sapply(fit, function(x) summary(x)$r.squared)
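As for why the loop in the question produces nothing useful: `df[3, i]` selects a single cell (row 3 of column `i`) rather than the whole column, and `fit` and `rsq` are overwritten on every pass, so at best only the last model would survive. If the tibble package is loaded, `df[3, i]` also stays a one-cell tibble, for which `is.numeric()` returns `FALSE`, so the body may never run at all. If you prefer to keep a loop, a minimal sketch could look like the following. The small `df` here is a stand-in built from a few of the question's columns, and excluding `id` from the outcomes is an assumption about which columns you want to model.

```r
# Stand-in for the question's data (a subset of its columns)
df <- data.frame(
  id     = c(1, 1, 2, 2, 2),
  a1     = c(5, 8, 7, 6, 3),
  a2     = c(1, 4, 3, 10, 5),
  a3     = c(2, 34, 3, 12, 6),
  Gender = c(1, 2, 1, 1, 2)
)

# Outcome columns: numeric columns other than the predictor and the id
# (dropping "id" is an assumption, adjust to your data)
outcomes <- setdiff(names(df)[sapply(df, is.numeric)], c("id", "Gender"))

# Pre-allocate named containers so each iteration's result is kept
fits <- setNames(vector("list", length(outcomes)), outcomes)
rsq  <- setNames(numeric(length(outcomes)), outcomes)

for (i in outcomes) {
  # Build the formula from the column name, e.g. a1 ~ Gender
  fits[[i]] <- lm(reformulate("Gender", response = i), data = df)
  rsq[i]    <- summary(fits[[i]])$r.squared
}

rsq
```

Testing the whole column with `is.numeric(df[[i]])` (done here via `sapply`) avoids the single-cell problem, and storing results by name makes it easy to look up a model later, for example `fits[["a2"]]`.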