Regression in R on panel data
I want to run a linear regression on panel data. My code so far is below, but I don't understand why fit and rsq are not returned. Any suggestions?
Sample code:
for(i in names(df))
{
if(is.numeric(df[3,i])) ##if row 3 is numeric, the entire column is
{
fit <- lm(df[3,i] ~ Gender, data=df) #does a regression for each column in my csv file against my independent variable 'etch'
rsq <- summary(fit)$r.squared
}
}
Data structure
Sample data:
df<-structure(list(id = c(1, 1, 2, 2, 2), id1 = c(1, 2, 1, 2, 3),
a1 = c(5, 8, 7, 6, 3), a2 = c(1, 4, 3, 10, 5), a3 = c(2,
34, 3, 12, 6), a4 = c(9, 2, 3, 12, 7), a5 = c(0, 0, 0, 7,
8), a6 = c(7, 7, 0, 0, 9), a7 = c(5, 8, 7, 6, 0), a8 = c(1,
4, 3, 10, 3), a9 = c(2, 34, 3, 12, 3), a10 = c(9, 2, 3, 12,
3), Gender = c(1, 2, 1, 1, 2)), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), spec = structure(list(
cols = list(id = structure(list(), class = c("collector_double",
"collector")), id1 = structure(list(), class = c("collector_double",
"collector")), a1 = structure(list(), class = c("collector_double",
"collector")), a2 = structure(list(), class = c("collector_double",
"collector")), a3 = structure(list(), class = c("collector_double",
"collector")), a4 = structure(list(), class = c("collector_double",
"collector")), a5 = structure(list(), class = c("collector_double",
"collector")), a6 = structure(list(), class = c("collector_double",
"collector")), a7 = structure(list(), class = c("collector_double",
"collector")), a8 = structure(list(), class = c("collector_double",
"collector")), a9 = structure(list(), class = c("collector_double",
"collector")), a10 = structure(list(), class = c("collector_double",
"collector")), Gender = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))
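To fit a linear regression on each column, you can use lapply. We use reformulate to create a formula object from each column name and use it inside lapply. The R-squared value can then be extracted from the summary of each model.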
cols <- grep('a\\d+', names(df), value = TRUE) # the backslash must be doubled inside an R string
cols
#[1] "a1" "a2" "a3" "a4" "a5" "a6" "a7" "a8" "a9" "a10"
lapply(cols, function(x) {
lm(reformulate('Gender', x), df)
}) -> fit
r.squared <- sapply(fit, function(x) summary(x)$r.squared)
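As for why the original loop returns nothing: df[3, i] selects only row 3 of column i rather than the whole column, and because df is a tibble that subset is most likely still a 1x1 tibble, so is.numeric() is FALSE and the body never runs. Even if it did run, fit and rsq would be overwritten on every pass. Below is a minimal sketch of a loop version that keeps one result per column. The data frame is a reduced stand-in for the question's data, and the names fits and rsq are only illustrative.

```r
# Reduced stand-in for the question's data (same values, fewer columns)
df <- data.frame(
  id     = c(1, 1, 2, 2, 2),
  a1     = c(5, 8, 7, 6, 3),
  a2     = c(1, 4, 3, 10, 5),
  Gender = c(1, 2, 1, 1, 2)
)

# Response columns: names made of "a" followed by digits
cols <- grep("^a\\d+$", names(df), value = TRUE)

# Pre-allocate named containers so each iteration keeps its own result
fits <- vector("list", length(cols))
names(fits) <- cols
rsq <- setNames(numeric(length(cols)), cols)

for (i in cols) {
  # Build the formula from the column name, e.g. a1 ~ Gender
  fits[[i]] <- lm(reformulate("Gender", response = i), data = df)
  rsq[i] <- summary(fits[[i]])$r.squared
}

rsq
```

The result is the same as with the lapply/sapply approach; the loop just makes the storage explicit. Note that this fits a separate ordinary lm per column and does not model the panel structure (id, id1) in any way.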