Creating lagged (t-1) independent variables in panel data
Suppose I want to estimate the predictive regression Return_t = x + Volume_t-1 + Volatility_t-1 + e. I have five years of weekly panel data for 28 firms, already prepared in Excel, which looks like this:
ID Date Return Volume Volatility
1 2012-01-10 0.039441572 0.6979594 0.2606079
1 2012-01-17 -0.021107681 0.6447289 0.3741519
1 2012-01-24 0.004798082 1.0072677 0.3097104
1 2012-01-31 0.001559987 1.0066153 0.2761096
1 2012-02-07 -0.009058289 0.7218983 0.2592109
1 2012-02-14 0.046404936 1.2879986 0.4304542
2 2012-01-10 0.02073912 -0.141970906 0.2573633
2 2012-01-17 -0.00369127 0.007792180 0.3360240
2 2012-01-24 -0.05881038 0.001347634 0.2163933
2 2012-01-31 -0.05664598 0.640085029 0.3545598
2 2012-02-07 0.03654193 0.360513703 0.3594383
2 2012-02-14 0.03092432 0.105669775 0.3043643
I want to lag the independent variables to t-1. Which package lets me do this in R? I am going to run a fixed-effects panel data regression.
After grouping by 'ID', we can use lag from dplyr (this assumes the rows are already sorted by Date within each ID):
library(dplyr)
df1 %>%
group_by(ID) %>%
mutate(Volume_1 = lag(Volume), Volatility_1 = lag(Volatility))
Another option is shift from data.table:
library(data.table)
nm1 <- c("Volume", "Volatility")
setDT(df1)[, paste0(nm1, "_1") := lapply(.SD, shift), by = ID, .SDcols = nm1]
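If neither package is at hand, the same per-ID lag can be sketched in base R. This is only an illustration on a toy copy of the first three weeks of the question's data; lag_one is a hypothetical helper name, not part of any package.

```r
# Toy copy of the first rows of the question's data (two firms, three weeks each).
df1 <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2),
  Date = as.Date(c("2012-01-10", "2012-01-17", "2012-01-24",
                   "2012-01-10", "2012-01-17", "2012-01-24")),
  Volume = c(0.6979594, 0.6447289, 1.0072677,
             -0.141970906, 0.007792180, 0.001347634),
  Volatility = c(0.2606079, 0.3741519, 0.3097104,
                 0.2573633, 0.3360240, 0.2163933)
)

# Sort by firm and date so "previous row" means "previous week".
df1 <- df1[order(df1$ID, df1$Date), ]

# lag_one is a hypothetical helper: shift a vector down by one, padding with NA.
lag_one <- function(x) c(NA, head(x, -1))

# ave() applies lag_one separately within each ID, so values never cross firms.
df1$Volume_1 <- ave(df1$Volume, df1$ID, FUN = lag_one)
df1$Volatility_1 <- ave(df1$Volatility, df1$ID, FUN = lag_one)

print(df1)
```

The first row of each firm gets NA in both lagged columns, which is the behaviour you want before running a panel regression.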
You can also use mutate_at and then join. Group by 'ID' first; otherwise lag carries the last value of one firm into the first row of the next:
df %>%
group_by(ID) %>%
mutate_at(vars(Volume, Volatility), lag) %>%
left_join(df, ., by = c('ID','Date','Return'))
Output:
ID Date Return Volume.x Volatility.x Volume.y Volatility.y
1 1 2012-01-10 0.039441572 0.697959400 0.2606079 NA NA
2 1 2012-01-17 -0.021107681 0.644728900 0.3741519 0.697959400 0.2606079
3 1 2012-01-24 0.004798082 1.007267700 0.3097104 0.644728900 0.3741519
4 1 2012-01-31 0.001559987 1.006615300 0.2761096 1.007267700 0.3097104
5 1 2012-02-07 -0.009058289 0.721898300 0.2592109 1.006615300 0.2761096
6 1 2012-02-14 0.046404936 1.287998600 0.4304542 0.721898300 0.2592109
7 2 2012-01-10 0.020739120 -0.141970906 0.2573633 NA NA
8 2 2012-01-17 -0.003691270 0.007792180 0.3360240 -0.141970906 0.2573633
9 2 2012-01-24 -0.058810380 0.001347634 0.2163933 0.007792180 0.3360240
10 2 2012-01-31 -0.056645980 0.640085029 0.3545598 0.001347634 0.2163933
11 2 2012-02-07 0.036541930 0.360513703 0.3594383 0.640085029 0.3545598
12 2 2012-02-14 0.030924320 0.105669775 0.3043643 0.360513703 0.3594383
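Once the lagged columns exist, the fixed-effects regression from the question can be sketched as follows. This is a minimal illustration, not a definitive specification: it uses base R's lm with one dummy per ID (the least-squares dummy-variable form of a fixed-effects model) and builds the lags with ave() so that it runs on its own. The helper name lag_one is hypothetical, and only the twelve sample rows from the question are used. The plm package's within model is a common alternative; it is not used here.

```r
# Toy copy of the question's twelve sample rows (two firms, six weeks each).
dates <- as.Date(c("2012-01-10", "2012-01-17", "2012-01-24",
                   "2012-01-31", "2012-02-07", "2012-02-14"))
df <- data.frame(
  ID = rep(c(1, 2), each = 6),
  Date = rep(dates, times = 2),
  Return = c(0.039441572, -0.021107681, 0.004798082,
             0.001559987, -0.009058289, 0.046404936,
             0.02073912, -0.00369127, -0.05881038,
             -0.05664598, 0.03654193, 0.03092432),
  Volume = c(0.6979594, 0.6447289, 1.0072677,
             1.0066153, 0.7218983, 1.2879986,
             -0.141970906, 0.007792180, 0.001347634,
             0.640085029, 0.360513703, 0.105669775),
  Volatility = c(0.2606079, 0.3741519, 0.3097104,
                 0.2761096, 0.2592109, 0.4304542,
                 0.2573633, 0.3360240, 0.2163933,
                 0.3545598, 0.3594383, 0.3043643)
)

# Sort, then lag within each firm (lag_one is a hypothetical helper).
df <- df[order(df$ID, df$Date), ]
lag_one <- function(x) c(NA, head(x, -1))
df$Volume_1 <- ave(df$Volume, df$ID, FUN = lag_one)
df$Volatility_1 <- ave(df$Volatility, df$ID, FUN = lag_one)

# Firm fixed effects via one dummy per ID; lm drops the NA rows by default.
fit <- lm(Return ~ Volume_1 + Volatility_1 + factor(ID), data = df)
print(coef(fit))
```

The first week of each firm is dropped from the fit because its lagged regressors are NA, so the toy model is estimated on ten of the twelve rows.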