在 R 中的简单线性回归中,如何重新调整年龄以估计每 year/5 years/10 年的 beta-coefficient?
Within a simple linear regression in R, how do I rescale age to estimate it's beta-coefficient per per year/5 years/10 years?
这可能是一个有点愚蠢的问题,但是在 SO 和其他网站上漫游我找不到一个简单的答案:我有关于年龄和连续结果之间关系的数据:
library(dplyr)
library(tidyverse)
library(magrittr)
mydata <-
structure(list(ID = c(104, 157, 52, 152, 114, 221, 320, 125,
75, 171, 80, 76, 258, 82, 142, 203, 37, 92, 202, 58, 194, 38,
4, 137, 25, 87, 40, 117, 21, 255, 277, 315, 96, 134, 185, 94,
3, 153, 172, 65, 279, 209, 60, 13, 154, 160, 24, 29, 159, 213,
127, 74, 48, 126, 184, 132, 61, 141, 27, 49, 8, 39, 164, 162,
34, 205, 179, 119, 77, 135, 138, 165, 103, 253, 14, 20, 310,
84, 30, 273, 22, 105, 262, 116, 86, 83, 145, 31, 95, 51, 81,
271, 36, 50, 189, 2, 115, 7, 197, 54), age = c(67.1, 70.7, 53,
61.7, 66.1, 57.7, 54.1, 67.2, 60.9, 55.8, 40.7, 57.6, 64.1, 70.7,
47.5, 46.3, 66.7, 55, 63.3, 68.2, 61.2, 60.5, 52, 65.3, 48.9,
56.9, 62.7, 75.2, 61.4, 57.9, 53.6, 58.1, 51, 67.3, 63.9, 57,
43.2, 64.7, 62.8, 56.3, 51.7, 39.4, 45.2, 57.8, 55.7, 69.6, 61.5,
50.1, 73.7, 55.5, 65.2, 54.6, 49, 35.2, 52.9, 46.3, 55, 52.5,
54.2, 61, 57.4, 56.5, 53.6, 47.7, 64.2, 53.4, 60.9, 58.2, 60.7,
50.3, 48.3, 74.7, 52.1, 59.9, 52.4, 70.8, 61.2, 66.5, 55.4, 57.5,
59.2, 60.1, 52.3, 60.2, 54.8, 36.3, 61.5, 48.6, 56, 62, 64.8,
40.4, 68.3, 60, 69.1, 56.6, 45.3, 58.5, 52.3, 52), continuous_outcome = c(3636.6,
1128.2, 2007.5, 802.9, 332.3, 2636.1, 169.5, 67.9, 3261.8, 1920.3,
155.2, 1677.2, 198.2, 11189.7, 560.9, 633.1, 196.1, 13.9, 100.7,
7594.5, 1039.8, 83.9, 2646.8, 284.6, 306, 1135.6, 1883.1, 5681.4,
1706.2, 2241.1, 97.7, 1106.8, 1107.1, 290.8, 2123.4, 267, 115.3,
138.5, 152.7, 1338.9, 6709.8, 561.7, 1931.7, 3112.4, 1876.3,
3795.9, 5706.7, 7.4, 1324.9, 4095.4, 205.4, 1886, 177.3, 304.4,
1319.1, 415.9, 537.2, 3141.1, 740, 1976.7, 624.8, 983.1, 1163.5,
1432.6, 3730.4, 2023.4, 498.2, 652.5, 982.7, 1345.3, 138.4, 1505.1,
3528.1, 11.9, 884.5, 10661.6, 1911.4, 2800.8, 81.5, 396.4, 409.1,
417.3, 186, 1892.4, 1689.7, 0, 210.1, 210.5, 3484.5, 3196.8,
57.2, 20.2, 947, 540, 1603.1, 1571.8, 9.1, 149.2, 122, 63.2)), row.names = c(NA,
-100L), class = c("tbl_df", "tbl", "data.frame"))
正如您在小标题中所见,年龄是一个连续变量,测量精度为小数点后一位:
head(mydata)
# A tibble: 6 x 3
ID age continuous_outcome
<dbl> <dbl> <dbl>
1 104 67.1 3637.
2 157 70.7 1128.
3 52 53 2008.
4 152 61.7 803.
5 114 66.1 332.
6 221 57.7 2636.
当我拟合一个简单的线性回归时(现在假设所有假设都是not-violated),我得到以下结果beta-coefficient:
fit <-
lm(formula=continuous_outcome ~ age,
data=mydata)
fit
Call:
lm(formula = continuous_outcome ~ age, data = mydata)
Coefficients:
(Intercept) age
-3400.12 86.06
beta-coefficient 的年龄是 86.06
。这是否意味着,随着年龄的测量精确到小数点后一位,每增加 0.1 岁,我的结果就会增加 86.06?如果是这样,我如何重新调整 age
以便我可以衡量每 5 年或 10 年的年龄影响?
提前致谢!
beta 系数显示因变量(DV,在本例中 continuous_outcome
)随着自变量(IV,在本例中 age
每增加一个单位而增加的量=23=]年).
如果您想显示每 1/10 年的关系,请在拟合模型之前乘以 age
列,或者将 beta 系数除以 10。
根据您的具体要求,由于 beta 系数为 86.06,您可以将其乘以年数以获得连续变量的增加。所以:
- 1 年增长 = +86.06
- 5 年增长 = +430.3 (86.06 * 5)
- 10 年增长 = +860.6 (86.06 * 10)
要回答最后一个问题(每 5 年年龄影响的估计值),则为 430.3,即 86.06 * 5。因此,age
人每增加 5 年, continuous_outcome
平均增加430.3。
这可能是一个有点愚蠢的问题,但是在 SO 和其他网站上漫游我找不到一个简单的答案:我有关于年龄和连续结果之间关系的数据:
library(dplyr)
library(tidyverse)
library(magrittr)
mydata <-
structure(list(ID = c(104, 157, 52, 152, 114, 221, 320, 125,
75, 171, 80, 76, 258, 82, 142, 203, 37, 92, 202, 58, 194, 38,
4, 137, 25, 87, 40, 117, 21, 255, 277, 315, 96, 134, 185, 94,
3, 153, 172, 65, 279, 209, 60, 13, 154, 160, 24, 29, 159, 213,
127, 74, 48, 126, 184, 132, 61, 141, 27, 49, 8, 39, 164, 162,
34, 205, 179, 119, 77, 135, 138, 165, 103, 253, 14, 20, 310,
84, 30, 273, 22, 105, 262, 116, 86, 83, 145, 31, 95, 51, 81,
271, 36, 50, 189, 2, 115, 7, 197, 54), age = c(67.1, 70.7, 53,
61.7, 66.1, 57.7, 54.1, 67.2, 60.9, 55.8, 40.7, 57.6, 64.1, 70.7,
47.5, 46.3, 66.7, 55, 63.3, 68.2, 61.2, 60.5, 52, 65.3, 48.9,
56.9, 62.7, 75.2, 61.4, 57.9, 53.6, 58.1, 51, 67.3, 63.9, 57,
43.2, 64.7, 62.8, 56.3, 51.7, 39.4, 45.2, 57.8, 55.7, 69.6, 61.5,
50.1, 73.7, 55.5, 65.2, 54.6, 49, 35.2, 52.9, 46.3, 55, 52.5,
54.2, 61, 57.4, 56.5, 53.6, 47.7, 64.2, 53.4, 60.9, 58.2, 60.7,
50.3, 48.3, 74.7, 52.1, 59.9, 52.4, 70.8, 61.2, 66.5, 55.4, 57.5,
59.2, 60.1, 52.3, 60.2, 54.8, 36.3, 61.5, 48.6, 56, 62, 64.8,
40.4, 68.3, 60, 69.1, 56.6, 45.3, 58.5, 52.3, 52), continuous_outcome = c(3636.6,
1128.2, 2007.5, 802.9, 332.3, 2636.1, 169.5, 67.9, 3261.8, 1920.3,
155.2, 1677.2, 198.2, 11189.7, 560.9, 633.1, 196.1, 13.9, 100.7,
7594.5, 1039.8, 83.9, 2646.8, 284.6, 306, 1135.6, 1883.1, 5681.4,
1706.2, 2241.1, 97.7, 1106.8, 1107.1, 290.8, 2123.4, 267, 115.3,
138.5, 152.7, 1338.9, 6709.8, 561.7, 1931.7, 3112.4, 1876.3,
3795.9, 5706.7, 7.4, 1324.9, 4095.4, 205.4, 1886, 177.3, 304.4,
1319.1, 415.9, 537.2, 3141.1, 740, 1976.7, 624.8, 983.1, 1163.5,
1432.6, 3730.4, 2023.4, 498.2, 652.5, 982.7, 1345.3, 138.4, 1505.1,
3528.1, 11.9, 884.5, 10661.6, 1911.4, 2800.8, 81.5, 396.4, 409.1,
417.3, 186, 1892.4, 1689.7, 0, 210.1, 210.5, 3484.5, 3196.8,
57.2, 20.2, 947, 540, 1603.1, 1571.8, 9.1, 149.2, 122, 63.2)), row.names = c(NA,
-100L), class = c("tbl_df", "tbl", "data.frame"))
正如您在小标题中所见,年龄是一个连续变量,测量精度为小数点后一位:
head(mydata)
# A tibble: 6 x 3
ID age continuous_outcome
<dbl> <dbl> <dbl>
1 104 67.1 3637.
2 157 70.7 1128.
3 52 53 2008.
4 152 61.7 803.
5 114 66.1 332.
6 221 57.7 2636.
当我拟合一个简单的线性回归时(现在假设所有假设都是not-violated),我得到以下结果beta-coefficient:
fit <-
lm(formula=continuous_outcome ~ age,
data=mydata)
fit
Call:
lm(formula = continuous_outcome ~ age, data = mydata)
Coefficients:
(Intercept) age
-3400.12 86.06
beta-coefficient 的年龄是 86.06
。这是否意味着,随着年龄的测量精确到小数点后一位,每增加 0.1 岁,我的结果就会增加 86.06?如果是这样,我如何重新调整 age
以便我可以衡量每 5 年或 10 年的年龄影响?
提前致谢!
beta 系数显示因变量(DV,在本例中 continuous_outcome
)随着自变量(IV,在本例中 age
每增加一个单位而增加的量=23=]年).
如果您想显示每 1/10 年的关系,请在拟合模型之前乘以 age
列,或者将 beta 系数除以 10。
根据您的具体要求,由于 beta 系数为 86.06,您可以将其乘以年数以获得连续变量的增加。所以:
- 1 年增长 = +86.06
- 5 年增长 = +430.3 (86.06 * 5)
- 10 年增长 = +860.6 (86.06 * 10)
要回答最后一个问题(每 5 年年龄影响的估计值),则为 430.3,即 86.06 * 5。因此,age
人每增加 5 年, continuous_outcome
平均增加430.3。