Generate slopes for multiple columns across different timepoints in R

Question

My dataset looks like this:

Sub  session  timepoint  col1  col2  ...  coln
001  1        1/2/2000   122   73
001  2        2/7/2008   131   65
002  1        3/5/2002   80    55
002  2        5/8/2020   67    45
003  1        6/7/2011   99    67
003  2        8/8/2019   111   77

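I want to apply lm(y ~ x) to each column and collect the slope, lm(y ~ x)$coefficients[[2]], in a data frame. For example, for Sub 001 the slope for col1 would come from lm((131 - 122) ~ (date(2/7/2008) - date(1/2/2000))).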

The output should look like this:

Sub  col1_lmcoefficient  col2_lmcoefficient  ...  coln_coefficient
001  0.0030              -0.0027             ...
002  -0.0019             -0.0015             ...
003  0.0040              0.0034              ...

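I can't simply convert the data into time differences and column differences and then apply lm, because that doesn't produce a result for every row and every column. Any suggestions on how to proceed with this analysis?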

Answer 1

Here is an approach using tidyverse and broom, based on the tutorial here.

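First, I reshape the data into a longer format so the same method can be applied to every data column. Then I nest the time and value data, map lm over each nested group, and use broom::tidy to extract the coefficients. What remains is filtering and reshaping the output.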

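library(tidyverse)
library(broom)

df1 %>%
  # Long format: one row per Sub, timepoint, and measured column
  pivot_longer(-c(Sub:timepoint)) %>%
  # One nested data frame per Sub/column pair
  nest(data = -c(Sub, name)) %>%
  # Fit value ~ timepoint per group; timepoint is a Date, so the slope is per day
  mutate(fit = map(data, ~ lm(value ~ timepoint, data = .x)),
         tidied = map(fit, tidy)) %>%
  unnest(tidied) %>%
  # Keep only the slope term, then spread back to one column per variable
  filter(term == "timepoint") %>%
  select(Sub, name, estimate) %>%
  pivot_wider(names_from = name, values_from = estimate)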

# A tibble: 3 x 3
    Sub     col1     col2
  <int>    <dbl>    <dbl>
1     1  0.00304 -0.00270
2     2 -0.00196 -0.00151
3     3  0.00402  0.00335
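
As a cross-check on the numbers above, here is a minimal base R sketch of the same idea. It is not part of the original answer: the helper name slope_lm and the objects value_cols, slopes, and check are made up for illustration, and the data frame is rebuilt from the table in the question on the assumption that the dates are month/day/year. It also shows that with exactly two sessions per subject, the lm slope is simply the change in value divided by the elapsed days.

```r
# Sample data rebuilt from the question's table (dates assumed to be m/d/Y).
df1 <- data.frame(
  Sub = c(1L, 1L, 2L, 2L, 3L, 3L),
  session = c(1L, 2L, 1L, 2L, 1L, 2L),
  timepoint = as.Date(c("2000-01-02", "2008-02-07", "2002-03-05",
                        "2020-05-08", "2011-06-07", "2019-08-08")),
  col1 = c(122L, 131L, 80L, 67L, 99L, 111L),
  col2 = c(73L, 65L, 55L, 45L, 67L, 77L)
)

# Treat every column that is not an identifier as a measurement column.
value_cols <- setdiff(names(df1), c("Sub", "session", "timepoint"))

# Hypothetical helper: fit one column against time for one subject and
# return the slope. as.numeric() on a Date gives days, so slopes are per day.
slope_lm <- function(d, col) {
  fit <- lm(d[[col]] ~ as.numeric(d$timepoint))
  unname(coef(fit)[2])
}

# One row per subject, one column per measurement.
slopes <- t(sapply(split(df1, df1$Sub), function(d) {
  sapply(value_cols, function(col) slope_lm(d, col))
}))

# With two sessions per subject, the slope reduces to change / elapsed days.
check <- t(sapply(split(df1, df1$Sub), function(d) {
  d <- d[order(d$timepoint), ]
  sapply(d[value_cols], diff) / as.numeric(diff(d$timepoint))
}))

round(slopes, 5)
all.equal(slopes, check)
```

Because timepoint is a Date, these slopes are in units per day. Multiplying by 365.25 gives an approximate per-year rate if that is easier to interpret.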

Sample data:

df1 <- structure(list(Sub = c(1L, 1L, 2L, 2L, 3L, 3L), session = c(1L, 
2L, 1L, 2L, 1L, 2L), timepoint = structure(c(10958, 13916, 11751, 
18390, 15132, 18116), class = "Date"), col1 = c(122L, 131L, 80L, 
67L, 99L, 111L), col2 = c(73L, 65L, 55L, 45L, 67L, 77L)), class = "data.frame", row.names = c(NA, 
-6L))
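
The sample data already stores timepoint as a Date. If your own timepoint column is text such as 1/2/2000, as it appears in the question, it probably needs converting first. The sketch below assumes the strings are month/day/year, which matches the Date values in the sample data. The data frame name raw is made up for illustration.

```r
# Two rows for Sub 001 with timepoint still stored as text.
raw <- data.frame(
  Sub = c("001", "001"),
  timepoint = c("1/2/2000", "2/7/2008"),
  col1 = c(122L, 131L),
  stringsAsFactors = FALSE
)

# Parse the text as month/day/year (an assumption about the input format).
raw$timepoint <- as.Date(raw$timepoint, format = "%m/%d/%Y")

# Elapsed time between the two sessions.
diff(raw$timepoint)
# Time difference of 2958 days
```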
