Stratified log-rank test in R for counting process form data?
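Background: patients are followed up every six months for 4 years and may switch drugs during that time. To account for this, I converted the survival data to counting process form. I want to compare the survival curves of drug groups A, B, and C. I am using an extended Cox model, but I would like to run pairwise comparisons of the hazard functions or a stratified log-rank test. I believe pairwise_survdiff throws an error because of the form of my data.
Example data: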
x<-data.frame(tstart=rep(seq(0,18,6),3),tstop=rep(seq(6,24,6),3), rx = rep(c("A","B","C"),4), death=c(rep(0,11),1))
x
Problem:
When using survdiff from the survival package,
survdiff(Surv(tstart,tstop,death) ~ rx, data = x)
I get the error:
Error in survdiff(Surv(tstart, tstop, death) ~ rx, data = x) :
Right censored data only
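I think this stems from the counting process form, since I cannot find any examples online that compare survival curves with time-varying covariates.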
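One way to check this suspicion is to look at the type of the Surv object. The following is a minimal sketch using the example data above: a three-argument Surv() call yields an object of type "counting", while the two-argument form yields "right", which is the type the error message asks for. The two-argument object is built only to show the type; it is not a valid analysis of these data.

```r
library(survival)

# same example data as above
x <- data.frame(tstart = rep(seq(0, 18, 6), 3),
                tstop  = rep(seq(6, 24, 6), 3),
                rx     = rep(c("A", "B", "C"), 4),
                death  = c(rep(0, 11), 1))

# (start, stop, event) gives a counting-process Surv object
s_counting <- Surv(x$tstart, x$tstop, x$death)
# (time, event) gives an ordinary right-censored Surv object
s_right <- Surv(x$tstop, x$death)

attr(s_counting, "type")  # "counting"
attr(s_right, "type")     # "right"
```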
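Question: Is there a quick fix for this? Alternatively, is there another package/function with the same generality for comparing survival curves, i.e. one that offers the different test methods? How can I run a stratified log-rank test with survdiff on counting process form data?
Note: this was flagged as a known issue in the survminer package, see the GitHub issue here, but updating survminer did not solve my problem. Using a single time interval, tstop - tstart, is not correct either, because that would leave e.g. several 6-month entries per patient rather than the actual at-risk intervals.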
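So, here is an example that fits the model and performs multiple comparisons using the multcomp package. Note that this implicitly assumes that treatments A-C are assigned at random. Depending on the assumptions about the process, it might be better to fit a multi-state model with transitions between the treatments and the outcome.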
library(purrr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(survival)
library(multcomp)
#> Loading required package: mvtnorm
#> Loading required package: TH.data
#> Loading required package: MASS
#>
#> Attaching package: 'MASS'
#> The following object is masked from 'package:dplyr':
#>
#> select
#>
#> Attaching package: 'TH.data'
#> The following object is masked from 'package:MASS':
#>
#> geyser
# simulate survival data
set.seed(123)
n <- 200
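df <- data.frame(
  id    = rep(1:n, each = 8),
  # eight 6-month intervals for each of the n subjects
  start = rep(seq(0, 42, by = 6), times = n),
  stop  = rep(seq(6, 48, by = 6), times = n),
  rx    = sample(LETTERS[1:3], n * 8, replace = TRUE))
# piecewise-constant hazard: lower on A, identical on B and C
df$hazard <- exp(-3.5 - 1 * (df$rx == "A") + .5 * (df$rx == "B") +
  .5 * (df$rx == "C"))
# draw one event time per subject from a piecewise-exponential distribution
df_surv <- data.frame(id = 1:n)
df_surv$time <- split(df, f = df$id) %>%
  map_dbl(~msm::rpexp(n = 1, rate = .x$hazard, t = .x$start))
df <- df %>% left_join(df_surv)
#> Joining, by = "id"
# keep intervals up to the event and flag the interval in which it occurs
df <- df %>%
  mutate(status = 1L * (time <= stop)) %>%
  filter(start <= time)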
df %>% head()
#>   id start stop rx     hazard     time status
#> 1  1     0    6  A 0.01110900 13.78217      0
#> 2  1     6   12  C 0.04978707 13.78217      0
#> 3  1    12   18  B 0.04978707 13.78217      1
#> 4  2     0    6  B 0.04978707 22.37251      0
#> 5  2     6   12  B 0.04978707 22.37251      0
#> 6  2    12   18  C 0.04978707 22.37251      0
# fit the model
model <- coxph(Surv(start, stop, status)~rx, data = df)
# define pairwise comparison
glht_rx <- multcomp::glht(model, linfct=multcomp::mcp(rx="Tukey"))
glht_rx
#>
#> General Linear Hypotheses
#>
#> Multiple Comparisons of Means: Tukey Contrasts
#>
#>
#> Linear Hypotheses:
#>            Estimate
#> B - A == 0  1.68722
#> C - A == 0  1.60902
#> C - B == 0 -0.07819
# perform multiple comparisons
# (adjusts for multiple comparisons + takes into account correlation of coefficients -> more power than e.g. bonferroni)
smry_rx <- summary(glht_rx)
smry_rx # -> B and C different to A, but not from each other
#>
#> Simultaneous Tests for General Linear Hypotheses
#>
#> Multiple Comparisons of Means: Tukey Contrasts
#>
#>
#> Fit: coxph(formula = Surv(start, stop, status) ~ rx, data = df)
#>
#> Linear Hypotheses:
#>            Estimate Std. Error z value Pr(>|z|)
#> B - A == 0  1.68722    0.28315   5.959   <1e-05 ***
#> C - A == 0  1.60902    0.28405   5.665   <1e-05 ***
#> C - B == 0 -0.07819    0.16509  -0.474     0.88
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> (Adjusted p values reported -- single-step method)
# confidence intervals
plot(smry_rx)
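If numeric intervals are preferred over the plot, the simultaneous confidence intervals can also be put on the hazard-ratio scale. The following is a minimal, self-contained sketch under my own assumptions: the small data set `d` and its per-interval event probabilities are made up for illustration and are not the simulation used above.

```r
library(survival)
library(multcomp)

# hypothetical counting-process data: four 6-month intervals per subject
set.seed(42)
n <- 150
d <- data.frame(
  id    = rep(seq_len(n), each = 4),
  start = rep(seq(0, 18, by = 6), times = n),
  stop  = rep(seq(6, 24, by = 6), times = n),
  rx    = factor(sample(c("A", "B", "C"), n * 4, replace = TRUE))
)
# assumed event probability per interval: lower while on drug A
d$status <- rbinom(nrow(d), 1, ifelse(d$rx == "A", 0.05, 0.20))
# keep each subject's rows up to and including the first event
d <- d[ave(d$status, d$id, FUN = function(s) cumsum(cumsum(s))) <= 1, ]

fit <- coxph(Surv(start, stop, status) ~ rx, data = d)

# simultaneous 95% intervals for all pairwise log hazard ratios
ci <- confint(glht(fit, linfct = mcp(rx = "Tukey")))

# exponentiate to obtain hazard ratios with lower and upper limits
hr <- exp(ci$confint)
hr
```

The columns of `hr` are the estimate, lower limit and upper limit for B vs. A, C vs. A and C vs. B.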
Created on 2019-04-01 by the reprex package (v0.2.1)
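As for the stratified log-rank test itself: the score test of a Cox model with a single categorical covariate corresponds to the log-rank test (the two agree exactly when there are no tied event times), and coxph accepts counting process data. Adding a strata() term should therefore give a stratified analogue. Below is a minimal sketch under my own assumptions: the data set `d`, the event probabilities and the stratification variable `site` are hypothetical and only serve as illustration.

```r
library(survival)

# hypothetical counting-process data: four 6-month intervals per subject
set.seed(1)
n <- 100
d <- data.frame(
  id    = rep(seq_len(n), each = 4),
  start = rep(seq(0, 18, by = 6), times = n),
  stop  = rep(seq(6, 24, by = 6), times = n),
  rx    = factor(sample(c("A", "B", "C"), n * 4, replace = TRUE)),
  # hypothetical stratification variable, constant within subject
  site  = factor(rep(sample(c("s1", "s2"), n, replace = TRUE), each = 4))
)
# assumed event probability per interval: lower while on drug A
d$status <- rbinom(nrow(d), 1, ifelse(d$rx == "A", 0.05, 0.20))
# keep each subject's rows up to and including the first event
d <- d[ave(d$status, d$id, FUN = function(s) cumsum(cumsum(s))) <= 1, ]

# Cox model on (start, stop] data, stratified by site
fit <- coxph(Surv(start, stop, status) ~ rx + strata(site), data = d)

# score test: the log-rank-type test of "no difference between A, B and C"
sc <- summary(fit)$sctest
sc
```

`sc` is a named vector with the test statistic, its degrees of freedom (2 for three groups) and the p-value. Because the event times in this sketch are heavily tied, the value would only approximate what a dedicated log-rank routine reports.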