ggsurvplot_facet returns: "Error in grDevices::col2rgb(colour, TRUE) : invalid color name" when used inside a function

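I am trying to use the ggsurvplot_facet() function to plot survival curves for several variables, faceted by the variable sex. When I apply my code to a single fitted model, it works fine. However, when I use the same code inside a function or a for loop, it fails to plot all the survival curves it should and returns an error. I would do this within ggsurvplot_facet() itself if it accepted a list of survival fits, as ggsurvplot() does, but ggsurvplot_facet() only accepts a single survival fit at a time.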

I am running my code on a 2018 MacBook Pro with macOS High Sierra.

Consider the following dataset: http://s000.tinyupload.com/index.php?file_id=01704535336107726906

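It contains observations from multiple visits for 100 subjects and 4 different variables. Two of the variables (variable1 and variable2) can take two different values (0 or 1), and the other two (variable3 and variable4) can take three different values (0, 1 or 2).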

I started with the variables that can take two different values and wrote the following code:

# Load libraries
require(mgcv)
require(msm)
library(dplyr)
library(grDevices)
library(survival)
library(survminer)


# Set working directory
dirname<-dirname(rstudioapi::getSourceEditorContext()$path)
setwd(dirname)


load("ggsurvplot_facet_error.rda")


fit_test <- survfit(
  Surv(follow_up, as.numeric(status)) ~ (sex + variable1), data = data)

plot_test <- ggsurvplot_facet(fit_test,
                                     data = data,
                                     pval = TRUE,
                                     conf.int = TRUE,
                                     surv.median.line = "hv", # Specify median survival
                                     break.time.by = 1,
                                     facet.by = "sex",
                                     ggtheme = theme_bw(), # Change ggplot2 theme
                                     palette = "aaas",
                                     legend = "bottom",
                                     xlab = "Time (years)",
                                     ylab = "Death probability",
                                     panel.labs = list(sex_recoded=c("Male", "Female")),
                                     legend.labs = c("A", "B")
) 

plot_test

This code works well and produces the following plot:

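However, when I try to turn this code into a function or a for loop so that the same code is applied to both variable1 and variable2, it always fails at the color/palette part of the plotting step.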

# Variables_with_2_categories:  variable1 and variable2
two <- c("variable1", "variable2")

## TEST #1: USING A FUNCTION

fit_plot_function <- function(x) {

# FIT part of the function
  two.i <- two[i]

fit_temp <- survfit(Surv(as.numeric(follow_up), as.numeric(status)) ~ 
                        sex + eval(as.name(paste0(two.i))), data = data)

# PLOT part of the function
  plot_temp <- ggsurvplot_facet(fit_temp,
                                data = data,
                                pval = TRUE,
                                conf.int = TRUE,
                                surv.median.line = "hv", # Specify median survival
                                break.time.by = 1,
                                facet.by = "sex",
                                ggtheme = theme_bw(), # Change ggplot2 theme
                                palette = "aaas",
                                legend = "bottom",
                                xlab = "Time (years)",
                                ylab = "Death probability",
                                panel.labs = list(sex_recoded=c("Male", "Female")),
                                legend.labs = rep(c("A", "B"),2)
  ) 
}


fit_plot_function(two)
# Warning message:
#  Now, to change color palette, use the argument palette= 
#  'eval(as.name(paste0(two.i)))' instead of color = 'eval(as.name(paste0(two.i)))' 

print(plot_temp)

# Error in grDevices::col2rgb(colour, TRUE) : 
#  invalid color name 'eval(as.name(paste0(two.i)))'

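It seems that the variable name is not recognized when it is resolved from a vector and evaluated inside the formula. Exactly the same happens with a for loop: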

## TEST #2: USING A FOR LOOP

n.two <- length(two)

for(i in 1:n.two) {
  two.i <- two[i]

  fit_temp <- survfit(Surv(as.numeric(follow_up), as.numeric(status)) ~ 
                        (sex + eval(as.name(paste0(two.i)))), data = data)



  plot_temp <- ggsurvplot_facet(fit_temp,
                                data = data,
                                pval = TRUE,
                                conf.int = TRUE,
                                surv.median.line = "hv", # Specify median survival
                                break.time.by = 1,
                                facet.by = "sex",
                                ggtheme = theme_bw(), # Change ggplot2 theme
                                palette = "aaas",
                                legend = "bottom",
                                xlab = "Time (years)",
                                ylab = "Death probability",
                                panel.labs = list(sex_recoded=c("Male", "Female")),
                                legend.labs = rep(c("A", "B"),2)
    ) 
}

print(plot_temp)

# ERROR: Now, to change color palette, use the argument palette= 'eval(as.name(paste0(two.i)))' 
# instead of color = 'eval(as.name(paste0(two.i)))
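
The error message quotes the unevaluated expression, which suggests that the grouping variable is being identified by the text of the formula term rather than by its value. The following minimal sketch uses base R only (no survival fit, no plotting) to show how the term labels differ when the formula is built from a string instead. Whether ggsurvplot_facet() then works with such a formula inside a function is an assumption that is not verified here.

```r
# Column names follow the question; terms() is purely symbolic,
# so neither Surv() nor the data need to exist for this check.
two.i <- "variable1"

# Formula as written in the question: the term label is the expression text
f_eval <- Surv(follow_up, as.numeric(status)) ~ sex + eval(as.name(paste0(two.i)))
labels_eval <- attr(terms(f_eval), "term.labels")
print(labels_eval)
# "sex" "eval(as.name(paste0(two.i)))"

# Formula built from a string: the term label is the column name itself
f_str <- as.formula(paste("Surv(follow_up, as.numeric(status)) ~ sex +", two.i))
labels_str <- attr(terms(f_str), "term.labels")
print(labels_str)
# "sex" "variable1"
```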

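As an additional comment, it would be great if I could apply the same code to variables with either two or three different values at once, instead of having to write a different function for each kind of variable.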

Thank you very much for your help,

Best regards,

Yatrosin

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] survminer_0.4.3.999 ggpubr_0.2          magrittr_1.5        ggplot2_3.1.1       survival_2.44-1.1  
[6] dplyr_0.8.0.1       msm_1.6.7           mgcv_1.8-27         nlme_3.1-137       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1        pillar_1.3.1      compiler_3.5.1    plyr_1.8.4        tools_3.5.1       digest_0.6.18    
 [7] tibble_2.1.1      gtable_0.3.0      lattice_0.20-38   pkgconfig_2.0.2   rlang_0.3.4       Matrix_1.2-17    
[13] ggsci_2.9         rstudioapi_0.10   cmprsk_2.2-7      yaml_2.2.0        mvtnorm_1.0-10    expm_0.999-4     
[19] xfun_0.6          gridExtra_2.3     knitr_1.22        withr_2.1.2       survMisc_0.5.5    generics_0.0.2   
[25] grid_3.5.1        tidyselect_0.2.5  data.table_1.12.2 glue_1.3.1        KMsurv_0.1-5      R6_2.4.0         
[31] km.ci_0.5-2       purrr_0.3.2       tidyr_0.8.3       scales_1.0.0      backports_1.1.4   splines_3.5.1    
[37] assertthat_0.2.1  xtable_1.8-3      colorspace_1.4-1  labeling_0.3      lazyeval_0.2.2    munsell_0.5.0    
[43] broom_0.5.2       crayon_1.3.4      zoo_1.8-5   

It's time to purrr. What you want can be done with purrr. You can read about making ggplot2 purrr here and more examples here.

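First, we need to convert your data to long format using tidyr::gather. We keep every column of the data frame as it is except variable1, variable2, variable3 and variable4, which are melted into a name column (num) and a value column (variable).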

library(tidyr)
library(dplyr)
library(purrr)

data %>% 
  gather(num, variable, -sample_id,  -sex,
         -visit_number, -age_at_enrollment,
         -follow_up, -status) %>% 
  mutate(num2 = num) %>% # We'll need this column later for the titles
  as_tibble() -> long_data


# A tibble: 2,028 x 8
   sample_id   sex    visit_number age_at_enrollment follow_up status num       variable
   <fct>       <fct>  <fct>                    <dbl>     <dbl> <fct>  <chr>        <int>
 1 sample_0001 Female 1                         56.7     0     1      variable1        0
 2 sample_0001 Female 2                         57.7     0.920 1      variable1        0
 3 sample_0001 Female 3                         58.6     1.90  1      variable1        0
 4 sample_0001 Female 4                         59.7     2.97  2      variable1        0
 5 sample_0001 Female 5                         60.7     4.01  1      variable1        0
 6 sample_0001 Female 6                         61.7     4.99  1      variable1        0
 7 sample_0002 Female 1                         55.9     0     1      variable1        1
 8 sample_0002 Female 2                         56.9     1.04  1      variable1        1
 9 sample_0002 Female 3                         58.0     2.15  1      variable1        1
10 sample_0002 Female 4                         59.0     3.08  1      variable1        1
# ... with 2,018 more rows
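
In tidyr 1.0.0 and later, pivot_longer() is the recommended replacement for gather(). The sketch below shows the same reshaping on a small made-up data frame; the toy values are hypothetical, and it assumes a tidyr version newer than the 0.8.3 listed in the question's sessionInfo().

```r
library(tidyr)

# Toy stand-in for the question's data (hypothetical values)
toy <- data.frame(
  sample_id = c("s1", "s2"),
  sex       = c("Female", "Male"),
  variable1 = c(0L, 1L),
  variable3 = c(2L, 0L)
)

# Melt every column whose name starts with "variable";
# all other columns are kept and repeated for each melted row
long_toy <- pivot_longer(
  toy,
  cols      = starts_with("variable"),
  names_to  = "num",
  values_to = "variable"
)
print(long_toy)
```

Each subject contributes one row per melted variable, so the 2-row toy table becomes 4 rows.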

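Now we need to convert our long data frame into a nested data frame and map! Be careful with ggsurvplot: this function does not support the tibbles created by nest(), so each one has to be converted to a plain data.frame first.
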
long_data %>% 
  group_by(num) %>% 
  nest() %>% 
  mutate(
    # Run survfit() for every variable
    fit_f = map(data, ~survfit(Surv(follow_up, as.numeric(status)) ~ (sex + variable), data = .)),
    # Create survplot for every variable and survfit
    plots = map2(fit_f, data, ~ggsurvplot(.x,
                                          as.data.frame(.y), # Important! convert from tibble to data.frame 
                                          pval = TRUE,
                                          conf.int = TRUE,
                                          facet.by = "sex",
                                          surv.median.line = "hv", 
                                          break.time.by = 1,
                                          ggtheme = theme_bw(),
                                          palette = "aaas",
                                          xlab = "Time (years)",
                                          ylab = "Death probability") +
                   ggtitle(paste0("This is plot of ", unique(.y$num2))) + # Add a title (one value per nested group)
                   theme(legend.position = "bottom"))) -> plots
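
The nest-then-map pattern used above can be tried without the survival data. This is a minimal sketch on a made-up data frame, with lm() standing in for survfit(); the toy columns x and y and the slope column are hypothetical names introduced only for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Toy long-format data: one block of rows per variable name
toy_long <- data.frame(
  num = rep(c("variable1", "variable2"), each = 4),
  x   = rep(1:4, times = 2),
  y   = c(1, 2, 3, 4, 2, 4, 6, 8)
)

nested <- toy_long %>%
  group_by(num) %>%
  nest() %>%                                   # one row per variable, data in a list column
  mutate(
    fit   = map(data, ~ lm(y ~ x, data = .x)),  # one model per nested data frame
    slope = map_dbl(fit, ~ coef(.x)[["x"]])     # pull a single number out of each model
  )

print(nested$slope)
```

The list columns keep each fitted object next to the data it came from, which is why map2() can later pair every fit with its own data frame.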

Now you can display your plots by typing:

plots$plots[[1]]
plots$plots[[2]]
plots$plots[[3]] 
plots$plots[[4]] # plotted below

And save all the plots using map2():

map2(paste0(unique(long_data$num), ".pdf"), plots$plots, ggsave)
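
Since saving is done only for its side effect, purrr::walk2() may be a slightly tidier choice than map2(), because it does not print a list of return values. A minimal sketch with two made-up ggplot objects written to a temporary directory follows; the plots, file names, and sizes are hypothetical.

```r
library(ggplot2)
library(purrr)

# Two toy plots standing in for plots$plots
toy_plots <- list(
  variable1 = ggplot(mtcars, aes(wt, mpg)) + geom_point(),
  variable2 = ggplot(mtcars, aes(hp, mpg)) + geom_point()
)

# One PDF per plot, named after the list element
files <- file.path(tempdir(), paste0(names(toy_plots), ".pdf"))

# Name the ggsave() arguments explicitly instead of relying on their position
walk2(files, toy_plots, ~ ggsave(filename = .x, plot = .y, width = 6, height = 4))

print(file.exists(files))
```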

Update

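Unfortunately, I could not figure out how to change the legend labels inside the call. The only solution I can suggest is the following. Remember that plots$plots[[…]] is a ggplot object, so you can change everything afterwards. For example, to change the legend labels I just need to add scale_fill_discrete or scale_color_discrete. The same can be done with the title, labs, theme and so on.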

library(ggsci) # to add aaas color palette

plots$plots[[3]] +
  labs(title = "Variable 3",
       subtitle = "You just have to be the best") +
  ggsci::scale_color_aaas(guide = "none") +       # hide the colour legend
  ggsci::scale_fill_aaas(labels = LETTERS[1:3])   # relabel the fill legend