ggsurvplot_facet returns: "Error in grDevices::col2rgb(colour, TRUE) : invalid color name" when used inside a function
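I am trying to use the ggsurvplot_facet() function to plot survival curves for several variables, faceted by sex. When I apply my code to a single fitted model, it works fine. However, when I use the same code inside a function or a for loop, it fails to plot all the survival curves it should and returns an error. If ggsurvplot_facet() accepted a list of survfit objects, as ggsurvplot() does, I would do the plotting in ggsurvplot_facet() itself, but it only accepts one survfit object at a time.
I am running my code on a 2018 MacBook Pro with macOS High Sierra.
Consider the following dataset: http://s000.tinyupload.com/index.php?file_id=01704535336107726906
It contains observations from multiple visits for 100 subjects and 4 different variables. Two of the variables (variable1 and variable2) can take two different values (0 or 1), and the other two (variable3 and variable4) can take three different values (0, 1 or 2).
I started with the variables that can take two different values and wrote the following code: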
# Load libraries
require(mgcv)
require(msm)
library(dplyr)
library(grDevices)
library(survival)
library(survminer)
# Set working directory
dirname<-dirname(rstudioapi::getSourceEditorContext()$path)
setwd(dirname)
load("ggsurvplot_facet_error.rda")
fit_test <- survfit(
  Surv(follow_up, as.numeric(status)) ~ (sex + variable1), data = data)
plot_test <- ggsurvplot_facet(fit_test,
  data = data,
  pval = TRUE,
  conf.int = TRUE,
  surv.median.line = "hv", # Specify median survival
  break.time.by = 1,
  facet.by = "sex",
  ggtheme = theme_bw(), # Change ggplot2 theme
  palette = "aaas",
  legend = "bottom",
  xlab = "Time (years)",
  ylab = "Death probability",
  panel.labs = list(sex_recoded=c("Male", "Female")),
  legend.labs = c("A", "B")
)
plot_test
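This code works fine and produces the following plot:
However, when I try to turn this code into a function or a for loop, so that the same code is applied to variable1 and variable2, the colour/palette part of the plotting step always fails.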
# Variables_with_2_categories: variable1 and variable2
two <- c("variable1", "variable2")
## TEST #1: USING A FUNCTION
fit_plot_function <- function(x) {
  # FIT part of the function
  two.i <- two[i]
  fit_temp <- survfit(Surv(as.numeric(follow_up), as.numeric(status)) ~
    sex + eval(as.name(paste0(two.i))), data = data)
  # PLOT part of the function
  plot_temp <- ggsurvplot_facet(fit_temp,
    data = data,
    pval = TRUE,
    conf.int = TRUE,
    surv.median.line = "hv", # Specify median survival
    break.time.by = 1,
    facet.by = "sex",
    ggtheme = theme_bw(), # Change ggplot2 theme
    palette = "aaas",
    legend = "bottom",
    xlab = "Time (years)",
    ylab = "Death probability",
    panel.labs = list(sex_recoded=c("Male", "Female")),
    legend.labs = rep(c("A", "B"),2)
  )
}
fit_plot_function(two)
# Warning message:
# Now, to change color palette, use the argument palette=
# 'eval(as.name(paste0(two.i)))' instead of color = 'eval(as.name(paste0(two.i)))'
print(plot_temp)
# Error in grDevices::col2rgb(colour, TRUE) :
# invalid color name 'eval(as.name(paste0(two.i)))'
It seems that the variable name is not recognised when it is looked up from a vector and evaluated. Exactly the same happens with the for loop:
## TEST #2: USING A FOR LOOP
n.two <- length(two)
for(i in 1:n.two) {
  two.i <- two[i]
  fit_temp <- survfit(Surv(as.numeric(follow_up), as.numeric(status)) ~
    (sex + eval(as.name(paste0(two.i)))), data = data)
  plot_temp <- ggsurvplot_facet(fit_temp,
    data = data,
    pval = TRUE,
    conf.int = TRUE,
    surv.median.line = "hv", # Specify median survival
    break.time.by = 1,
    facet.by = "sex",
    ggtheme = theme_bw(), # Change ggplot2 theme
    palette = "aaas",
    legend = "bottom",
    xlab = "Time (years)",
    ylab = "Death probability",
    panel.labs = list(sex_recoded=c("Male", "Female")),
    legend.labs = rep(c("A", "B"),2)
  )
}
print(plot_temp)
# ERROR: Now, to change color palette, use the argument palette= 'eval(as.name(paste0(two.i)))'
# instead of color = 'eval(as.name(paste0(two.i)))
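As an additional comment, it would be great if I could apply the same code to the variables with two and with three different values at once, instead of having to write a different function for each of them.
Thank you very much for your help,
Best regards,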
Yatrosin
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] survminer_0.4.3.999 ggpubr_0.2 magrittr_1.5 ggplot2_3.1.1 survival_2.44-1.1
[6] dplyr_0.8.0.1 msm_1.6.7 mgcv_1.8-27 nlme_3.1-137
loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 pillar_1.3.1 compiler_3.5.1 plyr_1.8.4 tools_3.5.1 digest_0.6.18
[7] tibble_2.1.1 gtable_0.3.0 lattice_0.20-38 pkgconfig_2.0.2 rlang_0.3.4 Matrix_1.2-17
[13] ggsci_2.9 rstudioapi_0.10 cmprsk_2.2-7 yaml_2.2.0 mvtnorm_1.0-10 expm_0.999-4
[19] xfun_0.6 gridExtra_2.3 knitr_1.22 withr_2.1.2 survMisc_0.5.5 generics_0.0.2
[25] grid_3.5.1 tidyselect_0.2.5 data.table_1.12.2 glue_1.3.1 KMsurv_0.1-5 R6_2.4.0
[31] km.ci_0.5-2 purrr_0.3.2 tidyr_0.8.3 scales_1.0.0 backports_1.1.4 splines_3.5.1
[37] assertthat_0.2.1 xtable_1.8-3 colorspace_1.4-1 labeling_0.3 lazyeval_0.2.2 munsell_0.5.0
[43] broom_0.5.2 crayon_1.3.4 zoo_1.8-5
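It is time to purrr. With purrr you can do what you want. You can read about making ggplot2 plots with purrr here, and find more examples here.
First, we need to convert your data to long format with tidyr::gather. We keep every column of the data frame as it is except variables 1, 2, 3 and 4, which get melted into a single column.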
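As a side note on the error itself: the message quotes the text eval(as.name(paste0(two.i))) as a colour name, which suggests (this is my assumption about what happens inside survminer, not something I have checked in its source) that the plotting code works with the term label taken from the formula rather than with the evaluated column. The base R sketch below only shows how the two ways of writing the formula differ in their term labels; `y` is a hypothetical placeholder response.

```r
# Variable names as in the question; two.i stands for one loop iteration.
two <- c("variable1", "variable2")
two.i <- two[1]

# Term labels come from the formula text, not from the evaluated value,
# so the whole eval(...) expression becomes the label.
literal <- y ~ sex + eval(as.name(two.i))
literal_labels <- attr(terms(literal), "term.labels")

# Building the formula from strings puts the real column name into the term.
built <- as.formula(paste("y ~ sex +", two.i))
built_labels <- attr(terms(built), "term.labels")

print(literal_labels)
print(built_labels)
```

The reshaping approach below sidesteps the problem in a different way, because every fit uses the same fixed column name (`variable`) in its formula.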
library(tidyr)
library(dplyr)
library(purrr)
data %>%
  gather(num, variable, -sample_id, -sex,
         -visit_number, -age_at_enrollment,
         -follow_up, -status) %>%
  mutate(num2 = num) %>% # We'll need this column later for the titles
  as_tibble() -> long_data
# A tibble: 2,028 x 8
sample_id sex visit_number age_at_enrollment follow_up status num variable
<fct> <fct> <fct> <dbl> <dbl> <fct> <chr> <int>
1 sample_0001 Female 1 56.7 0 1 variable1 0
2 sample_0001 Female 2 57.7 0.920 1 variable1 0
3 sample_0001 Female 3 58.6 1.90 1 variable1 0
4 sample_0001 Female 4 59.7 2.97 2 variable1 0
5 sample_0001 Female 5 60.7 4.01 1 variable1 0
6 sample_0001 Female 6 61.7 4.99 1 variable1 0
7 sample_0002 Female 1 55.9 0 1 variable1 1
8 sample_0002 Female 2 56.9 1.04 1 variable1 1
9 sample_0002 Female 3 58.0 2.15 1 variable1 1
10 sample_0002 Female 4 59.0 3.08 1 variable1 1
# ... with 2,018 more rows
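In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). A minimal sketch of the same reshaping step is shown here on a made-up two-row data frame (`toy` and its values are purely illustrative, not taken from the real dataset). Note that pivot_longer() orders the rows by subject rather than by variable, which should not matter for the grouping step that follows.

```r
library(tidyr)

# Toy stand-in for the real data (values are made up for illustration only).
toy <- data.frame(
  sample_id = c("sample_0001", "sample_0002"),
  sex       = c("Female", "Male"),
  variable1 = c(0L, 1L),
  variable3 = c(2L, 0L)
)

# Melt every column whose name starts with "variable" into two columns:
# "num" holds the old column name, "variable" holds the value.
long_toy <- pivot_longer(
  toy,
  cols      = starts_with("variable"),
  names_to  = "num",
  values_to = "variable"
)

print(long_toy)
```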
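Now we need to convert our long data frame into a nested data frame so that we can map over it. Use ggsurvplot with care: this function does not support the tibbles created by nest(), so each one has to be converted to a data.frame first.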
long_data %>%
  group_by(num) %>%
  nest() %>%
  mutate(
    # Run survfit() for every variable
    fit_f = map(data, ~survfit(Surv(follow_up, as.numeric(status)) ~ (sex + variable), data = .)),
    # Create survplot for every variable and survfit
    plots = map2(fit_f, data, ~ggsurvplot(.x,
        as.data.frame(.y), # Important! convert from tibble to data.frame
        pval = TRUE,
        conf.int = TRUE,
        facet.by = "sex",
        surv.median.line = "hv",
        break.time.by = 1,
        ggtheme = theme_bw(),
        palette = "aaas",
        xlab = "Time (years)",
        ylab = "Death probability") +
      ggtitle(paste0("This is plot of ", .y$num2[1])) + # Add a title (num2 is constant within a group)
      theme(legend.position = "bottom"))) -> plots
Now you can display your plots by typing:
plots$plots[[1]]
plots$plots[[2]]
plots$plots[[3]]
plots$plots[[4]] # plotted below
And save all the plots with map2():
map2(paste0(unique(long_data$num), ".pdf"), plots$plots, ggsave)
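Since saving is done only for its side effect, purrr::walk2() may read a little cleaner than map2() here, as it does not print a list of results. The sketch below uses two throwaway plots of the built-in mtcars data and writes to tempdir(); `demo_plots` and `out_files` are illustrative names, and with the real objects the inputs would be the file names and plots$plots.

```r
library(ggplot2)
library(purrr)

# Two stand-in plots; in the answer these would be plots$plots.
demo_plots <- list(
  a = ggplot(mtcars, aes(wt, mpg)) + geom_point(),
  b = ggplot(mtcars, aes(hp, mpg)) + geom_point()
)

# One output file per plot, written to a temporary directory.
out_files <- file.path(tempdir(), paste0(names(demo_plots), ".pdf"))

# walk2() calls ggsave() once per (file, plot) pair and discards the results.
walk2(out_files, demo_plots,
      ~ ggsave(filename = .x, plot = .y, width = 6, height = 4))
```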
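Update
Unfortunately, I do not know how to change the legend labels inside the call itself. The only solution I can suggest is the following. Remember that plots$plots[[…]] is a ggplot object, so you can change everything afterwards. For example, to change the legend labels I just need to add scale_fill_discrete and scale_color_discrete. The same can be done for the title, labs, theme and so on.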
library(ggsci) # to add aaas color palette
plots$plots[[3]] +
  labs(title = "Variable 3",
       subtitle = "You just have to be the best") +
  ggsci::scale_color_aaas(guide = "none") + # hide the duplicate colour legend
  ggsci::scale_fill_aaas(labels = LETTERS[1:3])
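For completeness, here is a loop-style sketch that stays closer to the code in the question. It builds each formula from strings and then stores the formula in the fitted object's call, a workaround that is commonly suggested for survminer inside functions and loops. I have not run it against the dataset from the question: the example uses survival::veteran as a stand-in, with `trt` playing the role of sex and `prior` (2 levels) and `celltype` (4 levels) playing the role of variable1 to variable4. `fit_one` is a hypothetical helper name.

```r
library(survival)

# Stand-in grouping variables from survival::veteran.
vars <- c("prior", "celltype")

fit_one <- function(v, data) {
  # Build the formula from strings so the term is the real column name.
  form <- as.formula(paste("Surv(time, status) ~ trt +", v))
  fit <- survfit(form, data = data)
  # Store the formula itself in the call, so that code inspecting
  # fit$call$formula sees the column names rather than the symbol `form`.
  fit$call$formula <- form
  fit
}

# One survfit object per variable, named after the variable.
fits <- lapply(setNames(vars, vars), fit_one, data = veteran)

print(names(fits))
```

Each element of `fits` could then be passed to ggsurvplot_facet() together with the same data frame and facet.by = "trt". Leaving out legend.labs lets the same call cover variables with different numbers of levels. survminer also ships a surv_fit() wrapper that, as far as I know, is intended for this situation, so it may be worth trying first.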