Improve performance of ggplotly when plotting time-series heatmap

Question

I'm building an interactive time-series heatmap in R using Plotly and Shiny. As part of this process, I'm recoding the heatmap values from continuous to ordinal format, so I have a heatmap where six colours represent specific count categories, and those categories are created from aggregated count values. However, this causes a major performance issue when the heatmap is created with ggplotly(). I've traced it to the tooltip argument of ggplotly(), which controls what the interactive hover boxes display. The label data from my heatmap somehow overloads it so that it performs very slowly, even if I add just a single label component to tooltip. I'm using a processed subset of COVID-19 outbreak data from the Johns Hopkins CSSE repository. Here is simplified heatmap code, which also uses The Simpsons colour theme from ggsci:

#Load packages
library(shiny)
library(plotly)
library(tidyverse)
library(RCurl)
library(ggsci)

#Read example data from Gist
confirmed <- read_csv("https://gist.githubusercontent.com/GeekOnAcid/5638e37c688c257b1c381a15e3fb531a/raw/80ba9704417c61298ca6919343505725b8b162a5/covid_selected_europe_subset.csv")

#Wrap ggplot of time-series heatmap in ggplotly, call "tooltip"  
ggplot_ts_heatmap <- confirmed %>%
  ggplot(aes(as.factor(date), reorder(`Country/Region`,`cases count`), 
             fill=cnt.cat, label = `cases count`, label2 = as.factor(date), 
             text = paste("country:", `Country/Region`))) + 
  geom_tile(col=1) +
  theme_bw(base_line_size = 0, base_rect_size = 0, base_size = 10) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),legend.title = element_blank()) +
  scale_fill_manual(labels = levels(confirmed$cnt.cat),
                    values = pal_simpsons("springfield")(7)) +
  labs(x = "", y = "")
ggplotly(ggplot_ts_heatmap, tooltip = c("text","label","label2"))

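Once tooltip = c("text","label","label2") is reduced (for instance to tooltip = c("text")), performance improves. I know the delay is not "massive", but I'm integrating this with a Shiny app. Once it is integrated with Shiny and scaled up with more data, it becomes very, very, very slow. I don't even show all the variables in the tooltip and it is still slow. You can see this in the current version of the app when you click on 'confirmed' cases.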

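Any suggestions? I've considered alternative interactive heatmap packages such as d3heatmap, heatmaply and shinyHeatmaply, but all of those solutions are aimed more at correlation heatmaps, and they lack the customisation options of ggplot.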

Answer 1

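If you rewrite it as "pure" plotly (without the ggplotly conversion), it will be much faster, by a factor of roughly 3000. Here are the results of a very small benchmark: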

Unit: milliseconds
 expr       min        lq       mean     median        uq       max neval
    a 9929.8299 9929.8299 9932.49130 9932.49130 9935.1527 9935.1527     2
    b    3.1396    3.1396    3.15665    3.15665    3.1737    3.1737     2
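
The output above has the format printed by the microbenchmark package, but the call itself isn't shown. As a rough sketch only, a comparison like this could be set up as follows. The synthetic `toy` data and the function names `gg_version` and `plotly_version` are assumptions for illustration, not the data or code that produced the numbers above. Note that plot_ly() defers most of its work until the figure is built, so both versions are wrapped in plotly_build() here to time the full conversion. Expect different absolute timings.

```r
library(ggplot2)
library(plotly)
library(microbenchmark)

# Synthetic stand-in for the COVID data (assumption, not the real dataset)
set.seed(1)
toy <- expand.grid(
  date = as.character(seq(as.Date("2020-03-01"), by = "day", length.out = 30)),
  country = paste("Country", LETTERS[1:10]),
  stringsAsFactors = FALSE
)
toy$cases <- rpois(nrow(toy), lambda = 50)
toy$cat <- cut(toy$cases, breaks = c(-Inf, 40, 50, 60, Inf),
               labels = c("<=40", "41-50", "51-60", ">60"))

# ggplot2 heatmap converted with ggplotly(), then fully built
gg_version <- function() {
  g <- ggplot(toy, aes(date, country, fill = cat,
                       text = paste("country:", country))) +
    geom_tile()
  plotly_build(ggplotly(g, tooltip = "text"))
}

# The same data drawn as a native plotly heatmap trace, then fully built
plotly_version <- function() {
  p <- plot_ly(toy) %>%
    add_heatmap(x = ~date, y = ~country, z = ~as.integer(cat),
                text = ~paste("country:", country), hoverinfo = "text")
  plotly_build(p)
}

res <- microbenchmark(a = gg_version(), b = plotly_version(), times = 2)
print(res)
```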

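The reason ggplotly is so much slower is that it doesn't recognise the input as a heatmap and instead creates a scatter plot in which every rectangle is drawn separately, with all the necessary attributes. You can look at the resulting JSON if you wrap the result of ggplotly or plot_ly in plotly_json().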
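
To make that inspection concrete, here is a small sketch on made-up data (the `toy` data frame and the `trace_types` helper are hypothetical names, not part of plotly). It prints the trace type of each trace in the built figure and compares the length of the JSON that plotly_json() returns when jsonedit = FALSE. What ggplotly emits may vary between plotly versions, so treat the printed output as something to look at rather than a fixed result.

```r
library(ggplot2)
library(plotly)

# Tiny made-up grid (assumption, for illustration only)
toy <- expand.grid(x = as.character(1:5), y = letters[1:4],
                   stringsAsFactors = FALSE)
toy$z <- seq_len(nrow(toy)) %% 3 + 1
toy$cat <- factor(toy$z)

via_ggplotly <- ggplotly(ggplot(toy, aes(x, y, fill = cat)) + geom_tile())
via_plotly <- plot_ly(toy) %>% add_heatmap(x = ~x, y = ~y, z = ~z)

# Hypothetical helper: type of every trace in the built figure
trace_types <- function(p) {
  vapply(plotly_build(p)$x$data,
         function(tr) if (is.null(tr$type)) NA_character_ else tr$type,
         character(1))
}
print(trace_types(via_ggplotly))
print(trace_types(via_plotly))

# jsonedit = FALSE returns the JSON text instead of opening a viewer
json_gg <- as.character(plotly_json(via_ggplotly, jsonedit = FALSE))
json_pl <- as.character(plotly_json(via_plotly, jsonedit = FALSE))
cat("JSON characters:", nchar(json_gg), "vs", nchar(json_pl), "\n")
```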

You can also check the object.size of the plots: the ggplotly object is around 4616.4 Kb, while the plotly heatmap is only 40.4 Kb.
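
A size check along those lines might look like the sketch below. It uses a small synthetic data frame (`toy`, an assumption) rather than the COVID data, so the sizes it reports will not match the figures quoted above.

```r
library(ggplot2)
library(plotly)

# Small synthetic grid (assumption, not the real dataset)
toy <- expand.grid(x = as.character(1:20), y = letters[1:10],
                   stringsAsFactors = FALSE)
toy$z <- seq_len(nrow(toy)) %% 4 + 1
toy$cat <- factor(toy$z)

via_ggplotly <- ggplotly(ggplot(toy, aes(x, y, fill = cat)) + geom_tile())
via_plotly <- plot_ly(toy) %>% add_heatmap(x = ~x, y = ~y, z = ~z)

# object.size() reports bytes; format() converts to a readable unit
sizes <- c(ggplotly = as.numeric(object.size(via_ggplotly)),
           plot_ly  = as.numeric(object.size(via_plotly)))
print(format(object.size(via_ggplotly), units = "Kb"))
print(format(object.size(via_plotly), units = "Kb"))
```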

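# Build a stepped colourscale for plotly: each colour is listed twice, at the
# lower and upper bound of its band, giving 7 flat colour bands on [0, 1]
df_colors = data.frame(range=c(0:13), colors=c(0:13))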
color_s <- setNames(data.frame(df_colors$range, df_colors$colors), NULL)
for (i in 1:14) {
  color_s[[2]][[i]] <- pal_simpsons("springfield")(13)[[(i + 1) / 2]]
  color_s[[1]][[i]] <-  i / 14 - (i %% 2) / 14
}

# Native plotly heatmap trace. z must be numeric, so each category is mapped
# to its integer code; the colourbar ticks translate the codes back to labels
plot_ly(data = confirmed) %>%
  plotly::add_heatmap(x = ~as.factor(date), 
                      y = ~reorder(`Country/Region`, `cases count`),
                      z = ~as.numeric(factor(confirmed$`cnt.cat`, ordered = TRUE, 
                                             levels = unique(confirmed$`cnt.cat`))),
                      xgap = 0.5,
                      ygap = 0.5,
                      colorscale = color_s,
                      colorbar = list(tickmode='array',
                                      title = "Cases",
                                      tickvals=c(1:7),
                                      ticktext=levels(factor(x = confirmed$`cnt.cat`,
                                                             levels = unique(confirmed$`cnt.cat`),
                                                             ordered = TRUE)), len=0.5),
                      # Hover text is built per cell and shown as-is
                      text = ~paste0("country: ", `Country/Region`, "<br>",
                                    "Number of cases: ", `cases count`, "<br>",
                                    "Category:  ", `cnt.cat`),
                      hoverinfo ="text"
  ) %>% 
  layout(plot_bgcolor='black',
         xaxis = list(title = ""),
         yaxis = list(title = ""))
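
The loop that fills color_s builds a stepped colourscale: every colour appears at both ends of its band, so the scale is flat inside a band and jumps at the boundaries. As a sketch of the same idea without the loop, the helper below returns the position/colour pairs as a list. `discrete_colorscale` is a hypothetical name, not a plotly function, and the three hex colours are placeholders.

```r
# Hypothetical helper: stepped colourscale as a list of (position, colour) pairs
discrete_colorscale <- function(colors) {
  n <- length(colors)
  breaks <- seq(0, 1, length.out = n + 1)
  # Interleave lower and upper bounds: 0, b1, b1, b2, ..., 1
  pos <- as.vector(rbind(breaks[-(n + 1)], breaks[-1]))
  # Repeat each colour so it covers both ends of its band
  col <- rep(colors, each = 2)
  unname(Map(function(p, k) list(p, k), pos, col))
}

cs <- discrete_colorscale(c("#FED439", "#709AE1", "#8A9197"))
str(cs)
```

The result could then be passed as colorscale = cs in add_heatmap(), with one colour per category code in z.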

Tags: r, heatmap, ggplot2, shiny, ggplotly