创建一个图表,显示每个组的日期分布范围,并在 R 中选择类别

create a graph showing the range over which the date are spread by each group with selection of category in R

我是 R 的新手,我在 R 中有一个 data.table,看起来像:

> dt <- data.table(category = rep(c("0001", "0002", "0003"), c(10,3,4)), 
                  grp = c("01", "03", "04", "06", "81", "82", "83", "84", "85", "86", 
                          "01", "02", "03",
                          "01", "02", "03", "04"),
                  min_date = c("2012-04-18", "2012-04-18", "2012-04-23", "2012-04-10", "2012-04-05", 
                               "2012-04-13", "2012-04-10", "2012-04-07", "2012-04-19", "2012-04-05",
                               "2012-04-04", "2012-04-06", "2012-04-25", "2012-04-19", "2012-04-05",
                               "2012-04-26", "2012-04-27"),
                  max_date = c("2012-05-23", "2012-05-19", "2012-05-19", "2012-04-24", "2012-05-23", 
                               "2012-05-09", "2012-05-19", "2012-05-24", "2012-05-22", "2012-05-23",
                               "2012-05-12", "2012-05-11", "2012-05-17", "2012-05-22", "2012-05-22",
                               "2012-05-19", "2012-05-17"),
                  hours_played = c(426, 381, 318, 168, 583, 314, 477, 568, 398, 582, 458, 429, 268, 
                                   395, 568, 276, 238))

> dt
    category grp   min_date   max_date hours_played
 1:     0001  01 2012-04-18 2012-05-23          426
 2:     0001  03 2012-04-18 2012-05-19          381
 3:     0001  04 2012-04-23 2012-05-19          318
 4:     0001  06 2012-04-10 2012-04-24          168
 5:     0001  81 2012-04-05 2012-05-23          583
 6:     0001  82 2012-04-13 2012-05-09          314
 7:     0001  83 2012-04-10 2012-05-19          477
 8:     0001  84 2012-04-07 2012-05-24          568
 9:     0001  85 2012-04-19 2012-05-22          398
10:     0001  86 2012-04-05 2012-05-23          582
11:     0002  01 2012-04-04 2012-05-12          458
12:     0002  02 2012-04-06 2012-05-11          429
13:     0002  03 2012-04-25 2012-05-17          268
14:     0003  01 2012-04-19 2012-05-22          395
15:     0003  02 2012-04-05 2012-05-22          568
16:     0003  03 2012-04-26 2012-05-19          276
17:     0003  04 2012-04-27 2012-05-17          238

我想创建一个可视化效果,以显示每个组 grp 的活动范围以及相应的 hours_played 。下拉列表中应该有 select category 的功能。

在 select 从可用类别的下拉列表中选择一个类别时,图表应显示属于该类别的所有组都处于活动状态的日期范围,以及该类别的播放时间 alongside/inside。时间轴必须是X轴,时间范围可以是10天。

类似这样的事情:我的绘画技巧很差,但只是想了解我想要什么。

我如何在 R 中执行此操作。

我建议你这个情节:

library(ggplot2)
library(scales)
ggplot(dt) +
    aes(y = grp, x = as.Date(min_date)) +
    geom_segment(aes(yend = grp, 
                     xend = as.Date(max_date), 
                     color = grp), 
                 size = 5,
                 show.legend = FALSE) +

    geom_text(aes(label = paste0('grp', grp)), 
              nudge_x = 3,
              size = 3) +

    geom_text(aes(label = paste0(hours_played, ' h'), 
                  x = as.Date(max_date)), 
              nudge_x = 1.5,
              size = 2) +

    facet_grid(category ~ ., scales = 'free_y', labeller = label_both) +
    scale_x_date('Date', date_breaks = '10 days', expand = c(0, 2)) +
    scale_color_brewer(palette = 'Set3') +
    theme_bw() +
    theme(axis.line.y = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.y = element_blank())

重要的位是三个 geom_*(一个用于片段,两个用于文本)和 faceting(将情节划分为三个子情节,基于 category)

更新:

要在情节中添加互动,我们需要一个反应性环境。最简单的是 .Rmd 文档。

将其粘贴到一个新的 .Rmd 文件中,然后 'Run' 它:

---
output: html_document
runtime: shiny

---

```{r data, echo = F}
dt <- data.frame(category = rep(c("0001", "0002", "0003"), c(10,3,4)), 
                  grp = c("01", "03", "04", "06", "81", "82", "83", "84", "85", "86", 
                          "01", "02", "03",
                          "01", "02", "03", "04"),
                  min_date = c("2012-04-18", "2012-04-18", "2012-04-23", "2012-04-10", "2012-04-05", 
                               "2012-04-13", "2012-04-10", "2012-04-07", "2012-04-19", "2012-04-05",
                               "2012-04-04", "2012-04-06", "2012-04-25", "2012-04-19", "2012-04-05",
                               "2012-04-26", "2012-04-27"),
                  max_date = c("2012-05-23", "2012-05-19", "2012-05-19", "2012-04-24", "2012-05-23", 
                               "2012-05-09", "2012-05-19", "2012-05-24", "2012-05-22", "2012-05-23",
                               "2012-05-12", "2012-05-11", "2012-05-17", "2012-05-22", "2012-05-22",
                               "2012-05-19", "2012-05-17"),
                  hours_played = c(426, 381, 318, 168, 583, 314, 477, 568, 398, 582, 458, 429, 268, 
                                   395, 568, 276, 238))
```

```{r graph, echo = F}
library(ggplot2)
library(scales)

selectInput('category','Choose the category:', choices = unique(dt$category))

dt_filtered <- reactive({
  dt[dt$category == input$category, ]
})

renderPlot({
  ggplot(dt_filtered()) +
    aes(y = grp, x = as.Date(min_date)) +
    geom_segment(aes(yend = grp, 
                     xend = as.Date(max_date), 
                     color = grp), 
                 size = 5,
                 show.legend = FALSE) +

    geom_text(aes(label = paste0('grp', grp)), 
              nudge_x = 3,
              size = 3) +

    geom_text(aes(label = paste0(hours_played, ' h'), 
                  x = as.Date(max_date)), 
              nudge_x = 1.5,
              size = 2) +
    scale_x_date('Date', date_breaks = '10 days', expand = c(0, 2)) +
    scale_color_brewer(palette = 'Set3') +
    theme_bw() +
    theme(axis.line.y = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks.y = element_blank())

})
```