Visualizing sentiment over time with ggplot2

I am trying to visualize sentiment over time, similar to this post.

My dataset looks like this:

head(Visualizing_sentiment)
date            sentiment
<S3: POSIXct>   <chr>
2011-12-01      neutral
2011-12-01      negative
2011-12-01      negative
2011-12-01      negative
2011-12-01      negative
2011-12-01      negative

I run the following to visualize it:

Visualizing_sentiment %>% 
    gather(sentiment, values, -date) %>%
    ggplot() +
    geom_bar(aes(y = values, x = date, fill = sentiment), stat = "identity")

However, I want the date variable on the x-axis formatted as month/year, so I tried converting the date variable to the Date class as follows:

lubridate::ymd('20111201')
lubridate::ymd(20111201)
lubridate::ymd(Visualizing_sentiment$date)

Although the format of the date variable changed, I get an error from the plot when I run the following:

Visualizing_sentiment %>% 
    gather(sentiment, values, -date) %>%
    ggplot() +
    scale_x_date(date_breaks = "1 month", date_labels =  "%b %Y") +
    theme(axis.text.x=element_text(angle=60, hjust=1)) +
    geom_bar(aes(y = values, x = date, fill = sentiment), stat = "identity")

Ideally, I would like to produce a bar chart showing the share of negative, positive, and neutral sentiment per month/year.

Thanks to the suggestion below, I ran the following, which works well:

Visualizing_sentiment %>%
  mutate(date = as.Date(date)) %>%
  count(sentiment, date) %>%
  ggplot(aes(x = date, y = n, fill = sentiment)) +
  geom_col() +
  #geom_col(position = "dodge")+
  scale_fill_manual(values = c("positive" = "green", 
                               "negative" = "red", 
                               "neutral"= "black")) +
  scale_x_date(date_labels = "%b-%y") +
  facet_wrap(~ year(date)) +
  theme_classic()

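To plot sentiment over time, you need a date column and a sentiment column. You can then use count(sentiment, date) to count each sentiment by date, plot date along the x-axis and n up the y-axis, and fill by sentiment.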
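To make the counting step concrete, here is a minimal sketch of what count(sentiment, date) returns before anything is plotted. The tibble `toy` and its values are invented for illustration and are not taken from the question's data.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
toy <- tibble(
  sentiment = c("positive", "negative", "negative", "neutral"),
  date = as.Date(c("2010-02-03", "2010-02-03", "2010-02-03", "2010-02-04"))
)

# One row per sentiment/date combination; the tally lands in a column named n
counts <- toy %>% count(sentiment, date)
counts
```

The resulting `n` column is what gets mapped to the y aesthetic in the plot.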

If you want a stacked bar chart, remove position = "dodge" from geom_col().

library(lubridate)
library(tidyverse)

data <- tibble(
  sentiment = c("positive", "positive", "negative", "negative", "neutral", "neutral",
                "neutral", "positive", "negative", "neutral", "neutral", "negative",
                "negative", "neutral", "neutral", "positive"),
  date = c("2010-02-03", "2010-02-03", "2010-02-04", "2010-02-04", "2010-02-04", "2010-02-05",
           "2010-02-05", "2010-02-05", "2010-02-05", "2010-02-05", "2010-02-03", "2010-02-04",
           "2010-02-04", "2010-02-05", "2010-02-04", "2010-02-04")
)
data %>%
  mutate(date = as.Date(date))%>%
  count(sentiment, date)%>%
  ggplot(aes(x = date, y = n, fill = sentiment))+
  geom_col(position = "dodge")+
  scale_fill_manual(values = c("positive" = "green", 
                               "negative" = "red", 
                               "neutral"= "black"))+
  scale_x_date(date_labels = "%b-%Y")+
  theme_bw()

I added a scale_x_date() for you. %b is the abbreviated month name and %Y is the four-digit year. If you want it to say "10" instead of "2010", you can use %y.
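The format codes can be tried outside ggplot2 with base R's format(), which uses the same strftime-style codes. A quick sketch; the date is arbitrary, and the month abbreviation depends on your locale:

```r
d <- as.Date("2010-02-03")

year_long  <- format(d, "%Y")     # four-digit year
year_short <- format(d, "%y")     # two-digit year
label      <- format(d, "%b-%Y")  # abbreviated month plus year, e.g. "Feb-2010" in an English locale
```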

If you do this across multiple years, I would suggest one extra step: use facet_wrap() to show each year as a separate panel. You can do it like this:

previous_plotting_code+
    facet_wrap(~ year(date))

The year() function, provided by lubridate, extracts the year from a date variable.
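As a small check of what year() returns, here is a sketch with made-up dates. It accepts both Date and POSIXct values, the latter being the class of the date column in the question:

```r
library(lubridate)

# Arbitrary example dates, not from the question's data
dates <- as.Date(c("2010-02-03", "2011-12-01"))
years <- year(dates)  # one year per element, here 2010 and 2011

# POSIXct input works as well
year_ct <- year(as.POSIXct("2011-12-01", tz = "UTC"))
```

Each distinct value of year(date) becomes one panel when it is used inside facet_wrap().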