Wordcloud showing colour based on continuous metadata in R

Question

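I am creating a word cloud in which word size is based on frequency, but I would like to map word colour to a third variable: stress, a continuous numeric value giving the amount of stress associated with each word.

I tried the following, which gave me only two distinct colours (yellow and purple), whereas I want a smoother gradient, for example a palette running from green to red.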

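library(dplyr)
library(wordcloud)

df = data.frame(word = c("calling", "meeting", "conference", "contract", "negotiation", "email"),
                n = c(20, 12, 4, 8, 10, 43),
                stress = c(23, 30, 15, 40, 35, 15))
df = as_tibble(df)  # tbl_df() is deprecated in current dplyr; as_tibble() replaces it
# Attempt: pass the raw stress values directly as colours
wordcloud(words = df$word, freq = df$n, col = df$stress)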

Does anyone know how to handle continuous metadata like this so that word colour changes smoothly as stress increases? Thanks!

Answer 1

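Here is one possible solution. You can use the wordcloud2 package for this task. Since I don't know your real data, I created a sample data set to demonstrate the idea.

If you have many words, I am not sure that colouring by a continuous variable (stress) is a good idea. One thing you can do is create a new grouping variable with cut(), which reduces the number of colours used in the graphic. Here I create a new column called color containing five colours from the viridis package.

When you use wordcloud2(), you only need to supply two things: the data and the colours. Font size reflects word frequency, so you do not need to specify it.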

mydf = data.frame(word = c("calling", "meeting", "conference", "contract", "negotiation",
                           "email", "friends", "chat", "text", "deal",
                           "business", "promotion", "discount", "users", "family"),
                  n = c(20, 12, 4, 8, 10, 43, 33, 5, 47, 28, 12, 9, 50, 31, 22),
                  stress = c(23, 30, 15, 40, 35, 15, 30, 18, 10, 5, 29, 38, 45, 8, 3))


          word  n stress
1      calling 20     23
2      meeting 12     30
3   conference  4     15
4     contract  8     40
5  negotiation 10     35
6        email 43     15
7      friends 33     30
8         chat  5     18
9         text 47     10
10        deal 28      5
11    business 12     29
12   promotion  9     38
13    discount 50     45
14       users 31      8
15      family 22      3

library(dplyr)
library(wordcloud2)
library(viridis)

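# Bin stress into five groups and label each bin with a hex colour.
# cut() returns a factor, so convert it to character before passing it on.
mutate(mydf, color = as.character(cut(stress, breaks = c(0, 10, 20, 30, 40, Inf),
                                      labels = c("#FDE725FF", "#73D055FF", "#1F968BFF",
                                                 "#2D708EFF", "#481567FF"),
                                      include.lowest = TRUE))) -> temp

# wordcloud2() reads the words and frequencies from the first two columns of data.
wordcloud2(data = temp, color = temp$color)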
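
If you would rather not hard-code the breaks and hex codes, the same binning idea can be sketched with base R only. This is an illustration under assumptions, not part of the original answer: the number of bins (`n_bins`), the green-yellow-red endpoints, and the variable names are all chosen here for the example.

```r
# Same stress values as in the example data above.
stress <- c(23, 30, 15, 40, 35, 15, 30, 18, 10, 5, 29, 38, 45, 8, 3)

n_bins <- 5  # assumed number of colour groups

# Palette from green (low stress) to red (high stress), generated by base R.
bin_palette <- colorRampPalette(c("green", "yellow", "red"))(n_bins)

# Giving cut() a single number splits the observed range into equal-width bins;
# labels = FALSE returns the bin index (1..n_bins) instead of a factor.
bin_index <- cut(stress, breaks = n_bins, labels = FALSE, include.lowest = TRUE)

# One hex colour per word, as a plain character vector.
bin_color <- bin_palette[bin_index]
```

The resulting `bin_color` vector has one entry per row, so it could be passed as the `color` argument of wordcloud2() in place of `temp$color`.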

Answer 2

Or, a bit more automatically, without specifying exact thresholds and colours:

library(RColorBrewer)
library(wordcloud2)

mydf = data.frame(word = c("calling", "meeting", "conference", "contract", "negotiation",
                       "email", "friends", "chat", "text", "deal",
                       "business", "promotion", "discount", "users", "family"),
              n = c(20, 12, 4, 8, 10, 43, 33, 5, 47, 28, 12, 9, 50, 31, 22),
              stress = c(23, 30, 15, 40, 35, 15, 30, 18, 10, 5, 29, 38, 45, 8, 3))

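# One colour per distinct stress value.
color_range_number <- length(unique(mydf$stress))
# Interpolate between shades 3 to 7 of the "Blues" palette, then index the result
# by factor(stress): the factor's integer codes pick the colour for each word.
color <- colorRampPalette(brewer.pal(9,"Blues")[3:7])(color_range_number)[factor(mydf$stress)]

wordcloud2(mydf, color=color)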

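So size is determined by 'n' and shade by 'stress'.

[3:7] adjusts the range of the colour scale: 1 is the lightest and 9 is the darkest.

You can check other palette options with: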
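
One thing to be aware of: indexing the palette by factor(stress) spaces the colours by the rank of each distinct stress value, not by the value itself, so a gap of 1 and a gap of 10 between neighbouring values produce the same colour step. If you want the colour to follow the numeric value, a minimal sketch (base R only; the green-yellow-red endpoints and variable names are assumptions for illustration) is to rescale stress to [0, 1] and pass it through colorRamp():

```r
stress <- c(23, 30, 15, 40, 35, 15, 30, 18, 10, 5, 29, 38, 45, 8, 3)

# Rescale to [0, 1] so the numeric distance between values is preserved.
stress_scaled <- (stress - min(stress)) / (max(stress) - min(stress))

# colorRamp() returns a function mapping [0, 1] to a matrix of RGB values (0-255).
ramp <- colorRamp(c("green", "yellow", "red"))
rgb_matrix <- ramp(stress_scaled)

# Convert each RGB row to a hex string, giving one colour per word.
stress_color <- rgb(rgb_matrix, maxColorValue = 255)
```

`stress_color` could then be supplied as `color` to wordcloud2() in the same way as the `color` vector above.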

display.brewer.all()
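
For completeness, here is a hedged sketch using the original wordcloud package from the question rather than wordcloud2. It relies on wordcloud()'s `ordered.colors = TRUE` argument, which assigns the colours to the words in order, so the colour vector must have one entry per word. The helper `stress_to_color()` is hypothetical (written for this example, not part of any package), and the green-to-red endpoints are an assumption taken from the question's wording.

```r
df <- data.frame(word = c("calling", "meeting", "conference", "contract", "negotiation", "email"),
                 n = c(20, 12, 4, 8, 10, 43),
                 stress = c(23, 30, 15, 40, 35, 15))

# Hypothetical helper: map a numeric vector to hex colours between two endpoints.
# Assumes x contains at least two distinct values (otherwise the rescaling divides by zero).
stress_to_color <- function(x, endpoints = c("green", "red")) {
  scaled <- (x - min(x)) / (max(x) - min(x))
  rgb(colorRamp(endpoints)(scaled), maxColorValue = 255)
}

word_color <- stress_to_color(df$stress)

# Draw the cloud only if the wordcloud package is installed.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(words = df$word, freq = df$n,
                       colors = word_color, ordered.colors = TRUE)
}
```

With this data, "contract" (the highest stress) is drawn in pure red, while "conference" and "email" (the lowest) are drawn in pure green.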

