Adding word count as node size on a co-occurrence network chart using tidytext
I'm interested in making a co-occurrence network chart like the one shown in section 8.2.2 of David Robinson and Julia Silge's Tidy Text Mining book, such as this chart, except that I would like the sizes of the nodes to change depending on how many times the term shows up in the data:
The chart above was built with the following code:
library(tidytext)
library(tidyverse)
library(widyr)
library(igraph)
library(ggraph)
library(jsonlite)

metadata <- fromJSON("https://data.nasa.gov/data.json")

# One row per (dataset id, keyword) pair
# (data_frame() is deprecated in tibble; tibble() is the drop-in replacement)
nasa_keyword <- data_frame(id = metadata$dataset$`_id`$`$oid`,
                           keyword = metadata$dataset$keyword) %>%
  unnest(keyword)

# Pairwise correlations between keywords that appear at least 50 times
keyword_cors <- nasa_keyword %>%
  group_by(keyword) %>%
  filter(n() >= 50) %>%
  pairwise_cor(keyword, id, sort = TRUE, upper = FALSE)

set.seed(1234)
keyword_cors %>%
  filter(correlation > .6) %>%
  graph_from_data_frame() %>%
  ggraph(layout = "fr") +
  geom_edge_link(aes(edge_alpha = correlation, edge_width = correlation),
                 edge_colour = "royalblue") +
  geom_node_point(size = 5) +
  geom_node_text(aes(label = name), repel = TRUE,
                 point.padding = unit(0.2, "lines")) +
  theme_void()
I've been looking at geom_node_point(aes(size = ??)), but I can't figure out how to configure the code to do this. Part of the problem for me is that graph_from_data_frame() turns the data frame into a rather complex-looking object.
I would like to have the sizes of the nodes change depending on how many times the term shows up in the data
You can do the following:
set.seed(1234)
keyword_cors %>%
  filter(correlation > .6) %>%
  # Supply per-keyword counts as vertex metadata so ggraph can map them
  graph_from_data_frame(vertices = nasa_keyword %>%
                          count(keyword) %>%
                          filter(n >= 50)) %>%
  ggraph(layout = "fr") +
  geom_edge_link(aes(edge_alpha = correlation, edge_width = correlation),
                 edge_colour = "royalblue") +
  geom_node_point(aes(size = n)) +
  scale_size(range = c(1, 10)) +
  geom_node_text(aes(label = name), repel = TRUE,
                 point.padding = unit(0.2, "lines")) +
  theme_void()
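If you want to confirm that the count column really made it into the graph before plotting, you can inspect the vertex attributes. A minimal sketch, using the same objects as above:

g <- keyword_cors %>%
  filter(correlation > .6) %>%
  graph_from_data_frame(vertices = nasa_keyword %>% count(keyword) %>% filter(n >= 50))

vertex_attr_names(g)                               # should list "n" among the attributes
head(igraph::as_data_frame(g, what = "vertices"))  # node names plus the n column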
This gives you something like this:
vertices = nasa_keyword %>% count(keyword) %>% filter(n >= 50) adds the node information to the graph, more specifically: the node IDs (first column) and the number of occurrences n (second column). aes(size = n) maps this information to the node size, and scale_size(range = c(1, 10)) lets us define the minimum and maximum point sizes.
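To see what the vertices argument does in isolation, here is a small self-contained sketch with made-up toy data (not the NASA keywords): the first column of the vertices data frame supplies the node names, and every additional column becomes a vertex attribute that ggraph can map inside aes().

library(igraph)
library(tidyverse)

# Hypothetical toy data: an edge list plus per-node counts
edges  <- tibble(from = c("a", "a", "b"),
                 to   = c("b", "c", "c"),
                 correlation = c(.9, .7, .8))
counts <- tibble(keyword = c("a", "b", "c"),
                 n       = c(100, 60, 250))

g <- graph_from_data_frame(edges, vertices = counts)
vertex_attr(g, "n")  # 100 60 250 -- available as n in geom_node_point(aes(size = n))

One thing to watch for: every node name that appears in the edge list must also appear in vertices, otherwise graph_from_data_frame() throws an error. That is why the answer applies the same n >= 50 cutoff to the counts that keyword_cors was built with.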