Structural topic model (stm package): showing percentage values in the summary plot drawn by the plot function
Page 18 of the stm vignette
https://cran.r-project.org/web/packages/stm/vignettes/stmVignette.pdf
plots the expected topic proportions with
plot(poliblogPrevFit, type = "summary", xlim = c(0, .3))
where poliblogPrevFit is fitted as
poliblogPrevFit <- stm(documents = out$documents, vocab = out$vocab,
                       K = 20, prevalence =~ rating + s(day),
                       max.em.its = 75, data = out$meta,
                       init.type = "Spectral")
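I am trying to work out how to display the percentage values in the plot. I tried adding them with text() after the plot call,
but it does not work. How can I add the value to the right of each bar in the plot?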
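For what you have in mind you need the probabilities. First, use the matrix that the stm topic model produces: theta. It shows, for each document, the probability that the document belongs to each topic. Second, you need your topic labels (or Topic 1, Topic 2, etc., if you want to stick with those) so that you can attach the values to them.
I put the code together with my own data, but it should also work for yours. Keep in mind that you may need to change a few things so that the code fits your own data.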
library(ggplot2)

## Put labels in a vector, one per topic
## Include your own labels here; you probably have more than four
labels <- c("Buffy", "Vampire", "Slayer", "Mr. Pointy")

## Extract theta from the stm model (stm_topics is the fitted model)
## and average each column to get the expected topic proportions
df <- data.frame(Labels = labels,
                 Probability = colMeans(stm_topics$theta))

## Sort the data frame by descending proportion
df <- df[order(-df$Probability), ]
df$Labels <- factor(df$Labels, levels = rev(df$Labels))
df$Probability <- round(df$Probability, 4)

## Plot graph
ggplot(df, aes(x = Labels, y = Probability)) +
  geom_bar(stat = "identity") +
  ## Change breaks and limits as you need; bars beyond the limits are dropped
  scale_y_continuous(breaks = c(0, 0.15), limits = c(0, 0.15), expand = c(0, 0)) +
  coord_flip() +
  ## Percentage label to the right of each bar
  geom_text(aes(label = scales::percent(Probability)),
            hjust = -0.25, size = 4) +
  theme(panel.border = element_blank())
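If you would rather stay in base graphics, the same idea (average the columns of theta, then write each value at the end of its bar) might be sketched as follows. This is only a minimal illustration under stated assumptions: the small theta matrix is made-up data standing in for the theta of your fitted model, and the "Topic n" names are placeholders, not labels produced by stm.

```r
# Hypothetical stand-in for the theta matrix of a fitted stm model:
# 5 documents x 3 topics, each row sums to 1
theta <- matrix(c(0.7, 0.2, 0.1,
                  0.5, 0.3, 0.2,
                  0.6, 0.1, 0.3,
                  0.2, 0.5, 0.3,
                  0.5, 0.4, 0.1),
                ncol = 3, byrow = TRUE)

# Expected topic proportions: the mean of each theta column
proportions <- colMeans(theta)
names(proportions) <- paste("Topic", seq_along(proportions))  # placeholder labels

# Sort ascending so the largest topic ends up at the top of a horizontal bar plot
proportions <- sort(proportions)

# barplot() returns the bar midpoints, which give the y positions for the labels;
# the widened xlim leaves room for the text to the right of the longest bar
mids <- barplot(proportions, horiz = TRUE, las = 1,
                xlim = c(0, max(proportions) * 1.25),
                xlab = "Expected topic proportion")

# Write the percentage just to the right of each bar (pos = 4 means "to the right")
pct_labels <- sprintf("%.1f%%", 100 * proportions)
text(x = proportions, y = mids, labels = pct_labels, pos = 4)
```

With a real model you would replace the simulated matrix with the theta element of your fitted object and supply your own topic labels.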