R Data Visualization for categorical data
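I'm looking for ways to visualize categorical data.
Imagine I'm an avid birdwatcher with a list of birds that I want to view and photograph in two different states, Oregon and Idaho.
I'm looking for a way to represent my progress visually.
My first thought was something table-like: the first column is the species, the next two columns are the states, and each cell is a split square whose colors represent progress. Sort of a heat map split along the diagonal, but I'm coming up short. Here's a mock-up of an example.
Other suggestions are welcome.
Here is a sample data set: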
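# stringsAsFactors = TRUE: since R 4.0 strings are read as character by default,
# but code that calls levels() or as.integer() on these columns needs factors
progress <- read.table(header = TRUE, stringsAsFactors = TRUE, text = "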
bird location action progress
osprey Oregon view completed
osprey Oregon photo completed
osprey Idaho view completed
osprey Idaho photo not_yet
white-tailed_kite Oregon view wait_till_spring
white-tailed_kite Oregon photo wait_till_spring
white-tailed_kite Idaho view not_present
white-tailed_kite Idaho photo not_present
bald_eagle Oregon view completed
bald_eagle Oregon photo completed
bald_eagle Idaho view completed
bald_eagle Idaho photo completed")
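Thanks for your suggestions!
Triangles are probably harder; they could be done with custom glyphs/images, or with a function that draws triangular polygons at the appropriate positions.
More simply, you could just use squares: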
library(ggplot2)
library(dplyr)  # for if_else()

ggplot(progress,
       # factor() gives numeric codes whether location was read as character or factor
       aes(x = as.numeric(factor(location)) + if_else(action == "view", -0.1, 0.1),
           y = bird,
           fill = progress)) +
  geom_tile(height = 0.2, color = "white", size = 2) +
  annotate("text", x = c(0.95, 1.05), y = 3.2,
           label = c("view", "photo"), hjust = c(1,0)) +
  # limits follow the factor level order so the labels match the numeric codes above
  scale_x_discrete(limits = levels(factor(progress$location)), name = "") +
  scale_fill_manual(values = c("completed" = "olivedrab",
                               "not_present" = "gray70",
                               "not_yet" = "tomato4",
                               "wait_till_spring" = "lightskyblue")) +
  theme_minimal()
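The triangle idea mentioned above could be sketched roughly as follows, assuming ggplot2's `geom_polygon()` is acceptable. The helper `triangle()`, its `half` parameter, and the column names `id`, `status`, `px` and `py` are made up for this illustration; the colors are simply reused from the squares example. Treat it as a starting point, not a finished design.

```r
library(ggplot2)

# Sample data from the question, read with factor columns.
progress <- read.table(header = TRUE, stringsAsFactors = TRUE, text = "
bird location action progress
osprey Oregon view completed
osprey Oregon photo completed
osprey Idaho view completed
osprey Idaho photo not_yet
white-tailed_kite Oregon view wait_till_spring
white-tailed_kite Oregon photo wait_till_spring
white-tailed_kite Idaho view not_present
white-tailed_kite Idaho photo not_present
bald_eagle Oregon view completed
bald_eagle Oregon photo completed
bald_eagle Idaho view completed
bald_eagle Idaho photo completed")

# Hypothetical helper: corners of one triangle inside the cell centred at (x, y).
# "view" gets the upper-left half of the cell, anything else the lower-right half.
triangle <- function(x, y, action, half = 0.4) {
  if (action == "view") {
    data.frame(px = c(x - half, x - half, x + half),
               py = c(y - half, y + half, y + half))
  } else {
    data.frame(px = c(x - half, x + half, x + half),
               py = c(y - half, y - half, y + half))
  }
}

# One polygon (three rows) per row of the data, identified by id.
polys <- do.call(rbind, lapply(seq_len(nrow(progress)), function(i) {
  tri <- triangle(as.integer(progress$location[i]),
                  as.integer(progress$bird[i]),
                  as.character(progress$action[i]))
  data.frame(id = i, status = progress$progress[i], tri)
}))

p <- ggplot(polys, aes(x = px, y = py, group = id, fill = status)) +
  geom_polygon(color = "white") +
  # put the factor labels back on the integer cell centres
  scale_x_continuous(breaks = seq_along(levels(progress$location)),
                     labels = levels(progress$location), name = NULL) +
  scale_y_continuous(breaks = seq_along(levels(progress$bird)),
                     labels = levels(progress$bird), name = NULL) +
  scale_fill_manual(values = c("completed" = "olivedrab",
                               "not_present" = "gray70",
                               "not_yet" = "tomato4",
                               "wait_till_spring" = "lightskyblue")) +
  coord_fixed() +
  theme_minimal()
print(p)
```

The fill legend only explains the colors; which half of a cell means "view" and which means "photo" would still need a separate key or an annotation.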
This could be done more elegantly, but hopefully this solution helps you achieve the design you're looking for.
## variables related to heatmap squares
sz.square = 0.6
spacer = 0.05
col = c(completed="forestgreen", not_present="gray70", not_yet="orangered4",
wait_till_spring="skyblue2")
## variables related to plot layout
sz.rowlabels = 3
sz.collabels = 0.2
sz.legend = 4
## plotting functions for heat map triangles
plot.action = c(
## plot "viewed"
view = function(x, y, col) {
polygon(
c(
x - sz.square/2 + spacer,
x + sz.square/2,
x + sz.square/2),
c(
y + sz.square/2,
y - sz.square/2 + spacer,
y + sz.square/2),
col=col)
},
## plot "photographed"
photo = function(x, y, col) {
polygon(
c(
x - sz.square/2,
x + sz.square/2 - spacer,
x - sz.square/2),
c(
y + sz.square/2 - spacer,
y - sz.square/2,
y - sz.square/2),
col=col)
})
xlim = c(1 - sz.square - sz.rowlabels,
length(levels(progress$location)) + sz.square + sz.legend)
ylim = c(length(levels(progress$bird)) + sz.square,
1 - sz.square - sz.collabels)
## initialize the plot
par(mar=c(1, 1, 1, 1))
plot(c(0,2), c(2,0), type="n", xlim=xlim, ylim=ylim,
main=NA, xlab=NA, ylab=NA, xaxt="n", yaxt="n",
asp=1)
## plot heat map
for (i in 1:nrow(progress)) {
  ## index by name: a factor index would be treated as its integer code
  plot.action[[as.character(progress$action[i])]](
    as.integer(progress$location[i]),
    as.integer(progress$bird[i]),
    col = col[as.character(progress$progress[i])])
}
## add axis labels
text(xlim[1], 1:nlevels(progress$bird), levels(progress$bird), adj=0, cex=2)
text(1:nlevels(progress$location), ylim[2], levels(progress$location),
     adj=c(0.5,0), cex=2)
## legend
text(xlim[2] - sz.legend/2, ylim[2], "Legend", cex=2)
sz.square = 0.25
x.legend = rep(xlim[2] - 5/8*sz.legend, nlevels(progress$progress) + 2)
y.legend = ylim[2] + 1:(nlevels(progress$progress) + 2) * 0.35 + 0.2
## legend glyphs, in the same order as the labels "viewed", "photographed"
plot.action[["view"]](x.legend[1], y.legend[1], col="white")
plot.action[["photo"]](x.legend[2], y.legend[2], col="white")
rect(
x.legend[3:length(x.legend)] - sz.square/2,
y.legend[3:length(y.legend)] - sz.square/2,
x.legend[3:length(x.legend)] + sz.square/2,
y.legend[3:length(y.legend)] + sz.square/2,
col=col)
text(x.legend + sz.square, y.legend,
c("viewed", "photographed", levels(progress$progress)),
adj=0, cex=1.3)