Grouped bar plot in ggplot2
I'm trying to make a grouped bar plot from long-format data.
Here is the data:
pivot_sample <- structure(list(group = c("group1", "group2", "group3", "group1",
"group2", "group1", "group1", "group1", "group4", "group1", "group4",
"group4", "group1", "group4", "group1", "group1", "group2", "group1",
"group4", "group2", "group4", "group2", "group3", "group3", "group1",
"group1", "group3", "group3", "group1", "group1", "group3", "group1",
"group4", "group3", "group3", "group1", "group2", "group1", "group4",
"group1", "group3", "group3", "group3", "group2", "group2", "group4",
"group3", "group3", "group3", "group2", "group3", "group2", "group1",
"group1", "group3", "group1", "group1", "group2", "group4", "group1",
"group4", "group1", "group1", "group4", "group1", "group3", "group4",
"group1", "group4", "group2", "group4", "group1", "group2", "group4",
"group1", "group4", "group1", "group2", "group1", "group1", "group1",
"group1", "group2", "group1", "group3", "group1", "group1", "group1",
"group3", "group4", "group1", "group3", "group1", "group3", "group4",
"group1", "group2", "group1", "group3", "group1"), category = c("category4",
"category5", "category2", "category4", "category3", "category6",
"category3", "category1", "category4", "category2", "category6",
"category6", "category5", "category5", "category4", "category4",
"category1", "category6", "category1", "category4", "category6",
"category6", "category2", "category6", "category3", "category2",
"category6", "category3", "category6", "category1", "category6",
"category2", "category2", "category2", "category5", "category1",
"category1", "category4", "category3", "category4", "category4",
"category5", "category1", "category3", "category5", "category2",
"category2", "category5", "category5", "category2", "category6",
"category6", "category5", "category1", "category4", "category3",
"category6", "category1", "category6", "category3", "category2",
"category2", "category3", "category2", "category2", "category5",
"category4", "category4", "category4", "category4", "category1",
"category5", "category6", "category5", "category4", "category5",
"category1", "category2", "category3", "category5", "category3",
"category2", "category4", "category6", "category4", "category6",
"category1", "category4", "category4", "category3", "category4",
"category5", "category5", "category6", "category4", "category3",
"category5", "category3", "category3", "category1"), count = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0)), row.names = c(NA,
-100L), class = c("tbl_df", "tbl", "data.frame"))
When I run the following:
pivot_sample %>%
ggplot(aes(x=group,fill=category))+
geom_bar()
the default stat_count() function seems to work fine with the default position="stack".
However, when I switch to position="dodge" in the code below:
pivot_sample %>%
ggplot(aes(x=group,y=count,fill=category))+
geom_bar(position = "dodge",stat = "identity")
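it does not tally the count variable.
I'm sure I'm missing something basic and could use another perspective.
Do I need to use a count function for the y= argument in aes()?
Any help would be appreciated!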
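OP, the simple answer here is to add position="dodge" to your original plot code. The bars are then dodged according to the group aesthetic (you did not specify one, so the bar geom falls back to the fill aesthetic for grouping):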
pivot_sample %>%
ggplot(aes(x=group, fill=category)) +
geom_bar(position='dodge')
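The reason is that the default for the stat argument of geom_bar is stat="count". This counts all the observations and plots the "count" along the y axis. To access it, you can use the .. notation: ..count.., but that is not necessary for geom_bar(). So the code below shows you a long-form way of producing the same plot: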
pivot_sample %>%
ggplot(aes(x=group, fill=category)) +
geom_bar(position='dodge', aes(y=..count..), stat="count")
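Note that your data frame has a column named "count", but pivot_sample$count is not what you access when you specify ..count... What you access is the result computed after the stat="count" function has run.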
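To make the difference concrete, here is a minimal sketch of what stat="count" computes, assuming hypothetical toy data (toy) that only mimics the three columns of pivot_sample. The table() call is my own stand-in for the row counting the stat does internally. The plot uses after_stat(count), the replacement for the ..count.. notation, which ggplot2 3.4.0 and later deprecate.

```r
library(ggplot2)

# Hypothetical toy data with the same three columns as pivot_sample
toy <- data.frame(
  group    = c("group1", "group1", "group1", "group2", "group2"),
  category = c("category1", "category1", "category2", "category1", "category2"),
  count    = c(0, 1, 0, 0, 0)
)

# Number of rows per group / category combination; the "count" column
# of the data plays no part in this
row_counts <- as.data.frame(
  table(group = toy$group, category = toy$category),
  responseName = "n"
)

# Same plot as the long form above, with after_stat() instead of ..count..
p <- ggplot(toy, aes(x = group, fill = category)) +
  geom_bar(aes(y = after_stat(count)), position = "dodge")
```

Printing row_counts next to the plot shows that the bar heights follow the number of rows, not the values stored in toy$count.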
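What happens when you use stat="identity"? The "identity" stat plots the actual values on the y axis. You specified y=count, so the values of the pivot_sample$count column are plotted for each group and category. geom_bar with stat="identity" is the same as geom_col() (which is what should be used in this case), and it requires both the x and y aesthetics to be defined. With the default position="stack", the rows that fall in one bar are stacked, so "identity" effectively sums all values of the y aesthetic, here pivot_sample$count. With position="dodge", rows that share the same x and fill are not summed but drawn on top of one another, so the visible bar is only as tall as the largest value.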
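Whether the values end up totaled depends on the position adjustment, and this can be checked by inspecting the built layer data. The following is a sketch under assumptions: toy is hypothetical data standing in for pivot_sample, and ggplot_build() is used only to read back the bar tops (the ymax column) that ggplot2 computed.

```r
library(ggplot2)

# Hypothetical toy data: group1 / category1 has two rows with count = 1
toy <- data.frame(
  group    = c("group1", "group1", "group1", "group2", "group2"),
  category = c("category1", "category1", "category2", "category1", "category2"),
  count    = c(1, 1, 0, 0, 1)
)

base_plot <- ggplot(toy, aes(x = group, y = count, fill = category))

# Stacked: rows pile up, so the top of a bar is the running total
p_stack <- base_plot + geom_col(position = "stack")

# Dodged: rows with the same x and fill are overplotted, not added
p_dodge <- base_plot + geom_col(position = "dodge")

# Highest bar top in each version
top_stack <- max(ggplot_build(p_stack)$data[[1]]$ymax)
top_dodge <- max(ggplot_build(p_dodge)$data[[1]]$ymax)
```

Comparing top_stack with top_dodge shows the stacked version reaching the total for group1 while the dodged version stops at the largest single value.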
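In the plot you made with stat="identity", the bar heights come from the pivot_sample$count values that fall in each bar rather than from the number of rows. Most of the values in that column are 0 and only a few are 1, which is why the plot looks the way it does.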
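Note that geom_bar() with stat="count" counts observations, whereas stat="identity" plots the values themselves, which add up to a total only when the bars are stacked.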
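If the goal is dodged bars whose heights are the totals of the count column, one option is to total the column per group and category first and pass one row per bar to geom_col(). This is a minimal sketch, assuming hypothetical toy data in place of pivot_sample. Using base R aggregate() for the summary is my choice here, not something taken from the question.

```r
library(ggplot2)

# Hypothetical toy data standing in for pivot_sample
toy <- data.frame(
  group    = c("group1", "group1", "group1", "group2", "group2"),
  category = c("category1", "category1", "category2", "category1", "category2"),
  count    = c(1, 1, 0, 0, 1)
)

# Total the count column within each group / category combination
totals <- aggregate(count ~ group + category, data = toy, FUN = sum)

# One row per bar, so dodging has nothing to overplot
p <- ggplot(totals, aes(x = group, y = count, fill = category)) +
  geom_col(position = "dodge")
```

Because totals holds exactly one row per bar, the result no longer depends on how the position adjustment treats several rows in the same bar.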