How to make a bar chart using two variables on the x-axis and a grouped variable on the y-axis?
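I hope I am asking the question the right way this time! If not, let me know!
I want to code a grouped bar chart similar to this one (I just created it in Paint):
[image]
I drew both as flipped, but it does not really matter whether the chart is flipped or not, so a plot similar to this would also be very useful.
The two variables happy and lifesatisfied are both scaled values from 0 to 10. Working hours is a grouped value containing 43+, 37-42, 33-36, 27-32 and <27.
A very similar example of my data set (I only changed the values and the order, and I have many more observations):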
Working hours | happy | lifesatisfied | country |
---|---|---|---|
37-42 | 7 | 9 | DK |
<27 | 8 | 8 | SE |
43+ | 7 | 8 | DK |
33-36 | 6 | 6 | SE |
37-42 | 7 | 5 | NO |
<27 | 4 | 7 | NO |
I tried to find similar examples and, based on them, tried to code the bar chart in the following way, but it does not work:
df2 <- datafilteredwomen %>%
pivot_longer(cols = c("happy", "stflife"), names_to = "var", values_to = "Percentage")
ggplot(df2) +
geom_bar(aes(x = Percentage, y = workinghours, fill = var ), stat = "identity", position = "dodge") + theme_minimal()
The plot it gives is not correct and not what I want:
[image]
Second attempt:
forplot = datafilteredwomen %>% group_by(workinghours, happy, stflife) %>% summarise(count = n()) %>% mutate(proportion = count/sum(count))
ggplot(forplot, aes(workinghours, proportion, fill = as.factor(happy))) +
geom_bar(position = "dodge", stat = "identity", color = "black")
This gave the following plot:
[image]
Third attempt, using the ggplot2 builder add-in:
library(dplyr)
library(ggplot2)
datafilteredwomen %>%
filter(!is.na(workinghours)) %>%
ggplot() +
aes(x = workinghours, group = happy, weight = happy) +
geom_bar(position = "dodge",
fill = "#112446") +
theme_classic() + scale_y_continuous(labels = scales::percent)
This gave the following plot:
[image]
But none of my attempts is what I want. I really hope someone can help me, if possible!
Using this example data frame df:
df <- structure(list(Working.hours = c("37-42", "37-42", "<27", "<27",
"43+", "43+", "33-36", "33-36", "37-42", "37-42", "<27", "<27"
), country = c("DK", "DK", "SE", "SE", "DK", "DK", "SE", "SE",
"NO", "NO", "NO", "NO"), criterion = c("happy", "lifesatisfied",
"happy", "lifesatisfied", "happy", "lifesatisfied", "happy",
"lifesatisfied", "happy", "lifesatisfied", "happy", "lifesatisfied"
), score = c(7L, 9L, 8L, 8L, 7L, 8L, 6L, 6L, 7L, 5L, 4L, 7L)), row.names = c(NA,
-12L), class = c("tbl_df", "tbl", "data.frame"))
You can proceed like this:
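library(dplyr)
library(tidyr)    # provides pivot_longer()
library(ggplot2)

# The example `df` above is already in long format (one row per criterion
# and score), so it can be plotted directly. If your data is still wide,
# with separate `happy` and `lifesatisfied` columns, reshape it first:
# df <-
#   df %>%
#   pivot_longer(cols = c(happy, lifesatisfied),
#                names_to = 'criterion',
#                values_to = 'score'
#   )

df %>%
  ggplot(aes(x = Working.hours,
             y = score,
             fill = criterion)) +
  geom_col(position = 'dodge') +
  coord_flip()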
For choosing colours, see ?scale_fill_manual; for formatting the legend and so on, see the many existing answers to related questions on Stack Overflow.
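One thing to watch with the sample data: some combinations of working hours and criterion occur more than once (for example "37-42" appears for both DK and NO), so the dodged bars for those rows are drawn on top of each other. A minimal sketch of one way around this, assuming you want one bar per group showing the mean score, is below. The level order in `hour_levels` and the two colours are my own assumptions, not something from the question.

```r
library(dplyr)
library(ggplot2)

# Same values as the example data frame `df` above, rebuilt so this block runs on its own
df <- data.frame(
  Working.hours = rep(c("37-42", "<27", "43+", "33-36", "37-42", "<27"), each = 2),
  country       = rep(c("DK", "SE", "DK", "SE", "NO", "NO"), each = 2),
  criterion     = rep(c("happy", "lifesatisfied"), times = 6),
  score         = c(7, 9, 8, 8, 7, 8, 6, 6, 7, 5, 4, 7)
)

# Assumed display order of the groups ("27-32" has no rows in this small sample)
hour_levels <- c("<27", "27-32", "33-36", "37-42", "43+")

# One row per working-hours group and criterion, holding the mean score
df_mean <- df %>%
  mutate(Working.hours = factor(Working.hours, levels = hour_levels)) %>%
  group_by(Working.hours, criterion) %>%
  summarise(mean_score = mean(score), .groups = "drop")

p <- ggplot(df_mean, aes(x = Working.hours, y = mean_score, fill = criterion)) +
  geom_col(position = "dodge") +
  # Example colours only; any named vector of colours works here
  scale_fill_manual(values = c(happy = "#1b9e77", lifesatisfied = "#d95f02")) +
  coord_flip() +
  labs(x = "Working hours", y = "Mean score (0-10)", fill = NULL)
p
```

Whether the mean is the right summary depends on what you want to show; a median or a share of respondents could be plugged into `summarise()` in the same way.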
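After talking with the OP, I found his data source and came up with this solution. Apologies if it is a bit messy, I have only been using R for 6 months. For ease of reproduction, I pre-selected the variables used from the original data set.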
data <- structure(list(wkhtot = c(40, 8, 50, 40, 40, 50, 39, 48, 45,
16, 45, 45, 52, 45, 50, 37, 50, 7, 37, 36), happy = c(7, 8, 10,
10, 7, 7, 7, 6, 8, 10, 8, 10, 9, 6, 9, 9, 8, 8, 9, 7), stflife = c(8,
8, 10, 10, 7, 7, 8, 6, 8, 10, 9, 10, 9, 5, 9, 9, 8, 8, 7, 7)), row.names = c(NA,
-20L), class = c("tbl_df", "tbl", "data.frame"))
These are the packages needed.
library(dplyr)
library(ggplot2)
library(tidyr)    # provides pivot_longer()
Here I manipulated the data and commented my reasoning.
data <- data %>%
select(wkhtot, happy, stflife) %>% #Select the wanted variables
rename(Happy = happy) %>% #Rename for graphical sake
rename("Life Satisfied" = stflife) %>%
na.omit() %>% # remove NA values
group_by(WorkingHours = cut(wkhtot, c(-Inf, 27, 32,36,42,Inf))) %>% #Create the ranges
select(WorkingHours, Happy, "Life Satisfied") %>% #Select the variables again
pivot_longer(cols = c(`Happy`, `Life Satisfied`), names_to = "Criterion", values_to = "score") %>% # pivot the df longer for plotting
group_by(WorkingHours, Criterion)
data$Criterion <- as.factor(data$Criterion) #Make criterion a factor for graphical reasons
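A note on the binning step: with its defaults, `cut()` builds right-closed intervals, so a value of exactly 27 lands in the first bin, and the bins get labels of the interval form rather than the names used in the question. If the bins should read <27, 27-32, 33-36, 37-42 and 43+, something along these lines might work. This is only a sketch; the vector `hours` is made-up test input, and the breaks are my reading of the groups in the question.

```r
# Made-up weekly hours, chosen to hit the edges of every bin
hours <- c(40, 8, 50, 26, 27, 32, 33, 36, 37, 42, 43)

# right = FALSE makes the intervals left-closed: [-Inf,27), [27,33), [33,37), [37,43), [43,Inf)
WorkingHours <- cut(
  hours,
  breaks = c(-Inf, 27, 33, 37, 43, Inf),
  labels = c("<27", "27-32", "33-36", "37-42", "43+"),
  right  = FALSE
)

# Show each input value next to the bin it was assigned to
data.frame(hours, WorkingHours)
```

The same `breaks`, `labels` and `right` arguments can be dropped into the `cut()` call inside `group_by()` above.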
Some more data preparation.
# Creating the percentage
data.plot <- data %>%
group_by(WorkingHours, Criterion) %>%
summarise_all(sum) %>% # get the sums for score by working hours and criterion
group_by(WorkingHours) %>%
mutate(tot = sum(score)) %>%
mutate(freq =round(score/tot *100, digits = 2)) # get percentage
Creating the plot.
# Plotting
ggplot(data.plot, aes(x = WorkingHours, y = freq, fill = Criterion)) +
geom_col(position = "dodge") +
geom_text(aes(label = freq),
position = position_dodge(width = 0.9),
vjust = 1) +
xlab("Working Hours") +
ylab("Percentage")
Please let me know if there is a more concise or simpler way!!
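On the question of a shorter version: the steps above could probably be condensed into a single pipeline. The sketch below keeps the same breaks for `cut()` but plots the mean score per bin and criterion instead of the percentage, which is my own substitution and an assumption about what is wanted. The names `raw`, `plot_data` and `mean_score` are made up for this example.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same 20 rows as `data` above, rebuilt so this block runs on its own
raw <- data.frame(
  wkhtot  = c(40, 8, 50, 40, 40, 50, 39, 48, 45, 16, 45, 45, 52, 45, 50, 37, 50, 7, 37, 36),
  happy   = c(7, 8, 10, 10, 7, 7, 7, 6, 8, 10, 8, 10, 9, 6, 9, 9, 8, 8, 9, 7),
  stflife = c(8, 8, 10, 10, 7, 7, 8, 6, 8, 10, 9, 10, 9, 5, 9, 9, 8, 8, 7, 7)
)

plot_data <- raw %>%
  na.omit() %>%                                                    # drop incomplete rows
  mutate(WorkingHours = cut(wkhtot, c(-Inf, 27, 32, 36, 42, Inf))) %>%  # bin the hours
  pivot_longer(c(happy, stflife),
               names_to = "Criterion", values_to = "score") %>%   # long format for plotting
  group_by(WorkingHours, Criterion) %>%
  summarise(mean_score = mean(score), .groups = "drop")           # one row per bar

ggplot(plot_data, aes(x = WorkingHours, y = mean_score, fill = Criterion)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = round(mean_score, 1)),
            position = position_dodge(width = 0.9),
            vjust = -0.3) +
  labs(x = "Working Hours", y = "Mean score (0-10)")
```

Bins without any observations (27 to 32 hours in this sample) are dropped by `group_by()`, so they do not appear on the axis.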