How do I get wide data tidied into the proper format for a stacked bar facet_grid plot?
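I'm trying to get my data into a 3x3 facet_grid plot, but I'm struggling to find the right combination of tidying steps to make it work.
I can manage to get one category faceted like this: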
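# load the packages used below
library(tidyverse)   # dplyr, tidyr, forcats, ggplot2
library(hrbrthemes)  # theme_ipsum_rc()
library(ggeasy)      # easy_center_title()
# ingest some data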
df <- structure(list(Q52_bin = structure(c(3L, 2L, 2L, 2L, 2L, 2L), .Label = c("low",
"medium", "high"), class = "factor"), Q53_bin = structure(c(2L,
3L, 2L, 2L, 2L, 2L), .Label = c("low", "medium", "high"), class = "factor"),
Q57_bin = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("low",
"medium", "high"), class = "factor"), Q4 = c("A little",
"Some", "Some", "A great deal", "A lot", "Some")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
# Now let's try and develop a faceted plot using the low/med/high bins we've created above under political_lr, spirituality etc.
# make column names coherent and simplified
names(df) <- c("Q52_bin", "Q53_bin", "Q57_bin", "response")
# filter out NA values
df <- filter(df, !is.na(response))
# generate new dataframe with sums per category and sort in descending order
df <- df %>%
  dplyr::count(response, Q52_bin, sort = TRUE) %>%
  dplyr::mutate(response = forcats::fct_rev(forcats::fct_inorder(response)))
# make plot
ggplot(df, aes(x = n, y = response)) +
  geom_col(colour = "white") +
  facet_grid(rows = vars(Q52_bin)) +
  ## reduce spacing between labels and bars
  scale_x_continuous(expand = c(.01, .01)) +
  scale_fill_identity(guide = "none") +
  ## apply the hrbrthemes theme + adjust plot margin (15pt on every side)
  theme_ipsum_rc() +
  theme(plot.margin = margin(15, 15, 15, 15)) +
  easy_center_title()
Here I've used count() to drop every bin column except Q52_bin. To get the right setup, I believe I need to use pivot_longer(), like so:
# Now let's try and add in rows to represent other kinds of faceting in a 3x3 visualisation
df <- select(climate_experience_data_named, Q52_bin, Q53_bin, Q57_bin, Q4)
# make column names coherent and simplified
names(df) <- c("Q52_bin", "Q53_bin", "Q57_bin", "response")
# filter out NA values
df <- filter(df, !is.na(response))
# generate new dataframe with sums per category and sort in descending order
# commenting out percentages and labelling in plot as this will need to be handled differently in facets
# for additional faceted columns to work, we need to convert this data to long format so that bin data is integrated into counts
df <- df %>%
  pivot_longer(!response, names_to = "bin_name", values_to = "b")
df <- df %>%
  dplyr::count(response, bin_name, sort = TRUE)
# Broken plot!
ggplot(df, aes(x = bin_name, y = n)) +
geom_col(colour = "white", stat='identity') + facet_grid(rows = vars(?), cols = vars(bin_name))
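The goal is to have facet rows for "low", "medium" and "high" as shown above, facet columns for "Q52_bin", "Q53_bin" and "Q57_bin", and stacked bars within each panel for the Likert-style response levels in this data. I believe this breaks at the point where I use count() here, but I can't figure out how to reconfigure it. Obviously, the plot isn't coming along either. I suspect this is just a simple adjustment, but it seems to be beyond me!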
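One way to check where the bin levels get lost is to compare two count() calls on the long data. The following is only a sketch: it rebuilds the six-row sample under the hypothetical name `sample_df` (with the Q4 column already renamed to `response`), and `long`, `without_b` and `with_b` are illustrative names rather than anything from the original script.

```r
library(dplyr)
library(tidyr)

# Rebuild the six-row sample (hypothetical name: sample_df)
lv <- c("low", "medium", "high")
sample_df <- tibble::tibble(
  Q52_bin  = factor(c("high", "medium", "medium", "medium", "medium", "medium"), levels = lv),
  Q53_bin  = factor(c("medium", "high", "medium", "medium", "medium", "medium"), levels = lv),
  Q57_bin  = factor(rep("medium", 6), levels = lv),
  response = c("A little", "Some", "Some", "A great deal", "A lot", "Some")
)

# One row per respondent x question, with the bin level stored in `b`
long <- sample_df %>%
  pivot_longer(!response, names_to = "bin_name", values_to = "b")

# Counting without `b` collapses the low/medium/high information away
without_b <- count(long, response, bin_name)

# Counting with `b` keeps one row per response x question x bin level
with_b <- count(long, response, bin_name, b)

names(without_b)
names(with_b)
```

Comparing the two sets of column names shows that `b` only survives when it is passed to count(), so there is nothing left to put in `facet_grid(rows = vars(...))` in the first case.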
I'm not sure I fully understand the plot you ultimately want, but I think that, starting from your original df, you can do this:
names(df) <- c("Q52_bin", "Q53_bin", "Q57_bin", "response")
df %>%
  pivot_longer(!response, names_to = "bin_name", values_to = "b") %>%
  count(response, bin_name, b) %>%
  ggplot(aes(x = n, y = response)) +
  geom_col(color = 'white') +
  facet_grid(vars(bin_name), vars(b))
Output: (plot image not included)
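If the layout described in the question is what is wanted (rows for low/medium/high, columns for the three questions, one stacked bar per panel), the same counted data can be drawn with `fill = response` and the facets swapped. This is a sketch under a few assumptions rather than the approach above: the sample is rebuilt as the hypothetical `sample_df`, the constant `x = ""` is just a device to get a single bar per panel, and `drop = FALSE` is used so that the unused "low" level still gets a row.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Rebuild the six-row sample (hypothetical name: sample_df)
lv <- c("low", "medium", "high")
sample_df <- tibble::tibble(
  Q52_bin  = factor(c("high", "medium", "medium", "medium", "medium", "medium"), levels = lv),
  Q53_bin  = factor(c("medium", "high", "medium", "medium", "medium", "medium"), levels = lv),
  Q57_bin  = factor(rep("medium", 6), levels = lv),
  response = c("A little", "Some", "Some", "A great deal", "A lot", "Some")
)

# Long format, then count per response x question x bin level
counts <- sample_df %>%
  pivot_longer(!response, names_to = "bin_name", values_to = "b") %>%
  count(response, bin_name, b)

# One stacked bar per panel: bar segments are the response levels
p <- ggplot(counts, aes(x = "", y = n, fill = response)) +
  geom_col(colour = "white") +
  # rows = bin level, columns = question; keep empty factor levels as panels
  facet_grid(rows = vars(b), cols = vars(bin_name), drop = FALSE) +
  labs(x = NULL)

p
```

Because `response` is a character column here, the stacking and legend order are alphabetical; converting it to a factor with the intended Likert order before counting would control that.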