Control column widths in a ggplot2 graph with a series and inconsistent data

Question

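The artificial data in the MWE below captures the essence of a script I have written in R. As the graph produced by this code shows, one of my conditions has no "No" value to complete the series.

I have been told that I will not be allowed to use these graphs unless this last column, which lacks the second series, is made as thin as the other columns in the graph. That is a problem, because my script produces hundreds of graphs at once, complete with statistics, significance indicators, propagated error bars, and intelligent y-axis adjustments (none of which appear in the MWE, of course).

A few other comments:

The aberrant column is not guaranteed to be at the end of the graph, so manually forcing the series to change colour and reversing the order, which leaves the extra space on the right-hand side, is not reliable.
I have tried to simulate the missing data as a constant 0 so that the series "is present" but invisible. As expected, the order of the series, c(No, Yes), means this leaves a gap, which is also not acceptable. This is how the same question was answered in the following threads, but it does not work for me because of my restrictions: Consistent width for geom_bar in the event of missing data and Include space for missing factor level used in fill aesthetics in geom_boxplot
I have also tried doing this with facets, but that raised many problems of its own, including line breaks and errors in the annotations I add to the x-axis.

MWE: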

library(ggplot2)

print("Program started")

x <- c("1","2","3","1","2","3","4")
s <- c("No","No","No","Yes","Yes","Yes","Yes")
y <- c(1,2,3,2,3,4,5)
df <- as.data.frame(cbind(x,s,y))

print(df)

gg <- ggplot(data = df, aes_string(x="x", y="y", weight="y", ymin=paste0("y"), ymax=paste0("y"), fill="s"));
dodge_str <- position_dodge(width = NULL, height = NULL);
gg <- gg + geom_bar(position=dodge_str, stat="identity", size=.3, colour = "black")

print(gg)

print("Program complete - a graph should be visible.")

Answer 1

Yes, I see what happened: you need to be extra careful that factors are factors and numerics are numerics. In my case, with stringsAsFactors = FALSE, I get

str(df)
'data.frame':   7 obs. of  3 variables:
 $ x: chr  "1" "2" "3" "1" ...
 $ s: chr  "No" "No" "No" "Yes" ...
 $ y: chr  "1" "2" "3" "2" ...

dput(df)
structure(list(x = c("1", "2", "3", "1", "2", "3", "4"), s = c("No", 
"No", "No", "Yes", "Yes", "Yes", "Yes"), y = c("1", "2", "3", 
"2", "3", "4", "5")), .Names = c("x", "s", "y"), row.names = c(NA, 
-7L), class = "data.frame")

Because of the cbind-ing (sic!), there are no factors and the numerics have been turned into characters. Let us build another data frame:

dff <- data.frame(x = factor(df$x), s = factor(df$s), y = as.numeric(df$y))
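A quick check makes the coercion visible. This is a minimal base R sketch using the vectors from the question; the names m and direct are introduced here purely for illustration.

```r
x <- c("1","2","3","1","2","3","4")
s <- c("No","No","No","Yes","Yes","Yes","Yes")
y <- c(1,2,3,2,3,4,5)

# cbind() on plain vectors builds a matrix, and a matrix holds a single type,
# so the numeric y is coerced to character along with x and s.
m <- cbind(x, s, y)
typeof(m)

# Passing the vectors to data.frame() directly keeps each column's own type.
direct <- data.frame(x, s, y)
sapply(direct, class)
```

Building the data frame directly from the vectors avoids the need to convert y back with as.numeric afterwards.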

Add a "dummy" row (done manually for your example; see the expand.grid version in the linked question for how to do this automatically):

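# Bind a one-row data frame rather than a bare vector so that y stays numeric
df3 <- rbind(dff, data.frame(x = "4", s = "No", y = NA))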

Plot (I removed the extra aes):

ggplot(data = df3, aes(x, y, fill=s)) + 
  geom_bar(position=dodge_str, stat="identity", size=.3, colour="black")
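The dummy row can also be generated rather than typed by hand. The following is a minimal sketch of the expand.grid idea mentioned above, not the code from the linked question; the names all_combos and filled are hypothetical.

```r
# Rebuild the answer's data frame with proper column types.
dff <- data.frame(
  x = factor(c("1","2","3","1","2","3","4")),
  s = factor(c("No","No","No","Yes","Yes","Yes","Yes")),
  y = c(1,2,3,2,3,4,5)
)

# Every x/s combination that could appear in the plot.
all_combos <- expand.grid(x = levels(dff$x), s = levels(dff$s))

# Left join: combinations that have no data get y = NA.
filled <- merge(all_combos, dff, by = c("x", "s"), all.x = TRUE)

# Show the rows that were added as placeholders.
filled[is.na(filled$y), ]
```

filled can then be passed to the same ggplot() call in place of the manually extended data frame.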

Answer 2

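If you calculate the x coordinates of the bars yourself (as shown below), you can get a chart that may be close to what you are looking for.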

library(ggplot2)

x <- c("1","2","3","1","2","3","4")
s <- c("No","No","No","Yes","Yes","Yes","Yes")
y <- c(1,2,3,2,3,4,5)
# Build the data frame directly; cbind() would coerce y to character
df <- data.frame(x, s, y)
# Give every bar its own slot on a continuous axis, ordered by x and then s
df$x_pos[order(df$x, df$s)] <- 1:nrow(df)
# Per x value: how many bars it has and where their centre lies
x_stats <- as.data.frame.table(table(df$x), responseName="x_counts")
x_stats$center <- as.numeric(tapply(df$x_pos, df$x, mean))
df <- merge(df, x_stats, by.x="x", by.y="Var1", all=TRUE)
bar_width <- .7
# A lone bar stays on its slot; paired bars are pulled together until they touch
df$pos <- ifelse(df$x_counts == 1, df$x_pos,
                 ifelse(df$s == "No", df$x_pos + .5 - bar_width/2,
                                      df$x_pos - .5 + bar_width/2))
print(df)
gg <- ggplot(data=df, aes(x=pos, y=y, fill=s))
gg <- gg + geom_bar(position="identity", stat="identity", size=.3, colour="black", width=bar_width)
# One axis label per x value, centred under its bars
gg <- gg + scale_x_continuous(breaks=x_stats$center, labels=as.character(x_stats$Var1))
print(gg)

----- Edit -----

Modified to place the labels at the center of the bars.

This gives the following chart: [image not preserved]
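For comparison, and not taken from either answer: more recent ggplot2 releases provide position_dodge2() with a preserve argument, which may make the manual position arithmetic unnecessary. The sketch below assumes such a version is installed (the argument appeared around ggplot2 3.0.0, as far as I know) and uses geom_col() as shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

df <- data.frame(
  x = c("1","2","3","1","2","3","4"),
  s = c("No","No","No","Yes","Yes","Yes","Yes"),
  y = c(1,2,3,2,3,4,5)
)

# preserve = "single" sizes every bar as if its x value had the maximum
# number of bars, so the lone "Yes" bar at x = 4 is not stretched.
gg <- ggplot(df, aes(x = x, y = y, fill = s)) +
  geom_col(position = position_dodge2(preserve = "single"), colour = "black")

print(gg)
```

With position_dodge(preserve = "single") the bar width is also kept, but the lone bar sits to one side of its slot; position_dodge2() should keep it centred on the tick, with no dummy rows needed.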
