What does a single number mean when passed as parameter 'breaks' in an R histogram?

I am learning to plot histograms in R, but I have a question about passing a single number as the 'breaks' argument. The help page says:

breaks: a single number giving the number of cells for the histogram

I ran the following experiment:

data("women")
hist(women$weight, breaks = 7)

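I expected it to give me 7 bins, but the result is not what I expected: it gave me 10 bins.

Do you know what breaks = 7 means here? What does "number of cells" in the help mean?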

If you read the help entry for the breaks argument carefully to the end, it says:

breaks
one of:

  1. a vector giving the breakpoints between histogram cells,
  2. a function to compute the vector of breakpoints,
  3. a single number giving the number of cells for the histogram,
  4. a character string naming an algorithm to compute the number of cells (see ‘Details’),
  5. a function to compute the number of cells.

In the last three cases the number is a suggestion only; the breakpoints will be set to pretty values. If breaks is a function, the x vector is supplied to it as the only argument.

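So, as you can see, n is treated only as a "suggestion". hist probably tries to get close to that value, but the result depends on the input values and on whether they can be split nicely into n buckets (it uses the function pretty to compute the breakpoints).

Therefore, the only way to force the number of bins is to supply the vector of breakpoints between the cells yourself.

For example: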
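
To see the "suggestion" behaviour directly, one can ask hist for its breakpoints without drawing anything and compare them with what pretty returns for the data range. This is a minimal sketch, assuming the default hist method; the variable names h and p are chosen here for illustration only.

```r
data("women")

# Request 7 cells, but skip plotting and just inspect what hist() chose.
h <- hist(women$weight, breaks = 7, plot = FALSE)
h$breaks           # breakpoints actually used
length(h$counts)   # number of bins actually used

# pretty() picks "round" breakpoints covering the range, using about n intervals.
p <- pretty(range(women$weight), n = 7)
p
```

If the two vectors match, the bin count is being driven by pretty's choice of round step size rather than by the requested number, which is consistent with the 10 bins reported in the question.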

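data("women")
n <- 7                       # desired number of bins
minv <- min(women$weight)
maxv <- max(women$weight)
# n equal-width bins need n + 1 breakpoints: minv, n - 1 interior points, maxv
breaks <- c(minv, minv + cumsum(rep.int((maxv - minv) / n, n-1)), maxv)
hist(women$weight, breaks = breaks)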
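
A quick way to confirm that an explicit breakpoint vector really fixes the bin count is to inspect the histogram object instead of the plot. The sketch below builds n + 1 evenly spaced breakpoints with seq and length.out, which should be a more compact way to get equal-width bins. The names brks and h are illustrative, not part of the original example.

```r
data("women")
n <- 7

# n equal-width bins need n + 1 breakpoints spanning the data range.
brks <- seq(min(women$weight), max(women$weight), length.out = n + 1)

# plot = FALSE returns the histogram object without drawing it.
h <- hist(women$weight, breaks = brks, plot = FALSE)
length(h$counts)   # number of bins
sum(h$counts)      # every observation should fall into some bin
```

Note that the resulting breakpoints are no longer "pretty" values, so the axis labels may look less tidy than with the default behaviour.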