Violin plot: How is the adjacent value range determined, and why is it different from a boxplot?

In theory, the violin plot drawn by the vioplot package is a box plot combined with a density trace.

In the "boxplot part",

This description is taken from:

Hintze, J. L. and R. D. Nelson (1998). Violin plots: a box plot-density trace synergism. The American Statistician, 52(2):181-4.

Let me illustrate the point with a simple example:

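library(vioplot)  # provides vioplot()

b <- c(1:10, 20)  # ten evenly spaced values plus one distant point

par(mfrow = c(1, 2))     # two panels side by side
boxplot(b, range = 1.5)  # base R box plot
vioplot(b, range = 1.5)  # violin plot with the same range multiplier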

R's definition of the box plot whisker is (borrowing the wording of the ggplot help topic):

The upper whisker extends from the hinge to the highest value that is within 1.5 * IQR of the hinge, where IQR is the inter-quartile range, or distance between the first and third quartiles.
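To check this definition against the example data, one option is to ask base R for the numbers it uses when drawing the box plot. The following is a small sketch using boxplot.stats() from grDevices; the variable name s is only illustrative.

```r
b <- c(1:10, 20)

# boxplot.stats() returns the five numbers the box plot is drawn from:
# lower whisker end, lower hinge, median, upper hinge, upper whisker end.
s <- boxplot.stats(b, coef = 1.5)

s$stats  # whisker ends are actual data values
s$out    # points beyond the whiskers, drawn individually
```

Here the last element of s$stats is the largest observation that is still within 1.5 * IQR of the upper hinge, so the whisker always ends on a data point.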

Browsing the vioplot source code, we see upper[i] <- min(q3[i] + range*iqd, data.max)

So let's try to reproduce the upper whisker values:

# vioplot draws
quantile(b, 0.75) + 1.5 * IQR(b)
# 16

# boxplot draws
max(b[b <= quantile(b, 0.75) + 1.5 * IQR(b)])
# 10
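The two rules can also be put side by side in one helper. This is only a sketch: whisker_ends() is a hypothetical function written for this comparison, not part of vioplot or base R, and it assumes the lower whisker in vioplot mirrors the upper[i] line quoted above. It uses quantile() for both rules, which matches the hinges of boxplot() for this data but may differ slightly for other sample sizes.

```r
# Hypothetical helper: compare the two whisker rules for a numeric vector.
whisker_ends <- function(x, range = 1.5) {
  q1  <- quantile(x, 0.25, names = FALSE)
  q3  <- quantile(x, 0.75, names = FALSE)
  iqd <- q3 - q1
  lo  <- q1 - range * iqd  # lower fence
  hi  <- q3 + range * iqd  # upper fence
  rbind(
    # Rule seen in the vioplot source: the fence itself, capped at the data extremes
    # (lower side is an assumption made by symmetry)
    fence_capped = c(lower = max(lo, min(x)), upper = min(hi, max(x))),
    # Rule used by boxplot: the most extreme data point inside the fences
    data_point   = c(lower = min(x[x >= lo]), upper = max(x[x <= hi]))
  )
}

ends <- whisker_ends(c(1:10, 20))
ends
```

For the example data the two rows agree on the lower end and differ on the upper end, which is where the distant point pulls the fence past the last ordinary observation.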