R code to show the actual continuous values stored in each bin?
As a simple example, take 1000 data points (continuous values) and bin them into 10 bins (categories), with 100 data points in each bin:
x <- rnorm(1000, mean=0, sd=50)
# Next, let's say we want to create ten bins
# with equal number of observations (100), in each bin:
bins <- 10
cutpoints <- quantile(x,(0:bins)/bins)
# The cutpoints variable
# holds a vector of the cutpoints used to bin the data.
# Finally we perform the binning to form the categories variable:
binned <- cut(x,cutpoints,include.lowest=TRUE)
summary(binned)
[-152,-61] (-61,-40] (-40,-23.9]
100 100 100
(-23.9,-10.2] (-10.2,2.86] (2.86,15.4]
100 100 100
(15.4,25.9] (25.9,44.1] (44.1,64.7]
100 100 100
(64.7,186]
100
As you can see, the final summary() call gives the number of x values in each bin (i.e. 100 values per bin).
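My question:
How do I display the actual 100 x values inside each bin, together with each value's row number (or row name) in x?
What R code would produce a 3-column data frame (columns: Bin, Rowname, and Values) structured like this?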
Bin          Rowname   Values
[-152,-61]   [25]      -78.2
             [28]      -82.1
             [75]      -99.7   etc.....
(-61,-40]    [18]      -45.0
             [26]      -68.4   etc....
Thanks!
You have already done everything required, except wrapping it into a data.frame:
head(data.frame(Values=x, Bin=binned, Rowname=seq_along(x))[order(binned), ])
# Values Bin Rowname
# 2 -66.88718 [-189,-64.7] 2
# 5 -99.08521 [-189,-64.7] 5
# 8 -95.06063 [-189,-64.7] 8
# 10 -95.04592 [-189,-64.7] 10
# 15 -78.48819 [-189,-64.7] 15
# 28 -78.49396 [-189,-64.7] 28
You don't strictly need the Rowname column, though, because a data.frame keeps its row names as an attribute, i.e. rownames(yourData).
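If you want the columns in the order asked for (Bin, Rowname, Values) and the values grouped bin by bin, something along these lines might work. This is only a sketch: the set.seed() call and the names result and by_bin are my own additions for illustration, not part of the original code, and your bin labels will differ depending on the random draw.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible
x <- rnorm(1000, mean = 0, sd = 50)
bins <- 10
cutpoints <- quantile(x, (0:bins) / bins)
binned <- cut(x, cutpoints, include.lowest = TRUE)

# Three columns in the requested order, sorted by bin and then by row number
result <- data.frame(Bin = binned, Rowname = seq_along(x), Values = x)
result <- result[order(result$Bin, result$Rowname), ]

# Optional: one data frame per bin, named by the bin label
by_bin <- split(result[, c("Rowname", "Values")], result$Bin)

head(result)       # first rows of the lowest bin
head(by_bin[[1]])  # same rows, without the Bin column
```

With split() you can then pull out a single bin by position, e.g. by_bin[[3]], or by its label via names(by_bin).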