rattle "Info" score in the description of the dataset
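I am running descriptive statistics in rattle and need to know what "Info" means in the output. I could not find anything about it in the vignette. Here is an example of what I mean: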
Variable1
       n  missing   unique     Info      Sum     Mean
   89588        0        2     0.61    25735   0.2873
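We think it is a score between 0 and 1, but we cannot find an exact definition.
The describe function used in Rattle comes from the Hmisc package.
The documentation for Hmisc::describe says this about Info: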
For numeric variables, describe adds an item called Info which is a
relative information measure using the relative efficiency of a
proportional odds/Wilcoxon test on the variable relative to the same
test on a variable that has no ties. Info is related to how continuous
the variable is, and ties are less harmful the more untied values
there are. The formula for Info is one minus the sum of the cubes of
relative frequencies of values divided by one minus the square of the
reciprocal of the sample size. The lowest information comes from a
variable having only one unique values following by a highly skewed
binary variable. Info is reported to two decimal places.
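To make the quoted formula concrete, here is a minimal sketch that applies it to the variable from the question. It is written in plain Python purely for illustration and is not Hmisc's own code; the function name info_score is hypothetical, and the input is assumed to contain no missing values. The binary data is reconstructed from the reported n = 89588 and Sum = 25735.

```python
from collections import Counter


def info_score(values):
    """Relative information measure as described in the quoted text:
    (1 - sum of cubed relative frequencies) / (1 - 1/n^2).
    Hypothetical helper; assumes `values` has no missing entries."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    counts = Counter(values)
    # Sum of the cubes of the relative frequency of each distinct value
    sum_cubes = sum((c / n) ** 3 for c in counts.values())
    return (1 - sum_cubes) / (1 - 1 / n ** 2)


# Rebuild the binary variable from the question: 25735 ones out of 89588 rows
values = [1] * 25735 + [0] * (89588 - 25735)
print(round(info_score(values), 2))  # 0.61
```

With about 28.7% ones and 71.3% zeros, the sum of cubes is roughly 0.386, so the result is roughly 0.614, which matches the 0.61 that describe reported. A constant variable gives 0, and a variable with no ties gives 1.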