Average distance of each variable
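The average distance of each variable can be calculated with the formula given below. Here d is the average distance between the variable of interest and its parent variables, p and q are the variable's conditional probabilities given different states of its parents, i indexes the states of the child node, and n is the number of states of the set of parent nodes.
Here is an example with a two-state parent.
What I want to compute is: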
Average {[(0.8286-0.6308)^2],[(0.1364-0.2347)^2],...,[(0.0017-0.0049)^2]}
=0.0107
When the parent has three or more states, I need to find:
Average {[(a-b)^2+(a-c)^2+(b-c)^2],....
I tried:
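Written out in general form, the quantity described above might look like the following. This is only a sketch under stated assumptions: the symbols m (number of child states) and p_{i,j} (conditional probability of child state i given parent state j) are introduced here for illustration, and dividing by the number of parent-state pairs is an assumption that follows the d function given in the answer; the three-state expression above leaves that factor out.

```latex
d \;=\; \frac{1}{m \binom{n}{2}} \sum_{i=1}^{m} \; \sum_{1 \le j < k \le n} \left( p_{i,j} - p_{i,k} \right)^{2}
```

For n = 2 there is a single pair, so this reduces to the average of (p_i - q_i)^2 over the child states, which is the two-state example.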
x1<-c(0.8286,0.1364,0.0300,0.0033,0.0017)
x2<-c(0.6308,0.2347,0.0807,0.0489,0.0049)
dist(rbind(x1,x2))
But that only gives me the Euclidean distance.
Sorry, I misunderstood at first. Here is what you can actually do:
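d <- function(mat) {
  # Row-index pairs, flattened as (a1, b1, a2, b2, ...)
  ind <- as.numeric(combn(nrow(mat), 2))
  n <- length(ind) / 2  # number of row pairs
  first <- seq(from = 1, length.out = n, by = 2)
  second <- seq(from = 2, length.out = n, by = 2)
  # Per column: sum squared differences over all pairs; then average over columns and pairs
  mean(apply(mat, 2, function(x) { y <- x[ind]; sum((y[first] - y[second])^2) })) / n
}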
For example, suppose you have this probability table:
set.seed(0); mat <- matrix(runif(20), 4, 5)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0.8966972 0.9082078 0.66079779 0.1765568 0.4976992
# [2,] 0.2655087 0.2016819 0.62911404 0.6870228 0.7176185
# [3,] 0.3721239 0.8983897 0.06178627 0.3841037 0.9919061
# [4,] 0.5728534 0.9446753 0.20597457 0.7698414 0.3800352
d(mat) # 0.1775407
For your two-state example data:
x1<-c(0.8286,0.1364,0.0300,0.0033,0.0017)
x2<-c(0.6308,0.2347,0.0807,0.0489,0.0049)
d(rbind(x1,x2)) # 0.01068956
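The dist() call from the question is closer to the target than it may look: each entry it returns is the square root of the summed squared differences between two rows. A minimal sketch, assuming the same normalization as d above (average over both row pairs and columns), squares those distances and divides by the number of columns. The function name avg_dist_via_dist is hypothetical.

```r
# Hypothetical helper: rebuild the averaged squared difference from dist().
avg_dist_via_dist <- function(mat) {
  # dist() returns one Euclidean distance per pair of rows
  pair_dist <- as.numeric(dist(mat))
  # Squaring undoes the square root; dividing by ncol averages over child states
  mean(pair_dist^2) / ncol(mat)
}

x1 <- c(0.8286, 0.1364, 0.0300, 0.0033, 0.0017)
x2 <- c(0.6308, 0.2347, 0.0807, 0.0489, 0.0049)
res <- avg_dist_via_dist(rbind(x1, x2))  # 0.01068956
```

This avoids the index bookkeeping, at the cost of computing square roots that are immediately squared away.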
Unless I have misunderstood the question, the answer is simply
mean((x1-x2)^2)
Proof:
> (x1-x2)^2
[1] 0.03912484 0.00966289 0.00257049 0.00207936 0.00001024
> 0.8286-0.6308
[1] 0.1978
> 0.1978^2
[1] 0.03912484
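The one-liner covers the two-row case. For three or more parent states, one possible extension is to let combn() apply the squared difference to every pair of rows and then take a single mean. This is a sketch, not a definitive implementation: the name avg_sq_diff is hypothetical, and averaging over the pairs as well as the columns is an assumption about the intended normalization.

```r
# Hypothetical helper: mean squared difference over all row pairs and columns.
avg_sq_diff <- function(mat) {
  # combn() calls the function once per pair of row indices;
  # the result holds one column of squared differences per pair.
  sq <- combn(nrow(mat), 2, function(idx) (mat[idx[1], ] - mat[idx[2], ])^2)
  mean(sq)
}

x1 <- c(0.8286, 0.1364, 0.0300, 0.0033, 0.0017)
x2 <- c(0.6308, 0.2347, 0.0807, 0.0489, 0.0049)
two_state <- avg_sq_diff(rbind(x1, x2))  # same value as mean((x1 - x2)^2)
```

With two rows there is only one pair, so the result coincides with mean((x1-x2)^2).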