Difference in closeness function in R and manual computation

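I have an undirected weighted graph on which I want to compute closeness centrality. According to the igraph documentation, closeness is the inverse of the average shortest-path length. I computed the shortest paths and took the inverse of their mean, but I still don't get the same values as the closeness function. Why is that? What am I missing?

Here is my code: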

dput(c$estimate)
structure(c(1, 10000, 10000, 2.69857209553848, 5.77115055524614, 
1.95672007809809, 2.98690863617922, 1.92161847347611, 10000, 
10000, 10000, 10000, 1, 1.97201563662035, 5.4078452590091, 10000, 
6.8534542161595, 3.51453278996925, 10000, 10000, 2.08964950396744, 
10000, 10000, 1.97201563662034, 1, 2.78868220464485, 10000, 3.41857460835551, 
10000, 1.96044036389546, 10000, 10000, 10000, 2.69857209553835, 
5.40784525900909, 2.78868220464486, 1, 10000, 10000, 3.54317409176484, 
10000, 2.33889236077342, 10000, 10000, 5.77115055524604, 10000, 
10000, 10000, 1, 10000, 10000, 10000, 10000, 10000, 10000, 1.95672007809807, 
6.85345421615961, 3.41857460835555, 10000, 10000, 1, 10000, 10000, 
2.49075030691086, 10000, 10000, 2.98690863617922, 3.51453278996926, 
10000, 3.54317409176474, 10000, 10000, 1, 10000, 10000, 10000, 
1.73687483250751, 1.92161847347613, 10000, 1.96044036389548, 
10000, 10000, 10000, 10000, 1, 4.24032760636799, 3.11756167665886, 
5.07827243244947, 10000, 10000, 10000, 2.33889236077345, 10000, 
2.49075030691088, 10000, 4.24032760636804, 1, 10000, 1.69643890905686, 
10000, 2.08964950396742, 10000, 10000, 10000, 10000, 10000, 3.11756167665892, 
10000, 1, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 1.73687483250752, 
5.0782724324492, 1.69643890905687, 10000, 1), .Dim = c(11L, 11L
), .Dimnames = list(c("jpm", "gs", "ms", "bofa", "schwab", "brk", 
"wf", "citi", "amex", "spgl", "pnc"), c("jpm", "gs", "ms", "bofa", 
"schwab", "brk", "wf", "citi", "amex", "spgl", "pnc")))

library(igraph)

g <- graph_from_adjacency_matrix(c$estimate, weighted = "wt", mode = "undirected", diag = FALSE)

closeness(g,weights= round(E(g)$wt,2))
       jpm         gs         ms       bofa     schwab        brk         wf       citi 
0.02503756 0.01877229 0.02203614 0.02151463 0.01088495 0.02189621 0.02226180 0.02418380 
      amex       spgl        pnc 
0.01988072 0.01632387 0.01913509 

# manual
a <- shortest.paths(g,weights=round(E(g)$wt,2))
1/rowMeans(a)
      jpm        gs        ms      bofa    schwab       brk        wf      citi      amex 
0.2799695 0.2143414 0.2435245 0.2457002 0.1205876 0.2408583 0.2448798 0.2660218 0.2276490 
     spgl       pnc 
0.1855914 0.2140078 

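There are two things you may want to check:

  1. You should set normalized = TRUE in closeness.
  2. When you define closeness centrality in terms of shortest-path lengths, the average is taken over the distances to the other vertices, excluding the vertex itself. The denominator of the average is therefore vcount(g) - 1, not vcount(g), which is why rowMeans should not be used.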
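To make the two points concrete, here is the relationship written out. The notation is mine, introduced only for illustration: n = vcount(g), and d(v, u) is the weighted shortest-path distance between v and u, with d(v, v) = 0. I am assuming a connected graph, as in the question.

```latex
\begin{align*}
\text{closeness}(v) &= \frac{1}{\sum_{u \neq v} d(v,u)} \\[4pt]
\text{closeness}_{\text{normalized}}(v) &= \frac{n-1}{\sum_{u \neq v} d(v,u)} \\[4pt]
\frac{1}{\operatorname{rowMeans}(d)_v} &= \frac{n}{\sum_{u} d(v,u)} = \frac{n}{\sum_{u \neq v} d(v,u)}
\end{align*}
```

So 1/rowMeans is n times the unnormalized closeness and n/(n - 1) times the normalized one. With n = 11 that accounts for the factor seen in the question, for example brk: 0.2408583 / 0.02189621 is about 11.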

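As the output below shows, the two approaches give close results. Some values (brk, wf, citi) match exactly while others differ slightly; this may be a precision issue, but I'm not sure.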

> closeness(g,weights = E(g)$wt,normalized = TRUE)
      jpm        gs        ms      bofa    schwab       brk        wf      citi 
0.2504451 0.1876864 0.2203154 0.2151935 0.1088503 0.2190827 0.2226391 0.2418350
     amex      spgl       pnc
0.1988941 0.1632546 0.1914826

> (vcount(g) - 1) / rowSums(shortest.paths(g, weights = E(g)$wt))
      jpm        gs        ms      bofa    schwab       brk        wf      citi
0.2545725 0.1947856 0.2213624 0.2234093 0.1096228 0.2190827 0.2226391 0.2418350 
     amex      spgl       pnc
0.2070431 0.1687258 0.1946688
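If you want to check the formula on something small enough to verify by hand, here is a minimal sketch on a toy graph. The data and the names edges, toy, manual, builtin and naive are made up for illustration and are not from the question. It uses distances(), which as far as I know is the current name for shortest.paths() in recent igraph versions.

```r
library(igraph)

# Hypothetical toy data: a path a - b - c with edge weights 1 and 2.
edges <- data.frame(from = c("a", "b"), to = c("b", "c"), wt = c(1, 2))
toy <- graph_from_data_frame(edges, directed = FALSE)

# Weighted shortest-path distance matrix; the diagonal is 0.
d <- distances(toy, weights = E(toy)$wt)

# Average over the other n - 1 vertices, not over all n.
manual <- (vcount(toy) - 1) / rowSums(d)
builtin <- closeness(toy, weights = E(toy)$wt, normalized = TRUE)

# Dividing by n (what rowMeans does) inflates each value by n / (n - 1).
naive <- 1 / rowMeans(d)

print(rbind(manual, builtin, naive))
```

By hand: from a the distances are 1 and 3, so the normalized closeness is 2 / 4 = 0.5, while 1/rowMeans gives 3 / 4 = 0.75. On this connected toy graph manual and builtin should agree.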