Turn a named vector into a symmetric matrix in R?

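Similar questions have been asked, but none of them has the added element of having to split the vector names, so I am asking a new question.

I am trying to turn a named vector into a symmetric matrix in R. Each name in my vector identifies the pair of variables that the value belongs to, so I need to split the names into their component parts.

For example, if my data looks like this: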

v <- c(
  "x1 x2" = 0.81899860,
  "x1 x3" = 0.10764701,
  "x2 x3" = 0.03923967,
  "x1 x4" = 0.03457240,
  "x2 x4" = 0.05954789,
  "x3 x4" = 0.15535316,
  "x1 x5" = 0.04041266,
  "x2 x5" = 0.05421003,
  "x3 x5" = 0.09198977,
  "x4 x5" = 0.15301872
)

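We can see that each name is a combination of two variables. I am trying to turn this into a symmetric matrix (with zeros on the diagonal). For clarity, my desired output looks like this: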

           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

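Any suggestions as to how I could do this?

EDIT

One of the answers highlighted that my question was too vague, so I am editing to address that. I am looking for a general solution to this problem, whatever the names in the vector are. For example, my named vector could look like this: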

v <- c(
  "apple banana" = 0.81899860,
  "apple orange" = 0.10764701,
  "banana orange" = 0.03923967,
  "apple pear" = 0.03457240,
  "banana pear" = 0.05954789,
  "orange pear" = 0.15535316,
  "apple plum" = 0.04041266,
  "banana plum" = 0.05421003,
  "orange plum" = 0.09198977,
  "pear plum" = 0.15301872
)

We can split the names, expand the data to create the missing combinations (complete), and reshape to wide format with pivot_wider:
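library(dplyr)
library(tidyr)
library(tibble)

# split each name into its two components (columns V1 and V2)
d1 <- read.table(text = names(v), header = FALSE)
# all distinct labels, used for both rows and columns
un1 <- sort(unique(unlist(d1)))
out <- d1 %>% 
   mutate(v = v) %>% 
   # add the missing pairs (including the diagonal) with value 0
   complete(V1 = un1, V2 = un1, 
     fill = list(v = 0)) %>% 
   pivot_wider(names_from = V1, values_from = v) %>% 
   column_to_rownames('V2') %>% 
   as.matrix %>% 
   # mirror the filled triangle into the empty one
   {. + t(.)}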

Output:

> out
           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

Or using base R:

d1 <- read.table(text = names(v))
un1 <- sort(unique(unlist(d1)))
m1 <- matrix(0, ncol = length(un1), nrow = length(un1), dimnames = list(un1, un1))
m2 <- xtabs(v ~ ., d1)
m1[row.names(m2), colnames(m2)] <- m2
m1 <- m1 + t(m1)

Output:

> m1
           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

Using the second example:

> m1
            apple     banana     orange       pear       plum
apple  0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
pear   0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
plum   0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

An igraph option:

library(igraph)

cbind(read.table(text = names(v)), v) %>%
  graph_from_data_frame(directed = FALSE) %>%
  get.adjacency(attr = "v", sparse = FALSE)

which gives

           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

A base R option:

> d <- read.table(text = names(v))

> xtabs(v ~ ., cbind(rbind(d, setNames(rev(d), names(d))), v = rep(v, 2)))
    V2
V1           x1         x2         x3         x4         x5
  x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
  x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
  x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
  x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
  x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

Another base R way:

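comb = names(v)
# split each name into its two labels; inds is a 2 x length(v) character matrix
inds = sapply(comb, function(x){
        c(unlist(strsplit(x = x,split = " ",fixed = TRUE)))},
        simplify = TRUE)
inds1 = rbind(inds[2,],inds[1,])

# build the zero matrix from the labels actually present, rather than hard-coding x1..x5
nms = sort(unique(c(inds)))
m = matrix(data = 0, nrow = length(nms), ncol = length(nms), dimnames = list(nms, nms))

# fill both triangles by indexing with two-column character matrices
m[t(inds)]=v
m[t(inds1)]=v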
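
The matrix-indexing idea above can also be wrapped in a small function. What follows is only a sketch: the name pairs_to_matrix and the sep argument are made up for illustration, and it assumes every name splits into exactly two labels.

```r
# Hypothetical helper (not from the answer above); `sep` is an assumed
# parameter for the separator between the two labels in each name.
pairs_to_matrix <- function(v, sep = " ") {
  # one row per element of v, one column per label (assumes exactly two labels)
  parts <- do.call(rbind, strsplit(names(v), sep, fixed = TRUE))
  nms <- sort(unique(c(parts)))
  m <- matrix(0, length(nms), length(nms), dimnames = list(nms, nms))
  m[parts] <- v                        # (first, second) cells
  m[parts[, 2:1, drop = FALSE]] <- v   # mirrored (second, first) cells
  m
}

# small made-up example
fruit <- c("apple banana" = 0.8, "apple orange" = 0.1, "banana orange" = 0.04)
res <- pairs_to_matrix(fruit)
res
```

Indexing a matrix with a two-column character matrix matches each row against the row and column names, so no numeric positions have to be computed.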

1) We use scan to get the vertices v, and then a nested sapply to build the required matrix. No packages are used.

edge2adj <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  sapply(v, function(i) sapply(v, function(j) 
    Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0) )))
}


# tests where v1 and v2 are the two examples in the question

edge2adj(v1)
##            x1         x2         x3         x4         x5
## x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

edge2adj(v2)
##             apple     banana     orange       pear       plum
## apple  0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## pear   0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## plum   0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
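
Because edge2adj looks up both "i j" and "j i" for every cell, it should not depend on the order of the elements or on which label of a pair comes first. A minimal sketch of that, using a small made-up vector e (not from the question) and repeating the function so the snippet runs on its own:

```r
# same definition as above, repeated so this snippet is self-contained
edge2adj <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  sapply(v, function(i) sapply(v, function(j) 
    Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0) )))
}

# made-up input: pairs in arbitrary order, labels within a pair not sorted
e <- c("b a" = 1, "c b" = 2, "a c" = 3)
res <- edge2adj(e)
res
isSymmetric(res)
```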

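2) (1) is probably preferable to this alternative because it is more general, but we point out that if we know the edges are in the order shown in the question (sorted in upper-triangular order), then we can use upper.tri like this. No packages are used.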

edge2adj2 <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  m <- sapply(v, function(i) sapply(v, function(j) 0))
  m[upper.tri(m)] <- e
  m + t(m)
}

identical(edge2adj(v1), edge2adj2(v1))
## [1] TRUE

identical(edge2adj(v2), edge2adj2(v2))
## [1] TRUE
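
Since edge2adj2 relies on the elements being in upper-triangular order, it may be worth checking that assumption before using it. A rough sketch of such a check follows; the helper name is_upper_tri_order is made up here and is not part of the answer above.

```r
# Hypothetical helper: TRUE if names(e) are in the same (column-major)
# order in which upper.tri fills the upper triangle of the sorted labels.
is_upper_tri_order <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  n <- length(v)
  # positions of the upper triangle, in the order `m[upper.tri(m)] <- e` fills them
  idx <- which(upper.tri(matrix(0, n, n)), arr.ind = TRUE)
  expected <- paste(v[idx[, "row"]], v[idx[, "col"]])
  length(e) == length(expected) && all(names(e) == expected)
}

# first example from the question
v1 <- c(
  "x1 x2" = 0.81899860, "x1 x3" = 0.10764701, "x2 x3" = 0.03923967,
  "x1 x4" = 0.03457240, "x2 x4" = 0.05954789, "x3 x4" = 0.15535316,
  "x1 x5" = 0.04041266, "x2 x5" = 0.05421003, "x3 x5" = 0.09198977,
  "x4 x5" = 0.15301872
)
ok_original <- is_upper_tri_order(v1)
ok_reversed <- is_upper_tri_order(rev(v1))
```

If the check fails, the order-independent edge2adj from (1) is the safer choice.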

Note

v1 <- c(
  "x1 x2" = 0.81899860,
  "x1 x3" = 0.10764701,
  "x2 x3" = 0.03923967,
  "x1 x4" = 0.03457240,
  "x2 x4" = 0.05954789,
  "x3 x4" = 0.15535316,
  "x1 x5" = 0.04041266,
  "x2 x5" = 0.05421003,
  "x3 x5" = 0.09198977,
  "x4 x5" = 0.15301872
)

v2 <- c(
  "apple banana" = 0.81899860,
  "apple orange" = 0.10764701,
  "banana orange" = 0.03923967,
  "apple pear" = 0.03457240,
  "banana pear" = 0.05954789,
  "orange pear" = 0.15535316,
  "apple plum" = 0.04041266,
  "banana plum" = 0.05421003,
  "orange plum" = 0.09198977,
  "pear plum" = 0.15301872
)