Turn a named vector into a symmetric matrix in R?
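Similar questions have been asked before, but none of them has the additional element of having to split the vector names, so I am asking a new question.
I am trying to turn a named vector into a symmetric matrix in R. Each name in my vector identifies the pair of variables that the value belongs to, so I need to split the names into their component parts.
For example, if my data looks like this: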
v <- c(
"x1 x2" = 0.81899860,
"x1 x3" = 0.10764701,
"x2 x3" = 0.03923967,
"x1 x4" = 0.03457240,
"x2 x4" = 0.05954789,
"x3 x4" = 0.15535316,
"x1 x5" = 0.04041266,
"x2 x5" = 0.05421003,
"x3 x5" = 0.09198977,
"x4 x5" = 0.15301872
)
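We can see that each name is a combination of two variables.
I am trying to turn this into a symmetric matrix (with zeros on the diagonal). For clarity, my desired output looks like this: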
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
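Any suggestions on how I could do this?
EDIT
Since one of the answers pointed out that my question was too vague, I am editing it to address that. I am looking for a general solution to this problem, whatever the names in the vector are. For example, my named vector could look like this: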
v <- c(
"apple banana" = 0.81899860,
"apple orange" = 0.10764701,
"banana orange" = 0.03923967,
"apple pear" = 0.03457240,
"banana pear" = 0.05954789,
"orange pear" = 0.15535316,
"apple plum" = 0.04041266,
"banana plum" = 0.05421003,
"orange plum" = 0.09198977,
"pear plum" = 0.15301872
)
We can split the names, expand the data to create the missing combinations (complete), and reshape to wide format with pivot_wider
library(dplyr)
library(tidyr)
library(stringr)
library(tibble)
d1 <- read.table(text = names(v), header = FALSE)
un1 <- sort(unique(unlist(d1)))
out <- d1%>%
mutate(v = v) %>%
complete(V1 = un1, V2 = un1,
fill = list(v = 0)) %>%
pivot_wider(names_from = V1, values_from = v) %>%
column_to_rownames('V2') %>%
as.matrix %>%
{. + t(.)}
-output
> out
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
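The final `{. + t(.)}` step matters because, before it, only one triangle of the matrix holds values and the rest is filled with 0. A minimal base R sketch of that idea, using a toy 3 x 3 matrix with hypothetical labels a, b, c (not the question's data):

```r
# Toy labels and values, assumed purely for illustration
labs <- c("a", "b", "c")
half <- matrix(0, 3, 3, dimnames = list(labs, labs))
half["a", "b"] <- 1
half["a", "c"] <- 2
half["b", "c"] <- 3

# Adding the transpose mirrors the filled triangle; the diagonal stays 0
full <- half + t(half)
full
```

This only works cleanly when each pair appears once and the diagonal is 0; otherwise values would be added together.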
Or using base R
d1 <- read.table(text = names(v))
un1 <- sort(unique(unlist(d1)))
m1 <- matrix(0, ncol = length(un1), nrow = length(un1), dimnames = list(un1, un1))
m2 <- xtabs(v ~ ., d1)
m1[row.names(m2), colnames(m2)] <- m2
m1 <- m1 + t(m1)
-output
m1
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
With the second example
> m1
apple banana orange pear plum
apple 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
pear 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
plum 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
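The square matrix m1 is allocated first because xtabs on its own only knows the labels that occur in each position of the names. A small sketch with a hypothetical three-variable vector e (an assumption for illustration) shows the row and column labels coming out different:

```r
# Hypothetical toy input, not the question's data
e <- c("a b" = 1, "a c" = 2, "b c" = 3)
d <- read.table(text = names(e))

# Cross-tabulate the values by first and second name
tab <- xtabs(e ~ ., d)
rownames(tab)  # labels that occur in the first position
colnames(tab)  # labels that occur in the second position
```

Because "c" never comes first and "a" never comes second, the table is not square over the full label set, which is why it is copied into a pre-sized matrix of zeros before adding the transpose.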
An igraph option
library(igraph)
cbind(read.table(text = names(v)), v) %>%
graph_from_data_frame(directed = FALSE) %>%
as_adjacency_matrix(attr = "v", sparse = FALSE)  # get.adjacency() is deprecated in recent igraph releases
gives
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
A base R option
> d <- read.table(text = names(v))
> xtabs(v ~ ., cbind(rbind(d, setNames(rev(d), names(d))), v = rep(v, 2)))
V2
V1 x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
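The one-liner above works by stacking the pairs on top of a copy with the two name columns swapped, so every value is tabulated in both directions. Spelled out step by step on a hypothetical toy vector e (the names swapped, both and w are illustrative, not from the answer):

```r
# Hypothetical toy input, not the question's data
e <- c("a b" = 1, "a c" = 2, "b c" = 3)
d <- read.table(text = names(e))

# Same pairs with the two columns swapped, keeping the names V1 and V2
swapped <- setNames(rev(d), names(d))

# Stack both directions and repeat the values to match
both <- cbind(rbind(d, swapped), w = rep(unname(e), 2))

# Every label now occurs in both positions, so the table is square
m <- xtabs(w ~ V1 + V2, both)
m
```

Pairs that never occur, including the diagonal, are left at 0 by xtabs.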
Another base R way:
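comb <- names(v)
# Split each name into its two parts; the result is a 2 x n character matrix
inds <- sapply(comb, function(x) strsplit(x, split = " ", fixed = TRUE)[[1]])
# Derive the labels from the data instead of hard-coding x1..x5
nms <- sort(unique(c(inds)))
m <- matrix(0, nrow = length(nms), ncol = length(nms), dimnames = list(nms, nms))
# Index with two-column character matrices of (row name, column name) pairs
m[t(inds)] <- v
m[t(inds[2:1, , drop = FALSE])] <- v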
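This approach relies on a base R feature: a matrix with dimnames can be indexed by a two-column character matrix, where each row names one (row, column) cell. A minimal sketch with hypothetical labels and values:

```r
# Toy matrix with hypothetical labels a, b, c
labs <- c("a", "b", "c")
m <- matrix(0, 3, 3, dimnames = list(labs, labs))

# Each row of idx is one (row name, column name) pair
idx <- rbind(c("a", "b"),
             c("b", "c"))
m[idx] <- c(10, 20)

# Swapping the two columns addresses the mirrored cells
m[idx[, 2:1, drop = FALSE]] <- c(10, 20)
m
```

Assigning both directions this way avoids the transpose step, at the cost of writing each value twice.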
1) We use scan to generate the vertices v and then use a nested sapply to build the required matrix. No packages are used.
edge2adj <- function(e) {
v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
sapply(v, function(i) sapply(v, function(j)
Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0) )))
}
# tests where v1 and v2 are the two examples in the question
edge2adj(v1)
## x1 x2 x3 x4 x5
## x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
edge2adj(v2)
## apple banana orange pear plum
## apple 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## pear 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## plum 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
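The core of edge2adj is the lookup inside the nested sapply: try the name "i j", then "j i", and fall back to 0. That lookup can be isolated as a small sketch; the helper name first_known and the toy vector e are assumptions for illustration only:

```r
# Hypothetical toy edges; the pair "a c" is deliberately missing
e <- c("a b" = 1, "b c" = 3)

# Return the first non-NA value among e["i j"], e["j i"] and 0
first_known <- function(i, j) {
  Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0))
}

first_known("a", "b")  # stored under "a b"
first_known("b", "a")  # found through the reversed name
first_known("a", "c")  # not present in either order, so falls back to 0
```

Because both orders are tried, the edges can appear in any order and with either name first.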
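2) (1) is probably preferable to this alternative because it is more general, but we point out that if we know the edges are in the order shown in the question (sorted in upper triangular order), then we can use upper.tri like this. No packages are used.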
edge2adj2 <- function(e) {
v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
m <- sapply(v, function(i) sapply(v, function(j) 0))
m[upper.tri(m)] <- e
m + t(m)
}
identical(edge2adj(v1), edge2adj2(v1))
## [1] TRUE
identical(edge2adj(v2), edge2adj2(v2))
## [1] TRUE
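Since edge2adj2 silently gives a wrong matrix when the edges are not in upper triangular order, it may be worth checking the order first. A possible guard, where the function name in_upper_tri_order is hypothetical and the check assumes every pair is present and written with the alphabetically smaller label first:

```r
# Check that names(e) match the column-major order of the upper triangle
in_upper_tri_order <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  # Row and column positions of the upper triangle, in the order upper.tri fills them
  idx <- which(upper.tri(diag(length(v))), arr.ind = TRUE)
  expected <- paste(v[idx[, "row"]], v[idx[, "col"]])
  identical(names(e), expected)
}

in_upper_tri_order(c("a b" = 1, "a c" = 2, "b c" = 3))
in_upper_tri_order(c("a b" = 1, "b c" = 3, "a c" = 2))
```

If the check fails, the order-independent edge2adj from (1) is the safer choice.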
Note
v1 <- c(
"x1 x2" = 0.81899860,
"x1 x3" = 0.10764701,
"x2 x3" = 0.03923967,
"x1 x4" = 0.03457240,
"x2 x4" = 0.05954789,
"x3 x4" = 0.15535316,
"x1 x5" = 0.04041266,
"x2 x5" = 0.05421003,
"x3 x5" = 0.09198977,
"x4 x5" = 0.15301872
)
v2 <- c(
"apple banana" = 0.81899860,
"apple orange" = 0.10764701,
"banana orange" = 0.03923967,
"apple pear" = 0.03457240,
"banana pear" = 0.05954789,
"orange pear" = 0.15535316,
"apple plum" = 0.04041266,
"banana plum" = 0.05421003,
"orange plum" = 0.09198977,
"pear plum" = 0.15301872
)