Modify a character variable by matching a mapping structure in R

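I need to modify a character variable based on a mapping data frame, embedding the ordinal positions of the matrix diagonals into it.

Here is what my three matrices look like.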

# for Group 1
Group.1 <- c(11,12,13,14,15)
diag <- rep("Free",length(Group.1)+1)
offdiag <- rep("0.0", (length(Group.1)+1)*length(Group.1)/2 )
m1 <- matrix(NA, ncol = length(diag), nrow = length(diag))
m1[lower.tri(m1)] <- offdiag
m1[upper.tri(m1)] <- t(m1)[upper.tri(t(m1))]
diag(m1) <- diag
m1[upper.tri(m1)] <- NA

> m1
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]  
[1,] "Free" NA     NA     NA     NA     NA    
[2,] "0.0"  "Free" NA     NA     NA     NA    
[3,] "0.0"  "0.0"  "Free" NA     NA     NA    
[4,] "0.0"  "0.0"  "0.0"  "Free" NA     NA    
[5,] "0.0"  "0.0"  "0.0"  "0.0"  "Free" NA    
[6,] "0.0"  "0.0"  "0.0"  "0.0"  "0.0"  "Free"

inds.1 <- which(na.omit(c(t(m1))) == 'Free')[-1] - 1
> inds.1
[1]  2  5  9 14 20

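inds.1, inds.2, and inds.3 store the ordinal positions of the Free entries in each matrix, counted row by row over the lower triangle. Note that counting starts at 0 for the first Free. That is why the position of the second Free is 2 rather than 3, the position of the third Free is 5, and so on.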
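The same positions can be computed directly, without building the matrix. The following is a minimal sketch, assuming the lower triangle (diagonal included) is read row by row and counted from 0; `free_positions` is a hypothetical helper name that does not appear in the original code.

```r
# Rows 1..r of a lower-triangular matrix hold r * (r + 1) / 2 cells in total,
# so the diagonal cell of row r (the last cell of that row) has the
# zero-based position r * (r + 1) / 2 - 1.
free_positions <- function(n_items) {
  r <- seq_len(n_items) + 1   # skip row 1, whose "Free" has position 0
  r * (r + 1) / 2 - 1
}

free_positions(5)  # 2 5 9 14 20, the same values as inds.1
free_positions(3)  # 2 5 9, the same values as inds.2 and inds.3
```

Only the number of items in a group matters here, which is why Group 2 and Group 3 end up with identical positions.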

# for Group 2
Group.2 <- c(11,13,15)
diag <- rep("Free",length(Group.2)+1)
offdiag <- rep("0.0", (length(Group.2)+1)*length(Group.2)/2 )
m2 <- matrix(NA, ncol = length(diag), nrow = length(diag))
m2[lower.tri(m2)] <- offdiag
m2[upper.tri(m2)] <- t(m2)[upper.tri(t(m2))]
diag(m2) <- diag
m2[upper.tri(m2)] <- NA

inds.2 <- which(na.omit(c(t(m2))) == 'Free')[-1] - 1
> inds.2
[1] 2 5 9

# for Group 3
Group.3 <- c(12,13,14)
diag <- rep("Free",length(Group.3)+1)
offdiag <- rep("0.0", (length(Group.3)+1)*length(Group.3)/2 )
m3 <- matrix(NA, ncol = length(diag), nrow = length(diag))
m3[lower.tri(m3)] <- offdiag
m3[upper.tri(m3)] <- t(m3)[upper.tri(t(m3))]
diag(m3) <- diag
m3[upper.tri(m3)] <- NA

inds.3 <- which(na.omit(c(t(m3))) == 'Free')[-1] - 1
> inds.3
[1] 2 5 9
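Since the three blocks above differ only in the group vector, the construction could be wrapped in a function. This is a sketch under that assumption; `free_inds` and the `inds` list are hypothetical names introduced for illustration.

```r
# Hypothetical helper: zero-based, row-wise positions of the "Free" diagonal
# cells (excluding the first one) for a group of items.
free_inds <- function(group) {
  n <- length(group) + 1
  m <- matrix("0.0", nrow = n, ncol = n)
  diag(m) <- "Free"
  m[upper.tri(m)] <- NA
  # t() makes c() read the matrix row by row; na.omit() drops the upper triangle
  which(na.omit(c(t(m))) == "Free")[-1] - 1
}

inds <- lapply(list(Group.1 = c(11, 12, 13, 14, 15),
                    Group.2 = c(11, 13, 15),
                    Group.3 = c(12, 13, 14)),
               free_inds)

inds$Group.1  # 2 5 9 14 20
inds$Group.2  # 2 5 9
```

Keeping the results in one named list avoids looking up separate objects by name later on.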

# create grouping map
map.1 <- as.data.frame(cbind(Items=Group.1, Group.1 = 1))
map.2 <- as.data.frame(cbind(Items=Group.2, Group.2 = 1))
map.3 <- as.data.frame(cbind(Items=Group.3, Group.3 = 1))

group.map.12 <- merge(map.1, map.2, by="Items", all = TRUE)
group.map.all <- merge(group.map.12, map.3, by="Items", all = TRUE)
group.map.all[is.na(group.map.all)] <- 0

> group.map.all
  Items Group.1 Group.2 Group.3
1    11       1       1       0
2    12       1       0       1
3    13       1       1       1
4    14       1       0       1
5    15       1       1       0

This map tells us which item belongs to which group.
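As a quick illustration of how such a map can be read, the sketch below rebuilds an equivalent 0/1 table and asks which groups contain item 12. The names `groups`, `map_check`, and `groups_with_12` are hypothetical, and the table is built with `%in%` rather than `merge()` only to keep the example short.

```r
groups <- list(Group.1 = c(11, 12, 13, 14, 15),
               Group.2 = c(11, 13, 15),
               Group.3 = c(12, 13, 14))

items <- sort(unique(unlist(groups, use.names = FALSE)))

# 1 if the item is in the group, 0 otherwise (same layout as group.map.all)
membership <- sapply(groups, function(g) as.numeric(items %in% g))
map_check <- data.frame(Items = items, membership)

# Groups whose column holds a 1 in the row for item 12
row_12 <- unlist(map_check[map_check$Items == 12, -1])
groups_with_12 <- names(row_12)[row_12 == 1]
groups_with_12  # "Group.1" "Group.3"
```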

Based on this information, I was able to create the following output, but I need to make some modifications to it:

output <- c("Equal = (G1, 11, Covariance[X]), (G2, 11, Covariance[X]);",                         
"Equal = (G1, 12, Covariance[X]), (G3, 12, Covariance[X]);",                         
"Equal = (G1, 13, Covariance[X]), (G2, 13, Covariance[X]), (G3, 13, Covariance[X]);",
"Equal = (G1, 14, Covariance[X]), (G3, 14, Covariance[X]);",                         
"Equal = (G1, 15, Covariance[X]), (G2, 15, Covariance[X]);") 

> output
[1] "Equal = (G1, 11, Covariance[X]), (G2, 11, Covariance[X]);"                         
[2] "Equal = (G1, 12, Covariance[X]), (G3, 12, Covariance[X]);"                         
[3] "Equal = (G1, 13, Covariance[X]), (G2, 13, Covariance[X]), (G3, 13, Covariance[X]);"
[4] "Equal = (G1, 14, Covariance[X]), (G3, 14, Covariance[X]);"                         
[5] "Equal = (G1, 15, Covariance[X]), (G2, 15, Covariance[X]);"   

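Two things need to change in this output.

  1. Remove the item numbers together with their commas (so 11, 12, etc. need to be removed).
  2. For each line in the output, [X] needs to be replaced by the matching matrix position. For example, according to the map, Item=12 is in G1 and G3. In Group 1, its matrix position is 5, which comes from the inds.1 object. In Group 3, the matrix position for Item=12 is 2, which comes from the inds.3 object. I need to embed these numbers in [X].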
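The lookup for Item=12 can be written out explicitly. This sketch assumes that the i-th item of a group corresponds to the i-th entry of that group's inds vector, which is what the example implies; the values are copied from the objects shown earlier.

```r
Group.1 <- c(11, 12, 13, 14, 15)
Group.3 <- c(12, 13, 14)
inds.1 <- c(2, 5, 9, 14, 20)
inds.3 <- c(2, 5, 9)

# match() returns the position of the item within its group;
# that position then indexes the group's vector of "Free" positions.
inds.1[match(12, Group.1)]  # 5
inds.3[match(12, Group.3)]  # 2
```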

The desired output would be:

Equal=(G1,Covariance[2]),(G2,Covariance[2]);
Equal=(G1,Covariance[5]),(G3,Covariance[2]);
Equal=(G1,Covariance[9]),(G2,Covariance[5]),(G3,Covariance[5]);
Equal=(G1,Covariance[14]),(G3,Covariance[9]);
Equal=(G1,Covariance[20]),(G2,Covariance[9]);

Any ideas? Thanks!

Maybe this helps -

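  1. Loop across the 'Group' columns and replace the entries that are TRUE after converting the binary values to logical with the values of the corresponding 'inds' object
  2. Reshape to long format - pivot_longer
  3. Remove the rows where the 'value' column is 0 - filter
  4. Remove the substring from the 'name' column - str_remove
  5. Create a new formatted column from 'name' and 'value' - sprintf
  6. Group by 'Items' and paste the 'new' column together - str_c
  7. Extract the column as a vector - pull
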
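library(stringr)
library(dplyr)
library(tidyr)

group.map.all %>%
    # swap each 1 for the matching entry of inds.1 / inds.2 / inds.3
    mutate(across(starts_with('Group'),
     ~ replace(., as.logical(.), 
        get(str_replace(cur_column(), "Group", "inds"))))) %>% 
    # one row per (item, group) pair
    pivot_longer(cols = -Items) %>%
    # drop the pairs where the item is not in the group
    filter(value != 0) %>% 
    # "Group.1" -> "G1"
    mutate(name = str_remove(name, "[a-z.]+")) %>%
    # mutate() instead of summarise(): dplyr >= 1.1.0 deprecates
    # summarise() results with more than one row per group
    mutate(new = sprintf('(%s,Covariance[%d])', name, value)) %>%  
    group_by(Items) %>% 
    summarise(new = str_c('Equal=',str_c(new, collapse=","), ";")) %>%
    pull(new)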

-output

[1] "Equal=(G1,Covariance[2]),(G2,Covariance[2]);"  
[2] "Equal=(G1,Covariance[5]),(G3,Covariance[2]);"                   
[3] "Equal=(G1,Covariance[9]),(G2,Covariance[5]),(G3,Covariance[5]);"
[4] "Equal=(G1,Covariance[14]),(G3,Covariance[9]);"                  
[5] "Equal=(G1,Covariance[20]),(G2,Covariance[9]);"
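For comparison, the same strings can be built in base R with `match()`. This is only a sketch of an alternative, not the approach used above; it assumes the groups and their positions are kept in two named lists (`groups` and `inds`, both hypothetical names) whose entries line up item by item.

```r
groups <- list(G1 = c(11, 12, 13, 14, 15),
               G2 = c(11, 13, 15),
               G3 = c(12, 13, 14))
inds <- list(G1 = c(2, 5, 9, 14, 20),
             G2 = c(2, 5, 9),
             G3 = c(2, 5, 9))

items <- sort(unique(unlist(groups, use.names = FALSE)))

result <- vapply(items, function(item) {
  # keep only the groups that contain this item
  in_group <- vapply(groups, function(g) item %in% g, logical(1))
  labels <- names(groups)[in_group]
  # look up the item's position within each of those groups
  terms <- vapply(labels, function(lab) {
    pos <- inds[[lab]][match(item, groups[[lab]])]
    sprintf("(%s,Covariance[%d])", lab, as.integer(pos))
  }, character(1))
  paste0("Equal=", paste(terms, collapse = ","), ";")
}, character(1))

result
```

Because the lookup goes through `match()`, this version does not depend on the rows of the map being sorted in the same order as the group vectors.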