R output matrix index with values in dataframe

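I am trying to find the "matrix index" for the values in the Position column of a data frame. The "matrix" I want to refer to is either a 3 x 3 or a 4 x 4 matrix, depending on the length of the Position column (1:9 for 3 x 3 and 1:16 for 4 x 4). Different groups in col1 have different lengths of Position.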

Here is a dummy data frame to demonstrate my problem.

df <- structure(list(col1 = c("group1", "group1", "group1", "group1", 
"group1", "group1", "group1", "group1", "group1", "group2", "group2", 
"group2", "group2", "group2", "group2", "group2", "group2", "group2", 
"group2", "group2", "group2", "group2", "group2", "group2", "group2", 
"group3", "group3", "group3", "group3", "group3", "group3", "group3", 
"group3", "group3", "group3", "group3", "group3", "group3"), 
    col2 = c("A", "Q", NA, "A", "K", "L", "O", "R", "J", "S", 
    "C", "S", "H", "O", "T", "Z", "D", "Y", "J", "V", "Z", "P", 
    "L", "X", "D", "K", "M", "X", "E", "P", "U", "Z", "Z", "L", 
    "W", "X", "F", "K"), Position = c(1L, 2L, 3L, 4L, 5L, 6L, 
    7L, 8L, 9L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
    12L, 13L, 14L, 15L, 16L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
    9L, 10L, 11L, 12L, 13L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -38L))

Rules

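From this data frame, I would like a new column Position_ij that gives the [i, j] (row, column) index that each Position would have if it were placed in a matrix filled row by row.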

For example, "group1" has a Position of length 9, so it should refer to a 3 x 3 matrix, and Position_ij should be 1 = "[1, 1]", 2 = "[1, 2]", 3 = "[1, 3]", 4 = "[2, 1]", ..., 9 = "[3, 3]".

For "group2", Position has length 16, so it should refer to a 4 x 4 matrix, and Position_ij should be 1 = "[1, 1]", ..., 4 = "[1, 4]", 5 = "[2, 1]", ..., 16 = "[4, 4]".

For "group3", Position has length 13, which is greater than 9, so it should also refer to a 4 x 4 matrix.
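To make the rule concrete, here is a small base R sketch. The names side, m, and ij are only for illustration and are not part of my data. The index I am after is the [row, column] position of each value in a matrix that is filled row by row.

```r
# Illustrative only: a 3 x 3 reference matrix filled row by row
side <- 3
m <- matrix(1:9, nrow = side, byrow = TRUE)

# Look up the [row, column] of a single position, e.g. Position 6
ij <- which(m == 6, arr.ind = TRUE)
sprintf("[%d, %d]", ij[1, "row"], ij[1, "col"])
```

For Position 6 in the 3 x 3 case this gives "[2, 3]", which is the value I expect in Position_ij.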

Current attempt (fails)

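My current approach uses %/% and %% to get the quotient and remainder of Position divided by the matrix dimension. However, when Position is a multiple of the dimension, the remainder is 0, but I want 3 or 4 (and the row index comes out one too high).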

library(dplyr)

df %>% group_by(col1) %>% 
  mutate(Position_ij = if (n() == 9) {
      paste0("[", (Position %/% 3) + 1, ", ", Position %% 3, "]")
    } else {
      paste0("[", (Position %/% 4) + 1, ", ", Position %% 4, "]")
      })
# A tibble: 38 × 4
# Groups:   col1 [3]
   col1   col2  Position Position_ij
   <chr>  <chr>    <int> <chr>      
 1 group1 A            1 [1, 1]     
 2 group1 Q            2 [1, 2]     
 3 group1 NA           3 [2, 0]     # this should be [1, 3]
 4 group1 A            4 [2, 1]     
 5 group1 K            5 [2, 2]     
 6 group1 L            6 [3, 0]     # this should be [2, 3]
 7 group1 O            7 [3, 1]     
 8 group1 R            8 [3, 2]     
 9 group1 J            9 [4, 0]     # this should be [3, 3]
10 group2 S            1 [1, 1]     
# … with 28 more rows

Desired output

   col1   col2  Position Position_ij
   <chr>  <chr>    <int> <chr>      
 1 group1 A            1 [1, 1]     
 2 group1 Q            2 [1, 2]     
 3 group1 NA           3 [1, 3]     
 4 group1 A            4 [2, 1]     
 5 group1 K            5 [2, 2]     
 6 group1 L            6 [2, 3]     
 7 group1 O            7 [3, 1]     
 8 group1 R            8 [3, 2]     
 9 group1 J            9 [3, 3]     
10 group2 S            1 [1, 1]     
11 group2 C            2 [1, 2]     
12 group2 S            3 [1, 3]     
13 group2 H            4 [1, 4]     
14 group2 O            5 [2, 1]     
15 group2 T            6 [2, 2]     
16 group2 Z            7 [2, 3]     
17 group2 D            8 [2, 4]     
18 group2 Y            9 [3, 1]     
19 group2 J           10 [3, 2]     
20 group2 V           11 [3, 3]     
21 group2 Z           12 [3, 4]     
22 group2 P           13 [4, 1]     
23 group2 L           14 [4, 2]     
24 group2 X           15 [4, 3]     
25 group2 D           16 [4, 4]     
26 group3 K            1 [1, 1]     
27 group3 M            2 [1, 2]     
28 group3 X            3 [1, 3]     
29 group3 E            4 [1, 4]     
30 group3 P            5 [2, 1]     
31 group3 U            6 [2, 2]     
32 group3 Z            7 [2, 3]     
33 group3 Z            8 [2, 4]     
34 group3 L            9 [3, 1]     
35 group3 W           10 [3, 2]     
36 group3 X           11 [3, 3]     
37 group3 F           12 [3, 4]     
38 group3 K           13 [4, 1]        

FYI, my real reference matrices would actually be 9 x 9 or 10 x 10.

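Subtract 1 from 'Position' before applying %/% and %%, then add 1 to each result. Shifting to a 0-based position first makes the multiples of the dimension fall at the end of a row instead of at the start of the next one.
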
library(dplyr)
out <- df %>% 
  group_by(col1) %>% 
  mutate(Position_ij = if (n() == 9) {
      paste0("[", ((Position-1) %/% 3) + 1, ", ", (Position-1) %% 3 + 1, "]")
    } else {
      paste0("[", ((Position-1) %/% 4) + 1, ", ", (Position-1) %% 4 + 1, "]")
      }) %>%
  ungroup

-output

> as.data.frame(out)
     col1 col2 Position Position_ij
1  group1    A        1      [1, 1]
2  group1    Q        2      [1, 2]
3  group1 <NA>        3      [1, 3]
4  group1    A        4      [2, 1]
5  group1    K        5      [2, 2]
6  group1    L        6      [2, 3]
7  group1    O        7      [3, 1]
8  group1    R        8      [3, 2]
9  group1    J        9      [3, 3]
10 group2    S        1      [1, 1]
11 group2    C        2      [1, 2]
12 group2    S        3      [1, 3]
13 group2    H        4      [1, 4]
14 group2    O        5      [2, 1]
15 group2    T        6      [2, 2]
16 group2    Z        7      [2, 3]
17 group2    D        8      [2, 4]
18 group2    Y        9      [3, 1]
19 group2    J       10      [3, 2]
20 group2    V       11      [3, 3]
21 group2    Z       12      [3, 4]
22 group2    P       13      [4, 1]
23 group2    L       14      [4, 2]
24 group2    X       15      [4, 3]
25 group2    D       16      [4, 4]
26 group3    K        1      [1, 1]
27 group3    M        2      [1, 2]
28 group3    X        3      [1, 3]
29 group3    E        4      [1, 4]
30 group3    P        5      [2, 1]
31 group3    U        6      [2, 2]
32 group3    Z        7      [2, 3]
33 group3    Z        8      [2, 4]
34 group3    L        9      [3, 1]
35 group3    W       10      [3, 2]
36 group3    X       11      [3, 3]
37 group3    F       12      [3, 4]
38 group3    K       13      [4, 1]
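
Since the real reference matrices are said to be 9 x 9 or 10 x 10, the hard-coded 3 and 4 could be replaced by a side length derived from the group size. This is only a sketch, under the assumption that each group should use the smallest square matrix that can hold all of its rows, i.e. ceiling(sqrt(n())). The toy data frame toy and the helper column side are made up for the example.

```r
library(dplyr)

# Toy data (hypothetical): one group of 9 rows and one group of 13 rows
toy <- data.frame(
  col1 = rep(c("g1", "g2"), times = c(9, 13)),
  Position = c(1:9, 1:13)
)

out_general <- toy %>%
  group_by(col1) %>%
  mutate(
    # Assumption: smallest square matrix that fits the whole group
    side = ceiling(sqrt(n())),
    # Shift to 0-based, take quotient/remainder, shift back to 1-based
    Position_ij = sprintf("[%d, %d]",
                          as.integer((Position - 1) %/% side + 1),
                          as.integer((Position - 1) %% side + 1))
  ) %>%
  ungroup()
```

With this rule a group of 9 rows gets side 3 and a group of 13 rows gets side 4, matching the desired output. Whether it is the right rule for other group sizes depends on how the real matrices are defined.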

Or use gl/rowid

library(data.table)
out2 <- df %>%
   group_by(col1) %>% 
   mutate(Position_ij = sprintf('[%d, %d]', 
      as.integer(gl(n(), c(4, 3)[1 + !n()%%3], n())),
      rowid(as.integer(gl(n(), c(4, 3)[1 + !n()%%3], n()))))) %>% 
   ungroup
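
To see what the gl/rowid version is doing, the pieces can be run on their own. A minimal sketch for a single group of 13 rows follows. The variable names n, width, row_idx, and col_idx are just for illustration.

```r
library(data.table)  # for rowid()

n <- 13L
# Width is 3 when n is a multiple of 3, otherwise 4 (same rule as above)
width <- c(4, 3)[1 + !n %% 3]

# gl() builds blocks of `width` repeated labels: 1 1 1 1 2 2 2 2 ...
row_idx <- as.integer(gl(n, width, n))
# rowid() counts occurrences within each label: 1 2 3 4 1 2 3 4 ...
col_idx <- rowid(row_idx)

sprintf("[%d, %d]", row_idx, col_idx)
```

Note that this version never looks at Position itself, so it relies on the rows of each group already being in Position order. The n %% 3 test also means that any group whose size is a multiple of 3 (for example 12) would be laid out 3 wide.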

-testing

> identical(out2, out)