Storing indices of unique strings/elements of a vector in MATLAB

Question

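I have three vectors A, B and C of the same size. A and B are numeric, while C is a cell array of strings. I want to generate sub-vectors of A based on the unique values of B and C. For example, let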

A = [0.45 0.89 0.12 0.35 0.40 0.93 0.12 0.35 0.72 0.59];
B = [1 1 3 1 8 1 8 8 1 1];
C = [{'Tom'}, {'Mary'}, {'Dick'}, {'Harry'}, {'Dick'}, {'Tom'}, {'Tom'}, {'Mary'}, {'Tom'}, {'Mary'}];

So I first tried MATLAB's unique function to find the unique values of B and C. The results are:

unique(B)
ans =
 1    3     8
unique(C)
ans = 
'Dick'    'Harry'    'Mary'    'Tom'

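Now I want to arrange vector A according to the unique values in B and C. Suppose the indices of the elements equal to the first unique value of B are [rowb1, colb1], those for the second unique value are [rowb2, colb2], and so on. Then, based on the unique values in B, I want to generate the following three vectors: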

A_B1 = A(rowb1, colb1);
A_B2 = A(rowb2, colb2);
A_B3 = A(rowb3, colb3);

Similarly, based on the unique values of C, I want to generate the following vectors:

A_C1 = A(rowc1, colc1);
A_C2 = A(rowc2, colc2);
A_C3 = A(rowc3, colc3);
A_C4 = A(rowc4, colc4);

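However, I could not find anything in the unique command that tells me the indices of the unique values in the vector, i.e. rowb1, colb1, and so on. Could someone point out how to obtain these indices? Any help would be greatly appreciated.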

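Edit 1: I should mention that I posted a general case. In my actual problem, A, B and C are matrices. I assume that if I can solve the problem for vectors, the extension to matrices should be similar.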

Edit 2: This is the answer I am expecting:

A_B1 = [0.45 0.89 0.35 0.93 0.72 0.59]
A_B2 = [0.12]
A_B3 = [0.40 0.12 0.35]

and

A_C1 = [0.45 0.93 0.12 0.72]
A_C2 = [0.89 0.35 0.59]
A_C3 = [0.12 0.40]
A_C4 = [0.35]

Answer 1

Let's consider the same example.

A = [0.45 0.89 0.12 0.35 0.40 0.93 0.12 0.35 0.72 0.59];
B = [1 1 3 1 8 1 8 8 1 1];
C = [{'Tom'}, {'Mary'}, {'Dick'}, {'Harry'}, {'Dick'}, {'Tom'}, {'Tom'}, {'Mary'}, {'Tom'}, {'Mary'}];

[uniqueB,indB,~]=unique(B);
[uniqueC,indC,~]=unique(C);
A_indB=A(indB); % This and the following line are what you want, I think.
A_indC=A(indC);

This is described in the documentation of unique.
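As a quick check of what the second output of unique provides, the following minimal sketch reuses the example data from the question. It illustrates that indB contains exactly one position per unique value (which occurrence is picked depends on the MATLAB release and the options passed), so A(indB) selects one element of A per unique value rather than every matching element. The variable names n_selected and n_ones are introduced here for illustration only.

```matlab
A = [0.45 0.89 0.12 0.35 0.40 0.93 0.12 0.35 0.72 0.59];
B = [1 1 3 1 8 1 8 8 1 1];

[uniqueB, indB] = unique(B);   % uniqueB = [1 3 8], one index per unique value
A_indB = A(indB);              % one element of A per unique value of B

n_selected = numel(A_indB);    % number of elements picked from A
n_ones     = nnz(B == 1);      % number of positions where B equals 1
```

Here n_selected is 3 while n_ones is 6, so this form alone does not produce the full sub-vectors listed in the question.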

Edit:

This is how I would handle matrices (both numeric and string data).

First, generate some matrices.

A1=[A;A(randperm(length(A)));A(randperm(length(A)))];
B1=[B;B(randperm(length(B)));B(randperm(length(B)))];
C1=[C;C(randperm(length(C)));C(randperm(length(C)))];

Now,

indB=arrayfun(@(x) bsxfun(@eq,B1,x),unique(B1),'uni',0); % bsxfun(@eq,...) only works for numeric arrays
indC=arrayfun(@(x) strcmp(C1,x),unique(C1),'uni',0);     % so use strcmp for the cell array of strings

and obtain the answer:

A_B1=A1(indB{1,1}); % the masks match the size of A1, so index A1 rather than A
A_B2=A1(indB{2,1});
A_C1=A1(indC{1,1});
A_C2=A1(indC{2,1}); % and so on
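To make the matrix case easier to verify by hand, here is a small deterministic sketch of the same idea. The 2-by-3 matrices below are made-up illustration data, not the asker's, and the names vals, masks and A_B are introduced for this example. For a scalar x, B1 == x gives the same logical mask as bsxfun(@eq,B1,x).

```matlab
% Hypothetical small matrices chosen so the result can be checked by hand
A1 = [0.1 0.2 0.3; 0.4 0.5 0.6];
B1 = [1 8 1; 3 8 1];

vals  = unique(B1);   % unique values of the matrix, here 1, 3 and 8
% One logical mask per unique value, each the same size as B1
masks = arrayfun(@(x) B1 == x, vals, 'UniformOutput', false);
% Apply each mask to A1; logical indexing into a matrix returns a
% column vector in column-major order
A_B   = cellfun(@(m) A1(m), masks, 'UniformOutput', false);
```

With this data, A_B{1} collects the entries of A1 where B1 equals 1, A_B{2} those where B1 equals 3, and A_B{3} those where B1 equals 8. Keeping the results in a cell array avoids creating one numbered variable per unique value.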

Answer 2

You can use the additional outputs of the unique function:

[ub ib jb]=unique(B);

where ub, ib and jb satisfy:

ub = unique(B);
B(ib) = ub;
ub(jb) = B;

You can use this to get your A_B values:

A_B=A(ib);

The same logic applies to C:

[cu ic jc]=unique(C);
A_C=A(ic);

since A, B and C are vectors of the same size.
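The relation ub(jb) = B means that jb labels every element of B with the position of its value in ub. As a minimal sketch (not part of the original answer), those labels can be used to collect all elements of A belonging to each unique value. The cell array name A_B is an assumption made for this example.

```matlab
A = [0.45 0.89 0.12 0.35 0.40 0.93 0.12 0.35 0.72 0.59];
B = [1 1 3 1 8 1 8 8 1 1];

[ub, ~, jb] = unique(B);   % ub = [1 3 8]; jb(k) is the group number of B(k)
jb = jb(:).';              % force a row so the mask has the same shape as A

% A_B{k} holds every element of A whose B value equals ub(k)
A_B = arrayfun(@(k) A(jb == k), 1:numel(ub), 'UniformOutput', false);
```

For the example data, A_B{1}, A_B{2} and A_B{3} correspond to the A_B1, A_B2 and A_B3 vectors listed in the question.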

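Edit

Since (rowb1, colb1), (rowb2, colb2) and (rowb3, colb3) are just placeholders, as explained in the comments, what you really want is the indices at which each unique value occurs, so that you can use those indices to pick out the values of A. To get the positions where each unique value occurs, you can use: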

>> idxA_B=bsxfun(@eq,B,unique(B)')

idxA_B =

     1     1     0     1     0     1     0     0     1     1
     0     0     1     0     0     0     0     0     0     0
     0     0     0     0     1     0     1     1     0     0

Your A_B values are therefore given by:

A_B1=A(idxA_B(1,:));
A_B2=A(idxA_B(2,:));
A_B3=A(idxA_B(3,:));

The same logic applies to C. However, as @Parag S. Chandakkar pointed out, @eq does not work on strings, so you need to obtain the indices for C as follows:

aux=cellfun(@(x) strcmp(x,unique(C)'),C,'UniformOutput',false);
idxA_C = [aux{:}];
A_C1=A(idxA_C(1,:));
A_C2=A(idxA_C(2,:));
A_C3=A(idxA_C(3,:));
A_C4=A(idxA_C(4,:));
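Note that unique(C) sorts the names ('Dick', 'Harry', 'Mary', 'Tom'), whereas the expected A_C1 to A_C4 in the question follow the order of first appearance (Tom, Mary, Dick, Harry). The following sketch is one possible way to reproduce that ordering, assuming a MATLAB release in which unique accepts the 'stable' option. The names namesC and A_C are introduced here for illustration.

```matlab
A = [0.45 0.89 0.12 0.35 0.40 0.93 0.12 0.35 0.72 0.59];
C = {'Tom','Mary','Dick','Harry','Dick','Tom','Tom','Mary','Tom','Mary'};

% 'stable' keeps the names in order of first appearance instead of sorting
namesC = unique(C, 'stable');

% A_C{k} holds every element of A whose name in C equals namesC{k}
A_C = cellfun(@(name) A(strcmp(C, name)), namesC, 'UniformOutput', false);
```

Dropping the 'stable' argument gives the same groups in sorted name order, matching the rows of idxA_C above.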
