Return table 中每个子组的最大值
Return the maximum value of every subgroup in a table
我有一段 table 是这样的:
X Y Value
__ __ __
1 2 6.9
1 3 6.8
1 4 8.1
2 1 7.2
2 3 11.7
2 4 16
3 1 22.6
3 2 20.5
3 3 18.1
… … …
对于具有相同 X
的每组行,我只需要 select 具有最大 Value
的行。我怎样才能生成这样的 table?
X Y Value
__ __ __
1 4 8.1
2 4 16
3 1 22.6
到目前为止我的代码只产生一行:
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
T = table(X,Y,Value);
[~,maxidx] = max(Value);
T(maxidx,:)
%{
ans =
1×3 table
X Y Value
_ _ _____
3 1 22.6
%}
如果您使用的是 R2015b 或更新版本,您可以使用 splitapply
:
function T2 = q56413455()
% Define some example inputs:
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
T = table(X,Y,Value);
% Call the "business logic" and assign the output:
T2 = getMaxRows(T);
function out = getMaxRows(T)
GROUPING_VAR = 1; % We assume that the 1st column contains the grouping variable
varnames = T.Properties.VariableNames;
tmp = splitapply(@maxrow, T, T.(varnames{ GROUPING_VAR }));
out = array2table(tmp, 'VariableNames', varnames );
function outrow = maxrow(varargin)
COL_WITH_MAX = 3; % We assume that the 3rd columns is the one that matters for max()
subArr = cell2mat(varargin);
[~,I] = max(subArr(:,COL_WITH_MAX));
outrow = subArr(I,:);
调用它会产生所需的结果:
>> q56413455
ans =
3×3 table
X Y Value
_ _ _____
1 4 8.1
2 4 16
3 1 22.6
另一种变体使用 splitapply(@max, ...)
的 2nd 输出,这是组内最大值的索引。然后我们需要将前面组中的元素数量添加到此(这是使用 diff
完成的):
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
T = table(X,Y,Value);
% Get the position of the maximum Value in every group
[~,I] = splitapply(@max, T.Value, T.X); % I == [3; 3; 1]
% Get beginnings of every group
lastGroupEnd = find([1; diff(X)])-1; % lastGroupEnd == [0; 3; 6]
% Offset the maximum positions by group starts to get row indices in the original table
T2 = T(I + lastGroupEnd, :);
使用unique和cumsum
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
values = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
T = table(X,Y,values);
% Find number of categorical X and corresponding category X
[count,category]=hist(X,unique(X));
% Starting offset index of each category, X = 2, offset is 3, X = 3, offset is 6
location = cumsum(count);
maxidx = zeros(1,numel(category));
for i = 1:numel(category)
[~,maxidx(i)] = max(T(T.X == category(i) , :).values);
if i == 1
% First category, no need offset
else
% Locate max index in whole table by adding offset to the initial index
maxidx(i) = maxidx(i) + location(i-1);
end
end
T(maxidx,:)
%{
ans =
3×3 table
X Y values
_ _ ______
1 4 8.1
2 4 16
3 1 22.6
%}
您可以在唯一 X
值上使用循环来做到这一点:
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
uniqueX = unique(X); % Get 'X' unique values
maxidx = zeros(size(uniqueX));
for i = 1:length(uniqueX)
xind = find(X == uniqueX(i)); % Find all indices of a unique 'X' value
[~,vind] = max(Value(xind)); % Find index of max Value in 'xind' group
maxidx(i) = xind(vind); % Get the index in the original group
end
T(maxidx,:)
输出:
ans =
3×3 table
X Y Value
_ _ _____
1 4 8.1
2 4 16
3 1 22.6
我有一段 table 是这样的:
X Y Value
__ __ __
1 2 6.9
1 3 6.8
1 4 8.1
2 1 7.2
2 3 11.7
2 4 16
3 1 22.6
3 2 20.5
3 3 18.1
… … …
对于具有相同 X
的每组行,我只需要 select 具有最大 Value
的行。我怎样才能生成这样的 table?
X Y Value
__ __ __
1 4 8.1
2 4 16
3 1 22.6
到目前为止我的代码只产生一行:
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
T = table(X,Y,Value);
[~,maxidx] = max(Value);
T(maxidx,:)
%{
ans =
1×3 table
X Y Value
_ _ _____
3 1 22.6
%}
如果您使用的是 R2015b 或更新版本,您可以使用 splitapply
:
function T2 = q56413455()
% Define some example inputs:
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
T = table(X,Y,Value);
% Call the "business logic" and assign the output:
T2 = getMaxRows(T);
function out = getMaxRows(T)
GROUPING_VAR = 1; % We assume that the 1st column contains the grouping variable
varnames = T.Properties.VariableNames;
tmp = splitapply(@maxrow, T, T.(varnames{ GROUPING_VAR }));
out = array2table(tmp, 'VariableNames', varnames );
function outrow = maxrow(varargin)
COL_WITH_MAX = 3; % We assume that the 3rd columns is the one that matters for max()
subArr = cell2mat(varargin);
[~,I] = max(subArr(:,COL_WITH_MAX));
outrow = subArr(I,:);
调用它会产生所需的结果:
>> q56413455
ans =
3×3 table
X Y Value
_ _ _____
1 4 8.1
2 4 16
3 1 22.6
另一种变体使用 splitapply(@max, ...)
的 2nd 输出,这是组内最大值的索引。然后我们需要将前面组中的元素数量添加到此(这是使用 diff
完成的):
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
T = table(X,Y,Value);
% Get the position of the maximum Value in every group
[~,I] = splitapply(@max, T.Value, T.X); % I == [3; 3; 1]
% Get beginnings of every group
lastGroupEnd = find([1; diff(X)])-1; % lastGroupEnd == [0; 3; 6]
% Offset the maximum positions by group starts to get row indices in the original table
T2 = T(I + lastGroupEnd, :);
使用unique和cumsum
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
values = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
T = table(X,Y,values);
% Find number of categorical X and corresponding category X
[count,category]=hist(X,unique(X));
% Starting offset index of each category, X = 2, offset is 3, X = 3, offset is 6
location = cumsum(count);
maxidx = zeros(1,numel(category));
for i = 1:numel(category)
[~,maxidx(i)] = max(T(T.X == category(i) , :).values);
if i == 1
% First category, no need offset
else
% Locate max index in whole table by adding offset to the initial index
maxidx(i) = maxidx(i) + location(i-1);
end
end
T(maxidx,:)
%{
ans =
3×3 table
X Y values
_ _ ______
1 4 8.1
2 4 16
3 1 22.6
%}
您可以在唯一 X
值上使用循环来做到这一点:
X = [1; 1; 1; 2; 2; 2; 3; 3; 3];
Y = [2; 3; 4; 1; 3; 4; 1; 2; 4];
Value = [6.9; 6.8; 8.1;7.2;11.7;16;22.6;20.5;18.1];
uniqueX = unique(X); % Get 'X' unique values
maxidx = zeros(size(uniqueX));
for i = 1:length(uniqueX)
xind = find(X == uniqueX(i)); % Find all indices of a unique 'X' value
[~,vind] = max(Value(xind)); % Find index of max Value in 'xind' group
maxidx(i) = xind(vind); % Get the index in the original group
end
T(maxidx,:)
输出:
ans =
3×3 table
X Y Value
_ _ _____
1 4 8.1
2 4 16
3 1 22.6