SQL group-by statement with having-clause in Matlab
I have two tables in Matlab, 'Returns' and 'Yearly', that I want to merge according to the following SQL statement. How can I merge them in Matlab? (I have to use Matlab.)
select a.*, b.Equity, b.Date as Yearly_date from Returns a, Yearly b where a.Id = b.Id and a.Date >= b.Date group by a.Id, a.Date having max(b.Date) = b.Date
Here is some sample data:
Returns = table([repmat(1,5,1);repmat(2,6,1)],[(datetime(2013,10,31):calmonths(1):datetime(2014,2,28)).';(datetime(2013,10,31):calmonths(1):datetime(2014,3,31)).'],randn(11,1),'VariableNames',{'Id','Date','Return'})
Returns =
Id Date Return
__ ___________ ________
1 31-Oct-2013 -0.8095
1 30-Nov-2013 -2.9443
1 31-Dec-2013 1.4384
1 31-Jan-2014 0.32519
1 28-Feb-2014 -0.75493
2 31-Oct-2013 1.3703
2 30-Nov-2013 -1.7115
2 31-Dec-2013 -0.10224
2 31-Jan-2014 -0.24145
2 28-Feb-2014 0.31921
2 31-Mar-2014 0.31286
Yearly = table([repmat(1,3,1);repmat(2,2,1)],[(datetime(2011,12,31):calyears(1):datetime(2013,12,31)).';(datetime(2012,12,31):calyears(1):datetime(2013,12,31)).'],[8;10;11;30;28],'VariableNames',{'Id','Date','Equity'})
Yearly =
Id Date Equity
__ ___________ ______
1 31-Dec-2011 8
1 31-Dec-2012 10
1 31-Dec-2013 11
2 31-Dec-2012 30
2 31-Dec-2013 28
I want the following output:
ans =
Id Date Return Equity Yearly_date
__ ___________ __________ ______ ___________
1 31-Oct-2013 -0.86488 10 31-Dec-2012
1 30-Nov-2013 -0.030051 10 31-Dec-2012
1 31-Dec-2013 -0.16488 11 31-Dec-2013
1 31-Jan-2014 0.62771 11 31-Dec-2013
1 28-Feb-2014 1.0933 11 31-Dec-2013
2 31-Oct-2013 1.1093 30 31-Dec-2012
2 30-Nov-2013 -0.86365 30 31-Dec-2012
2 31-Dec-2013 0.077359 28 31-Dec-2013
2 31-Jan-2014 -1.2141 28 31-Dec-2013
2 28-Feb-2014 -1.1135 28 31-Dec-2013
2 31-Mar-2014 -0.0068493 28 31-Dec-2013
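For reference, the query can be read as a per-row lookup: for every row of Returns, take the row of Yearly with the same Id and the latest Date that is not after the Returns date. Below is a minimal loop-based sketch of that reading on a small made-up subset of the data. It assumes every Returns row has at least one matching Yearly row, and the names idx, cand and pos are only illustrative.

```matlab
% Small illustrative tables (values are made up for this sketch)
Returns = table([1;1;2], datetime(2013,[10;12;12],31), [0.1;0.2;0.3], ...
    'VariableNames', {'Id','Date','Return'});
Yearly = table([1;1;2], datetime([2012;2013;2013],12,31), [10;11;28], ...
    'VariableNames', {'Id','Date','Equity'});

idx = zeros(height(Returns),1);   % row of Yearly chosen for each Returns row
for k = 1:height(Returns)
    % Candidate rows: same Id and Yearly date not after the Returns date
    cand = find(Yearly.Id == Returns.Id(k) & Yearly.Date <= Returns.Date(k));
    % Assumption: at least one candidate exists for every row
    assert(~isempty(cand), 'No matching Yearly row for Returns row %d', k);
    % Among the candidates, keep the one with the latest date
    [~, pos] = max(Yearly.Date(cand));
    idx(k) = cand(pos);
end

% Append the looked-up columns under the names used in the SQL statement
out = [Returns table(Yearly.Equity(idx), Yearly.Date(idx), ...
    'VariableNames', {'Equity','Yearly_date'})]
```

The loop is slow for large tables, but it is easy to check by eye and can serve as a cross-check for a vectorized version.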
Here is another solution based on bsxfun, abusing its masking capability:
%// Inputs
Returns = table([repmat(1,5,1);repmat(2,6,1)],[(datetime(2013,10,31):...
calmonths(1):datetime(2014,2,28)).';(datetime(2013,10,31):calmonths(1):...
datetime(2014,3,31)).'],randn(11,1),'VariableNames',{'Id','Date','Return'})
Yearly = table([repmat(1,3,1);repmat(2,2,1)],[(datetime(2011,12,31):...
calyears(1):datetime(2013,12,31)).';(datetime(2012,12,31):calyears(1):...
datetime(2013,12,31)).'],[8;10;11;30;28],'VariableNames',{'Id','Date','Equity'})
%// Get mask of date matches for each row in Returns against each row in Yearly
matches = bsxfun(@ge,datenum(Returns.Date),datenum(Yearly.Date)'); %//'
%// Keep only the matches where the Ids are equal (a.Id = b.Id in the SQL)
matches(~bsxfun(@eq,Returns.Id,Yearly.Id'))=0; %//'
%// Get the ID (column-ID) of the last match for each row in Returns
[~,flipped_col_ID] = max(matches(:,end:-1:1),[],2);
col_ID = size(matches,2) - flipped_col_ID + 1;
%// Select the rows from Yearly based on col IDs and create the output table
out = [Returns table(Yearly.Equity(col_ID), Yearly.Date(col_ID))]
Sample run:
Returns =
Id Date Return
__ ___________ ________
1 31-Oct-2013 0.045158
1 30-Nov-2013 0.071319
1 31-Dec-2013 0.52357
1 31-Jan-2014 -0.65424
1 28-Feb-2014 1.8452
2 31-Oct-2013 0.037262
2 30-Nov-2013 0.38369
2 31-Dec-2013 1.1972
2 31-Jan-2014 -0.54708
2 28-Feb-2014 -0.15706
2 31-Mar-2014 0.11882
Yearly =
Id Date Equity
__ ___________ ______
1 31-Dec-2011 8
1 31-Dec-2012 10
1 31-Dec-2013 11
2 31-Dec-2012 30
2 31-Dec-2013 28
out =
Id Date Return Var1 Var2
__ ___________ ________ ____ ___________
1 31-Oct-2013 0.045158 10 31-Dec-2012
1 30-Nov-2013 0.071319 10 31-Dec-2012
1 31-Dec-2013 0.52357 11 31-Dec-2013
1 31-Jan-2014 -0.65424 11 31-Dec-2013
1 28-Feb-2014 1.8452 11 31-Dec-2013
2 31-Oct-2013 0.037262 30 31-Dec-2012
2 30-Nov-2013 0.38369 30 31-Dec-2012
2 31-Dec-2013 1.1972 28 31-Dec-2013
2 31-Jan-2014 -0.54708 28 31-Dec-2013
2 28-Feb-2014 -0.15706 28 31-Dec-2013
2 31-Mar-2014 0.11882 28 31-Dec-2013
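The key step in the code above is finding the last true column in each row of the mask. max returns the index of the first maximum it meets, so the columns are flipped before the call and the index is flipped back afterwards. A small sketch with a hand-written mask (the mask values and the name hasMatch are made up for illustration):

```matlab
% Hand-written mask: rows stand for Returns rows, columns for Yearly rows
matches = logical([1 1 0; 1 1 1; 1 0 0]);

% max returns the index of the FIRST maximum in each row, so flipping the
% columns turns "first true" into "last true"
[~, flipped_col_ID] = max(matches(:,end:-1:1), [], 2);
col_ID = size(matches,2) - flipped_col_ID + 1;   % gives [2; 3; 1]

% A row without any true entry would silently map to the last column,
% so it is worth checking for such rows separately
hasMatch = any(matches, 2);
```

With the sample data every row of Returns has a match, so the check is not needed there, but it matters when a Returns date can be earlier than all Yearly dates for its Id.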
General case solution
For cases where the Ids may be non-numeric and the dates are not already sorted, you can try the following code:
%// Inputs
Returns = table([repmat('Id1',5,1);repmat('Id2',6,1)],[(datetime(2013,10,31):...
calmonths(1):datetime(2014,2,28)).';(datetime(2013,10,31):calmonths(1):...
datetime(2014,3,31)).'],randn(11,1),'VariableNames',{'Id','Date','Return'})
Yearly = table([repmat('Id1',3,1);repmat('Id2',2,1)],[(datetime(2011,12,31):...
calyears(1):datetime(2013,12,31)).';(datetime(2012,12,31):calyears(1):...
datetime(2013,12,31)).'],[8;10;11;30;28],'VariableNames',{'Id','Date','Equity'})
%// -- Convert string-based Ids into numeric ones
alltypes = cellstr([Returns.Id ; Yearly.Id]);
[~,~,IDs] = unique(alltypes,'stable');
lbls_len = size(Returns.Id,1);
Returns_Id = IDs(1:lbls_len);
Yearly_Id = IDs(lbls_len+1:end);
%// Get Returns and Yearly Dates
Returns_Date = datenum(Returns.Date);
Yearly_Date = datenum(Yearly.Date);
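%// Sort the dates within each Id if not already sorted
%// (caveat: col_ID below indexes the original, unsorted tables, so the output
%// is only aligned when the rows are already grouped by Id and date-sorted)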
y1 = arrayfun(@(n) sort(Returns_Date(Returns_Id==n)),1:max(Returns_Id),'Uni',0);
Returns_Date = vertcat(y1{:});
y2 = arrayfun(@(n) sort(Yearly_Date(Yearly_Id==n)),1:max(Yearly_Id),'Uni',0);
Yearly_Date = vertcat(y2{:});
%// Counts of Ids to be used as boundaries when saving output at each
%// iteration corresponding to each ID
Yearly_Id_counts = [0 ; histc(Yearly_Id,1:max(Yearly_Id))];
Returns_Id_counts = histc(Returns_Id,1:max(Returns_Id));
%// Initializations
stop = 0;
col_ID = zeros(size(Returns_Date,1),1);
for iter = 1:max(Returns_Id)
%// Get mask of matches for each ID in Returns against each ID in Yearly
matches = bsxfun(@ge,Returns_Date(Returns_Id==iter),...
Yearly_Date(Yearly_Id==iter)'); %//'
%// Get the ID (column -ID) of the last match for each Id in Returns
[~,flipped_col_ID] = max(matches(:,end:-1:1),[],2);
%// Get start and stop for indexing into output column IDs array
start = stop + 1;
stop = start + Returns_Id_counts(iter) - 1;
%// Get the column IDs to be used for indexing into Yearly data for
%// getting the final output; the offset is the cumulative count of Yearly
%// rows up to and including the current Id, which also holds for 3+ Ids
col_ID(start:stop) = sum(Yearly_Id_counts(1:iter+1)) - flipped_col_ID + 1;
end
%// Select the rows from Yearly based on col IDs and create the output table
out = [Returns table(Yearly.Equity(col_ID), Yearly.Date(col_ID))]
Sample run:
Returns =
Id Date Return
___ ___________ ________
Id1 31-Oct-2013 0.53767
Id1 30-Nov-2013 1.8339
Id1 31-Dec-2013 -2.2588
Id1 31-Jan-2014 0.86217
Id1 28-Feb-2014 0.31877
Id2 31-Oct-2013 -1.3077
Id2 30-Nov-2013 -0.43359
Id2 31-Dec-2013 0.34262
Id2 31-Jan-2014 3.5784
Id2 28-Feb-2014 2.7694
Id2 31-Mar-2014 -1.3499
Yearly =
Id Date Equity
___ ___________ ______
Id1 31-Dec-2011 8
Id1 31-Dec-2012 10
Id1 31-Dec-2013 11
Id2 31-Dec-2012 30
Id2 31-Dec-2013 28
out =
Id Date Return Var1 Var2
___ ___________ ________ ____ ___________
Id1 31-Oct-2013 0.53767 10 31-Dec-2012
Id1 30-Nov-2013 1.8339 10 31-Dec-2012
Id1 31-Dec-2013 -2.2588 11 31-Dec-2013
Id1 31-Jan-2014 0.86217 11 31-Dec-2013
Id1 28-Feb-2014 0.31877 11 31-Dec-2013
Id2 31-Oct-2013 -1.3077 30 31-Dec-2012
Id2 30-Nov-2013 -0.43359 30 31-Dec-2012
Id2 31-Dec-2013 0.34262 28 31-Dec-2013
Id2 31-Jan-2014 3.5784 28 31-Dec-2013
Id2 28-Feb-2014 2.7694 28 31-Dec-2013
Id2 31-Mar-2014 -1.3499 28 31-Dec-2013
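The general-case code sorts the dates per Id but then indexes back into the original tables, so it relies on the rows already being grouped and ordered. Below is a sketch of a variant that keeps track of the sort permutation, so that text Ids and rows in arbitrary order both map back to the original rows. The sample values and the names rId, yId, order and sortedCol are made up for illustration, and it assumes every Returns row has at least one matching Yearly row.

```matlab
% Illustrative inputs: text Ids, rows neither grouped nor date-sorted
Returns = table({'B';'A';'A'}, datetime(2014,[1;1;3],31), [0.1;0.2;0.3], ...
    'VariableNames', {'Id','Date','Return'});
Yearly = table({'A';'B';'A'}, datetime([2013;2013;2012],12,31), [11;28;10], ...
    'VariableNames', {'Id','Date','Equity'});

% Map the Ids of both tables onto one shared set of integer codes
[~,~,codes] = unique([Returns.Id; Yearly.Id]);
rId = codes(1:height(Returns));
yId = codes(height(Returns)+1:end);

% Sort Yearly by (Id code, date) and remember the original row numbers
[~, order] = sortrows([yId datenum(Yearly.Date)]);

% Mask: Yearly date not after the Returns date AND same Id
matches = bsxfun(@ge, datenum(Returns.Date), datenum(Yearly.Date(order)).') & ...
          bsxfun(@eq, rId, yId(order).');
assert(all(any(matches,2)), 'Some Returns rows have no matching Yearly row');

% Last true column per row (in sorted order), then back to original rows
[~, flipped] = max(matches(:,end:-1:1), [], 2);
sortedCol = size(matches,2) - flipped + 1;
col_ID = order(sortedCol);

% Name the appended columns as in the SQL statement instead of Var1/Var2
out = [Returns table(Yearly.Equity(col_ID), Yearly.Date(col_ID), ...
    'VariableNames', {'Equity','Yearly_date'})]
```

Because the mask covers all rows at once, it needs memory proportional to height(Returns) times height(Yearly); for very large tables a per-Id loop may be the safer choice.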