Unify timestamps as date strings

Question

MATLAB R2015b

I have a table in which two columns of each row contain date strings and time strings in various formats:

11.01.2016 | 00:00:00 | data

10/19/16 | 05:29:00 | data

12.02.16 | 06:40 | data

I want to convert these two columns into a single column with a common format:

31.12.2017 14:00:00

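My current solution loops over every row, combines the two columns into one string, checks the various formats so that datetime can be called with the appropriate format string, and then calls datestr with the desired format string. datetime cannot determine the format of the input string automatically.

As you can imagine, this is very slow for large tables (about 50000 rows).

Is there a faster solution?

Thanks in advance.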

Answer 1

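I tried to vectorize the code. The trick is to

convert table > cell > char array, then
manipulate the char strings, then
convert back from char array > cell > table

There is also one important point: all shorter cells have to be padded with 'null' characters, in a vectorized way. Otherwise the conversion from cell > char array is not possible. Here is the code.

clc
clear all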

%% create Table T
d={'11.01.2016';
   '10/19/16';
   '12.02.16'};

t={'00:00:00';
  '05:29:00';
  '06:40'};
dat=[123;
    456;
    789];

T = table(d,t,dat);

%% deal with dates in Table T
% separate date column and convert to cell
dd = table2cell(T(:,1));
% equalize the lengths of all elements of cell
% by padding 'null' in end of shorter dates
nmax=max(cellfun(@numel,dd));
func = @(x) [x,zeros(1,nmax-numel(x))];
temp1 = cellfun(func,dd,'UniformOutput',false);
% convert to array for vectorized manipulation of char strings
ddd=cell2mat(temp1);
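% rows that use '/' are assumed to be mm/dd/yy (as in '10/19/16'):
% swap month and day there so that every row reads dd.mm
slash_idx = ddd(:,3) == '/';
ddd(slash_idx,[1 2 4 5]) = ddd(slash_idx,[4 5 1 2]);
% replace the separators in 3rd and 6th location with '.' (period)
ddd(:,[3 6]) = repmat(['.' '.'], length(dd),1);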
% find indexes of shorter dates 
short_year_idx = find(uint16(ddd(:,nmax)) == 0);
% find the year value for those short_year cases
yy = ddd(short_year_idx,[7 8]);
% replace null chars with '20XX' string in desired place
ddd(short_year_idx,7:nmax) = ...
    [repmat('20',size(short_year_idx,1),1) yy];
% convert char array back to cell and replace in table
dddd = mat2cell(ddd,ones(1,size(d,1)),nmax);
T(:,1) = table(dddd);

%% deal with times in Table T
% separate time column and convert to cell
tt = table2cell(T(:,2));
% equalize the lengths of all elements of cell
% by padding 'null' in end of shorter times
nmax=max(cellfun(@numel,tt));
func = @(x) [x,zeros(1,nmax-numel(x))];
temp1 = cellfun(func,tt,'UniformOutput',false);
% convert to array for vectorized manipulation of char strings
ttt=cell2mat(temp1);
% find indexes of shorter times (assuming only ':00' in end is missing)
short_time_idx = find(uint16(ttt(:,nmax)) == 0);% dirty hack, as null=0 in ascii
% replace null chars with ':00' string
ttt(short_time_idx,[6 7 8]) = repmat(':00',size(short_time_idx,1),1);
% convert char array back to cell and replace in table
tttt = mat2cell(ttt,ones(1,size(t,1)),nmax);
T(:,2) = table(tttt);
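
The code above leaves the cleaned dates and times in two separate columns, while the question asks for a single one. A minimal sketch of that last step, assuming the columns are still named d and t and already hold strings such as '11.01.2016' and '00:00:00' (the sample values and the column name timestamp are made up for illustration):

```matlab
% Small stand-in for the cleaned table (hypothetical sample values)
d   = {'11.01.2016'; '12.02.2016'};
t   = {'00:00:00';   '06:40:00'};
dat = [123; 789];
T   = table(d, t, dat);

% Join date and time with a single space; {' '} keeps the blank,
% because strcat preserves whitespace for cell array inputs
T.timestamp = strcat(T.d, {' '}, T.t);

% Drop the two old columns and move the new one to the front
T(:, {'d', 't'}) = [];
T = T(:, {'timestamp', 'dat'});
```

strcat works on the whole column at once, so no loop over the rows is needed.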

Answer 2

If you call the cell arrays of the two columns c1 and c2, then something like this should work:

c = datestr(datenum(strcat(c1,{' '},c2)), 'dd.mm.yyyy HH:MM:SS')

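Then you need to remove the old columns and put this c in their place. Internally, however, datenum must be doing something similar to what you are doing, so I am not sure whether this will be faster. I suspect it will be, because (we hope) the standard functions are optimized.

If your table does not store these columns as cell arrays, you may need a pre-processing step to form the cell arrays for strcat.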
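
As far as I know, datenum expects all strings passed in one call to share the same format, so with mixed input it may be safer to call it once per format rather than once for everything. A rough sketch of that idea, assuming only the three date layouts from the question occur and that short times only lack the trailing ':00' (the patterns, the format list and the variable names are illustrative assumptions, not part of the answer above):

```matlab
% Sample data from the question
c1 = {'11.01.2016'; '10/19/16'; '12.02.16'};
c2 = {'00:00:00';   '05:29:00'; '06:40'};

% Append ':00' to times given as HH:MM (assumed to be the only short form)
short = cellfun(@numel, c2) == 5;
c2(short) = strcat(c2(short), ':00');

% One combined string per row
s = strcat(c1, {' '}, c2);

% One regular expression and one datenum format per known layout
pats = {'^\d{2}\.\d{2}\.\d{4} ', '^\d{2}/\d{2}/\d{2} ', '^\d{2}\.\d{2}\.\d{2} '};
fmts = {'dd.mm.yyyy HH:MM:SS',   'mm/dd/yy HH:MM:SS',   'dd.mm.yy HH:MM:SS'};

n = nan(size(s));
for k = 1:numel(pats)   % loop over the formats, not over the rows
    hit = ~cellfun(@isempty, regexp(s, pats{k}, 'once'));
    if any(hit)
        n(hit) = datenum(s(hit), fmts{k});
    end
end

% Back to strings in the common target format, one cell per row
c = cellstr(datestr(n, 'dd.mm.yyyy HH:MM:SS'));
```

Rows that match none of the patterns stay NaN in n, so it is worth checking any(isnan(n)) before the final datestr call. Two-digit years are resolved through datenum's default pivot year.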


matlab

datetime

matlab-table