Extracting data (specific words) from a text file using MATLAB
I am trying to extract some specific information from a text file, but my code does not produce the result I need. A sample of my file:
2017-10-02T15:29:47.18Z 'I|PSnd: 61|snd[3D]:FFFF m:0x6564 e:0'
2017-10-02T15:29:47.18Z 'I|PSnd: 233|sD[3D]m:0x6564 e:0'
2017-10-02T15:29:47.18Z 'D|Beat:1234|WDTimeout: 300'
2017-10-02T15:29:47.18Z 'D|Beat:1256|sd:0x6564: e:0'
2017-10-02T15:29:47.18Z 'D|Beat:1276|sprts'
2017-10-02T15:29:47.18Z 'D|Beat:5460|GetPckt:0x3901'
2017-10-02T15:29:47.18Z 'D|Beat:7085|Prtns->'
2017-10-02T15:29:47.18Z 'D|Beat:1975|sevt:72'
2017-10-02T15:29:47.18Z 'D|Beat:1780|snd:0x3901'
2017-10-02T15:29:47.18Z 'I|PSnd: 61|snd[B0]:FFFF m:0x3901 e:0'
2017-10-02T15:29:47.18Z 'I|PSnd: 233|sD[B0]m:0x3901 e:0'
2017-10-02T15:29:47.18Z 'D|Beat:1833|sd:0x3901:0'
2017-10-02T15:29:47.18Z 'D|Beat:1200|Rcv<-RP, s:1402'
2017-10-02T15:29:47.18Z 'D|Beat:1220|FrMsg:0x467b QMsg:0x5840'
2017-10-02T15:29:47.18Z 'I|Beat:13031|n:1402 rssi:-91, lqi:255, q:61'
2017-10-02T15:29:47.18Z 'D|Beat:8868|sameRP'
2017-10-02T15:29:47.18Z 'D|Beat:5460|GetPckt:0x41a1'
2017-10-02T15:29:47.18Z 'D|Beat:1975|sevt:40'
2017-10-02T15:29:47.22Z 'D|Beat:13282|PR->:1402 LRPID:C1402'
2017-10-02T15:29:47.22Z 'D|Beat:1780|snd:0x41a1'
2017-10-02T15:29:47.22Z 'D|Beat:1791|evtT:3498847'
2017-10-02T15:29:47.22Z 'I|PSnd: 61|snd[3D]:1402 m:0x41a1 e:0'
2017-10-02T15:29:47.22Z 'I|PSnd: 233|sD[3D]m:0x41a1 e:0'
2017-10-02T15:29:47.22Z 'D|Beat:1234|WDTimeout: 300'
2017-10-02T15:29:47.22Z 'D|Beat:1256|sd:0x41a1: e:0'
2017-10-02T15:29:47.22Z 'D|Beat:1200|Rcv<-RP, s:1202'
2017-10-02T15:29:47.22Z 'D|Beat:1220|FrMsg:0x502a QMsg:0x3eef'
2017-10-02T15:29:47.22Z 'I|Beat:13031|n:1202 rssi:-94, lqi:255, q:60'
2017-10-02T15:29:47.22Z 'D|Beat:8868|sameRP'
2017-10-02T15:29:47.22Z 'D|Beat:5460|GetPckt:0x51c8'
2017-10-02T15:29:47.22Z 'D|Beat:1975|sevt:40'
2017-10-02T15:29:47.22Z 'D|Beat:13282|PR->:1202 LRPID:61202'
2017-10-02T15:29:47.22Z 'D|Beat:1780|snd:0x51c8'
2017-10-02T15:29:47.22Z 'D|Beat:1791|evtT:3498847'
2017-10-02T15:29:47.22Z 'I|PSnd: 61|snd[3D]:1202 m:0x51c8 e:0'
2017-10-02T15:29:47.24Z 'I|PSnd: 233|sD[3D]m:0x51c8 e:0'
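From the file above, I want to extract every line that contains 'sD', but only when the preceding line contains 'snd'. I would like the date and the bracketed value (e.g. [3D]) in separate output columns, and ideally all of the extracted lines in a separate array.
What I did:
I tried using PSnd as the query string, as shown in the script below: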
queryline = 'PSnd';
fID = fopen('log1.txt');
C = textscan(fID,'%s','delimiter','\n');
fclose(fID);
C = C{1};
[temp,matchedLines] = regexp(C,['(?<date>^[0-9,-:T]*)Z.*' queryline ':(?<Num>[0-9A-Z|A-Z[0-9A-Z:]]*)'] ,'tokens','match');
matchedLines = [matchedLines{:}]';
temp = [temp{:}];
temp = reshape([temp{:}],2,[])';
outTime = datetime(temp(:,1),'InputFormat','yyyy-MM-dd''T''HH:mm:ss.SSS');
[h,m,s]= hms(outTime);
time = {h; m; s};
time_in_hrs = [time{:}];
t = [time{1:3}];
nodes_in_clus = temp(:,2);
I got some very strange results that I do not really understand. My initial error was:
Error using datetime (line 556)
Numeric input data must be a matrix with three or six columns, or else three or six separate numeric arrays. You can also create datetimes from a single numeric array using the
'ConvertFrom' parameter.
Error in get_cluster (line 10)
outTime2= datetime(temp2(:,1), 'InputFormat','yyyy-MM-dd''T''HH:mm:ss.SSS');
But after making some changes, I got this result:
'2017-10-02T23:58:26.62Z 'I|PSnd:'
'2017-10-02T23:58:26.77Z 'I|PSnd:'
'2017-10-02T23:58:26.77Z 'I|PSnd:'
'2017-10-02T23:58:26.91Z 'I|PSnd:'
'2017-10-02T23:58:26.91Z 'I|PSnd:'
'2017-10-02T23:58:27.06Z 'I|PSnd:'
'2017-10-02T23:58:27.06Z 'I|PSnd:'
'2017-10-02T23:58:27.20Z 'I|PSnd:'
'2017-10-02T23:58:27.20Z 'I|PSnd:'
'2017-10-02T23:58:27.35Z 'I|PSnd:'
'2017-10-02T23:58:27.35Z 'I|PSnd:'
'2017-10-02T23:58:27.49Z 'I|PSnd:'
'2017-10-02T23:58:27.49Z 'I|PSnd:'
'2017-10-02T23:58:27.64Z 'I|PSnd:'
'2017-10-02T23:58:27.64Z 'I|PSnd:'
'2017-10-02T23:58:27.79Z 'I|PSnd:'
'2017-10-02T23:58:27.79Z 'I|PSnd:'
'2017-10-02T23:58:27.93Z 'I|PSnd:'
'2017-10-02T23:58:27.93Z 'I|PSnd:'
'2017-10-02T23:58:28.06Z 'I|PSnd:'
'2017-10-02T23:58:28.06Z 'I|PSnd:'
'2017-10-02T23:58:28.21Z 'I|PSnd:'
'2017-10-02T23:58:28.21Z 'I|PSnd:'
'2017-10-02T23:58:28.36Z 'I|PSnd:'
'2017-10-02T23:58:28.36Z 'I|PSnd:'
'2017-10-02T23:58:28.51Z 'I|PSnd:'
'2017-10-02T23:58:28.51Z 'I|PSnd:'
'2017-10-02T23:58:28.65Z 'I|PSnd:'
'2017-10-02T23:58:28.65Z 'I|PSnd:'
'2017-10-02T23:58:28.79Z 'I|PSnd:'
'2017-10-02T23:58:28.79Z 'I|PSnd:'
'2017-10-02T23:58:28.94Z 'I|PSnd:'
'2017-10-02T23:58:28.94Z 'I|PSnd:'
'2017-10-02T23:58:40.39Z 'I|PSnd:'
'2017-10-02T23:58:40.39Z 'I|PSnd:'
'2017-10-02T23:58:40.39Z 'I|PSnd:'
'2017-10-02T23:58:40.39Z 'I|PSnd:'
'2017-10-02T23:58:51.76Z 'I|PSnd:'
'2017-10-02T23:58:51.76Z 'I|PSnd:'
'2017-10-02T23:58:51.76Z 'I|PSnd:'
'2017-10-02T23:58:51.87Z 'I|PSnd:'
'2017-10-02T23:58:51.87Z 'I|PSnd:'
'2017-10-02T23:58:51.92Z 'I|PSnd:'
'2017-10-02T23:58:51.92Z 'I|PSnd:'
'2017-10-02T23:58:52.02Z 'I|PSnd:'
'2017-10-02T23:58:52.02Z 'I|PSnd:'
'2017-10-02T23:58:57.35Z 'I|PSnd:'
'2017-10-02T23:58:57.35Z 'I|PSnd:'
'2017-10-02T23:58:57.35Z 'I|PSnd:'
'2017-10-02T23:58:57.35Z 'I|PSnd:'
'2017-10-02T23:59:14.29Z 'I|PSnd:'
'2017-10-02T23:59:14.33Z 'I|PSnd:'
'2017-10-02T23:59:14.33Z 'I|PSnd:'
'2017-10-02T23:59:14.33Z 'I|PSnd:'
'2017-10-02T23:59:31.26Z 'I|PSnd:'
'2017-10-02T23:59:31.30Z 'I|PSnd:'
'2017-10-02T23:59:31.30Z 'I|PSnd:'
'2017-10-02T23:59:31.30Z 'I|PSnd:'
'2017-10-02T23:59:42.64Z 'I|PSnd:'
'2017-10-02T23:59:42.66Z 'I|PSnd:'
'2017-10-02T23:59:42.79Z 'I|PSnd:'
'2017-10-02T23:59:42.79Z 'I|PSnd:'
'2017-10-02T23:59:42.94Z 'I|PSnd:'
'2017-10-02T23:59:42.94Z 'I|PSnd:'
'2017-10-02T23:59:48.24Z 'I|PSnd:'
'2017-10-02T23:59:48.28Z 'I|PSnd:'
'2017-10-02T23:59:48.28Z 'I|PSnd:'
'2017-10-02T23:59:48.28Z 'I|PSnd:'
I get nothing after PSnd; the second column is empty.
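You can try the following approach:
- Read the file as you already do.
- Use a combination of cellfun and strfind to find the lines that contain snd.
- Do the same for sD.
These two steps give you logical indices of the lines containing each token.
- Build a logical matrix by pairing the two sets of indices: all elements of the first set except the last, alongside all elements of the second set except the first.
- Find the rows of that matrix that contain two 1s.
- Add 1 to those row numbers.
Now you have the lines you are looking for.
In a loop:
- Split the i-th selected line on whitespace: the first token is the date.
- Your conversion format does not look right: you should remove the last S at the end and add a Z (see below).
- Store the date in a cell array.
- Use strfind to find the position of [ in the line.
- Do the same for ].
- The value you are looking for (e.g. 3D) lies between the two.
- Store the value in a cell array.
You now have the dates, the values and the full lines in three cell arrays.
Note that additional checks may be needed for input files with a different set of lines.
A possible implementation could be: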
fID = fopen('log1.txt');
C = textscan(fID,'%s','delimiter','\n');
fclose(fID);
x = C{1};
% Find the rows containing "snd"
idx_1 = ~cellfun('isempty',strfind(x,'snd'));
% Find the rows containing "sD["
idx_2 = ~cellfun('isempty',strfind(x,'sD['));
% Pair the two index vectors, shifting the second one by 1, and
% find the rows of the resulting matrix that contain two "1"s
k = find(all([idx_1(1:end-1) idx_2(2:end)],2))+1;
% Selected rows (the full lines)
selected = x(k);
% Preallocate the outputs
the_date = cell(1,length(k));
val = cell(1,length(k));
% Loop over the identified rows
for i = 1:length(k)
   % Split the selected row on ' '; the first element is the date
   a = strsplit(x{k(i)},' ');
   % Convert the date
   the_date{i} = datetime(a{1},'InputFormat','yyyy-MM-dd''T''HH:mm:ss.SS''Z''');
   % Look for the position of the "["
   start_idx = strfind(x{k(i)},'[');
   % Look for the position of the "]"
   end_idx = strfind(x{k(i)},']');
   % Extract the value between the "[" and the "]"
   val{i} = x{k(i)}(start_idx+1:end_idx-1);
end
For your input file:
Selected lines:
2017-10-02T15:29:47.18Z 'I|PSnd: 233|sD[3D]m:0x6564 e:0'
2017-10-02T15:29:47.18Z 'I|PSnd: 233|sD[B0]m:0x3901 e:0'
2017-10-02T15:29:47.22Z 'I|PSnd: 233|sD[3D]m:0x41a1 e:0'
2017-10-02T15:29:47.24Z 'I|PSnd: 233|sD[3D]m:0x51c8 e:0'
Indices of the selected lines:
2
11
23
36
Corresponding dates:
the_date =
Columns 1 through 2
[02-Oct-2017 15:29:47] [02-Oct-2017 15:29:47]
Columns 3 through 4
[02-Oct-2017 15:29:47] [02-Oct-2017 15:29:47]
Corresponding values:
val =
'3D' 'B0' '3D' '3D'
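The shift-and-pair step is the part that is easiest to get wrong, so it may help to check it on a small in-memory cell array instead of a log file. The following is only a minimal sketch: the variable names lines, hasSnd and hasSD are my own, and the six sample rows are copied from the question's file excerpt.

```matlab
% Sample rows copied from the question; "lines" is a hypothetical
% stand-in for the cell array read from log1.txt
lines = {
    '2017-10-02T15:29:47.18Z ''I|PSnd: 61|snd[3D]:FFFF m:0x6564 e:0'''
    '2017-10-02T15:29:47.18Z ''I|PSnd: 233|sD[3D]m:0x6564 e:0'''
    '2017-10-02T15:29:47.18Z ''D|Beat:1780|snd:0x3901'''
    '2017-10-02T15:29:47.18Z ''I|PSnd: 61|snd[B0]:FFFF m:0x3901 e:0'''
    '2017-10-02T15:29:47.18Z ''I|PSnd: 233|sD[B0]m:0x3901 e:0'''
    '2017-10-02T15:29:47.18Z ''D|Beat:1833|sd:0x3901:0'''
    };
% strfind is case sensitive, so "PSnd" on its own does not count as "snd"
hasSnd = ~cellfun('isempty', strfind(lines, 'snd'));
hasSD  = ~cellfun('isempty', strfind(lines, 'sD['));
% Element j compares line j ("snd") with line j+1 ("sD[");
% adding 1 turns the position of the "snd" line into that of the "sD[" line
k = find(hasSnd(1:end-1) & hasSD(2:end)) + 1;
disp(k')
```

Here rows 1, 3 and 4 contain snd and rows 2 and 5 contain sD[, so only the pairs (1,2) and (4,5) qualify and k holds the rows 2 and 5. Row 3 is dropped because the line after it does not contain sD[.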
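Here is a solution without any for loops.
Basically, first search for the lines containing "snd", then check whether the next line contains "sD". The regular expression returns the matched lines and the tokens from the matching lines.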
fID = fopen('log1.txt');
C = textscan(fID,'%s','delimiter','\n');
fclose(fID);
C = C{1};
%Find all lines with snd
initMatchIdx = ~cellfun(@isempty,regexp(C,'^[0-9,-:T]*Z.*PSnd.*snd'));
%Check the lines 1 row down ...
checkIdx = [false; initMatchIdx(1:end-1)];
%If it matches return the entire line and the tokens..
[temp, matchedLines] = regexp(C(checkIdx),'(?<date>^[0-9,-:T]*)Z.*PSnd.*sD\[(?<otherVal>\w*)\].*' ,'tokens','match');
%Do some reshaping and un-celling.
matchedLines = [matchedLines{:}]';
temp = [temp{:}];
temp = reshape([temp{:}],2,[])';
%Convert to Date
outTime = datetime(temp(:,1),'InputFormat','yyyy-MM-dd''T''HH:mm:ss.SS');
otherVal = temp(:,2);
The output looks like this:
>> outTime
outTime =
02-Oct-2017 15:29:47
02-Oct-2017 15:29:47
02-Oct-2017 15:29:47
02-Oct-2017 15:29:47
>> otherVal
otherVal =
'3D'
'B0'
'3D'
'3D'
>> matchedLines
matchedLines =
'2017-10-02T15:29:47.18Z 'I|PSnd: 233|sD[3D]m:0x656...'
'2017-10-02T15:29:47.18Z 'I|PSnd: 233|sD[B0]m:0x390...'
'2017-10-02T15:29:47.22Z 'I|PSnd: 233|sD[3D]m:0x41a...'
'2017-10-02T15:29:47.24Z 'I|PSnd: 233|sD[3D]m:0x51c...'
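The question asks for the date and the value in output columns, and the displayed dates above do not show the fractional seconds even though they are parsed. One way to address both, as a rough sketch only: set the datetime display format and collect the results in a table. The two sample rows are copied from the output above, and the column names Time and Value are my own choice, not something from the original answers.

```matlab
% Two sample results, written out by hand for illustration
outTime  = datetime({'2017-10-02T15:29:47.18'; '2017-10-02T15:29:47.24'}, ...
                    'InputFormat', 'yyyy-MM-dd''T''HH:mm:ss.SS');
otherVal = {'3D'; 'B0'};
% The default display format hides fractional seconds; show them explicitly
outTime.Format = 'yyyy-MM-dd HH:mm:ss.SS';
% One row per extracted line, with the date and the value as columns
T = table(outTime, otherVal, 'VariableNames', {'Time', 'Value'});
disp(T)
```

Changing Format only affects how the values are displayed; the stored time is the same either way.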