Workaround for dynamically generating a structure - Matlab/Octave issues

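I have many, many datasets in .csv format, organized by a filename standard so that I can use regular expressions on them later. However, I'm running into a small problem. My data files are titled like "2012001_C335_2000MHZ_P_1111.CSV". There are four years of interest, two frequencies, and four different labels of the style C335 that describe location. I need to do a good deal of data processing on each of these files, so I would like to read them all into one giant structure and then work on different parts of it. I am writing: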

for ix_id = 1:length(ids)
 for ix_year = 1:2:length(ids_years{ix_id})
  for ix_frq = 1:length(frqs)
   st = [ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{ix_id}{ix_frq} '_P_1111.CSV'];
   data.(ids_frqs{ix_id}{ix_frq}).(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}]) =...
        dlmread(st);
  end
 end
end

All of the ids variables are 1x4 cell arrays, where each cell contains a string.

This produces the errors "Error: a cs-list cannot be further indexed" and "Error: invalid assignment to cs-list outside multiple assignment".

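I searched the internet for these errors and found a few posts dated from 2010 to 2012, such as this one and this one, whose authors suggest that this is a problem with Octave itself. I can do a workaround that defines two separate structures, by removing the innermost for loop over ix_frq and replacing the lines beginning with "st" and "data" with:
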
data1500.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}]) = ...
  dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_1500MHZ_P_1111.CSV']);
data2000.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}]) = ...
  dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_2000MHZ_P_1111.CSV']);

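So the problem seems to come up when I try to make a more deeply nested structure. I'd like to know whether this is unique to Octave or the same in Matlab, and whether there is a more flexible workaround than defining two separate structures, since I want the code to be as portable as possible. If you have any insight into what the error messages mean, I'd be interested in that as well. Thanks!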

EDIT: Here is the full script, which now generates some dummy .csv files. Run on Octave v. 3.8.

clear all
%this program tests the creation of various structures.  The end goal is to have a structure of the format frequency.beamname.year(1) = matrix of the appropriate file
A = rand(3,2);
csvwrite('2009103_C115_1500MHZ.CSV',A)
csvwrite('2009103_C115_2000MHZ.CSV',A)
csvwrite('2010087_C115_1500MHZ.CSV',A)
csvwrite('2010087_C115_2000MHZ.CSV',A)
csvwrite('2009103_C335_1500MHZ.CSV',A)
csvwrite('2009103_C335_2000MHZ.CSV',A)
csvwrite('2010087_C335_1500MHZ.CSV',A)
csvwrite('2010087_C335_2000MHZ.CSV',A)

data = dir('*.CSV');  %imports all of the files of a directory
files = {data.name};  %cell array of filenames
nfiles = numel(files);

%find all the years
years = unique(cellfun(@(x)x{1},regexp(files,'\d{7}','match'),'UniformOutput',false));  
%find all the beam names
ids = unique(cellfun(@(x)x{1},regexp(files,'([C-I]\d{3})|([C-I]\d{1}[C-I]\d{2})','match'),'UniformOutput',false));
%find all the frequencies
frqs = unique(cellfun(@(x)x{1},regexp(files,'\d{4}MHZ','match'),'UniformOutput',false));

%now, vectorize to cover all the beams
for id_ix = 1:length(ids)
  expression_yrs = ['(\d{7})(?=_' ids{id_ix} ')'];
  listl_yrs = regexp(files,expression_yrs,'match');
  ids_years{id_ix} = unique(cellfun(@(x)x{1},listl_yrs(cellfun(@(x)~isempty(x),listl_yrs)),'UniformOutput',false));  %returns the years for data collected with both the 1500 and 2000 MHZ antennas along each of the beams
  expression_frqs = ['(?<=' ids{id_ix} '_)(\d{4}MHZ)']; 
  listfrq = regexp(files,expression_frqs,'match'); %finds every frequency that was collected for C115, C335
  ids_frqs{id_ix} = unique(cellfun(@(x)x{1},listfrq(cellfun(@(x)~isempty(x),listfrq)),'UniformOutput',false));
end

%% finally, dynamically generate a structure data.Frequency.Beam.Year
%this works
for ix_id = 1:length(ids)
  for ix_year = 1:length(ids_years{ix_id})
    data1500.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{1}{1} '.CSV']);
    data2000.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{1}{2} '.CSV']);
  end
end

%this doesn't work
for ix_id=1:length(ids)
  for ix_year=1:length(ids_years{ix_id})
    for ix_frq = 1:numel(frqs)
        data.(['F' ids_frqs{ix_id}{ix_frq}]).(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{ix_id}{ix_frq} '.CSV']);
    end 
  end
end

Hopefully this helps clarify the question. I'm not sure of the etiquette here for posting edits and code.

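The problem is that by the time you reach the for loop that fails, data already exists and is a struct array (it was assigned the output of dir earlier in the script).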

octave> data
data =

  8x1 struct array containing the fields:

    name
    date
    bytes
    isdir
    datenum
    statinfo

When you select a field from a struct array, you get a cs-list (comma-separated list), unless you also index which struct of the array you want. See:

octave> data.name
ans = 2009103_C115_1500MHZ.CSV
ans = 2009103_C115_2000MHZ.CSV
ans = 2009103_C335_1500MHZ.CSV
ans = 2009103_C335_2000MHZ.CSV
ans = 2010087_C115_1500MHZ.CSV
ans = 2010087_C115_2000MHZ.CSV
ans = 2010087_C335_1500MHZ.CSV
ans = 2010087_C335_2000MHZ.CSV
octave> data(1).name
ans = 2009103_C115_1500MHZ.CSV
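
The same behavior can be reproduced without touching the file system. The following is a minimal sketch: the struct array `s`, its file names, and the variable names `names`, `first`, and `failed` are made up for illustration and do not come from the script above. It shows the cs-list being collected into a cell array, a single element being indexed, and the assignment through a field of the whole array raising an error.

```octave
% Hypothetical 3x1 struct array standing in for the output of dir()
s = struct('name', {'a.csv'; 'b.csv'; 'c.csv'});

% s.name expands to a cs-list; braces collect it into a cell array
names = {s.name};

% Indexing one element of the array first yields a single value
first = s(1).name;

% Assigning through a field of the whole array fails, because the
% left-hand side refers to three structs rather than one
failed = false;
try
  s.('F1500MHZ').C115 = 1;
catch
  failed = true;
end
```

The exact error text differs between versions and between Octave and Matlab, which is why the sketch only records that an error occurred.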

So when you do:

data.(...) = dlmread (...);

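you don't get what you expect on the left-hand side; you get a cs-list. I guess reusing the name was accidental, since data only holds the file listing at that point, so simply create a new empty struct: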

data = struct (); % this will clear your previous data
for ix_id=1:length(ids)
  for ix_year=1:length(ids_years{ix_id})
    for ix_frq = 1:numel(frqs)
        data.(['F' ids_frqs{ix_id}{ix_frq}]).(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{ix_id}{ix_frq} '.CSV']);
    end 
  end
end

I would also suggest that you give your current approach some more thought. This code looks overly complicated to me.
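
One possible simplification, offered as a sketch and not as the only way to do it, is to parse each file name once with regexp tokens and assign into the structure directly. The regular expression assumes file names shaped like the dummy files in the question (seven digits, a beam label, then a frequency). The names `scratch`, `file_list`, `parsed`, and `tok` are illustrative, and the scratch directory with two dummy files exists only to make the example self-contained.

```octave
% Create a scratch directory with two dummy files (illustrative only)
scratch = tempname();
mkdir(scratch);
A = [1 2; 3 4; 5 6];
csvwrite(fullfile(scratch, '2009103_C115_1500MHZ.CSV'), A);
csvwrite(fullfile(scratch, '2010087_C335_2000MHZ.CSV'), A);

% Keep the listing in its own variable so it cannot clash with the result
file_list = dir(fullfile(scratch, '*.CSV'));
parsed = struct();

for k = 1:numel(file_list)
  fname = file_list(k).name;
  % Tokens: 7-digit date, beam label, frequency (pattern is an assumption)
  tok = regexp(fname, '^(\d{7})_([C-I]\d[C-I0-9]\d{2})_(\d{4}MHZ)', ...
               'tokens', 'once');
  if isempty(tok)
    continue;  % skip files that do not follow the naming standard
  end
  % Prefix letters keep the field names valid identifiers
  parsed.(['F' tok{3}]).(tok{2}).(['Y' tok{1}]) = ...
      dlmread(fullfile(scratch, fname));
end
```

With this layout, the matrix for beam C115 at 1500 MHz collected on 2009103 is available as `parsed.F1500MHZ.C115.Y2009103`, and no intermediate ids_years or ids_frqs cell arrays are needed.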