Reading a Complex CSV Format into MATLAB

4,7,33  308:0.364759856031284 1156:0.273818346738286 1523:0.17279792082766 9306:0.243665855423149 
7,4,33  1156:0.185729429759684 1681:0.104443202690279 5351:0.365670526234034 6212:0.0964006003127458 

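I have a text file in the format shown above. The first 3 columns are labels, which need to be extracted into another file with their order preserved.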

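Looking through the file, each row has a different number of labels. I have been reading about the csvread function in MATLAB, but it is meant for plain numeric CSV files and does not work in this case.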

Next, I need to extract

308:0.364759856031284 1156:0.273818346738286 1523:0.17279792082766 9306:0.243665855423149 

so that in row 1 of a matrix, at column 308, I enter the value 0.364759856031284.

Since the format of your rows varies, you need to read the file line by line:

% filename is assumed to hold the path of the text file
fid = fopen(filename, 'rt'); % open file for text reading
if fid == -1, error('Could not open file %s', filename); end
resultMatrix = []; % start with empty matrix
rowIdx = 0; % number of rows read so far
while true
  tline = fgetl(fid);
  if ~ischar(tline), break, end % fgetl returns -1 at end of file
  tline = strtrim(tline); % drop leading and trailing white space
  if isempty(tline), continue, end % skip blank lines
  rowIdx = rowIdx + 1;
  % get the value separators, :
  seps = strfind(tline, ':');
  % get labels: the text before the last white space that precedes the first :
  if isempty(seps)
    labelString = tline; % no COL:VALUE pairs, the line holds labels only
  else
    whitespace = isspace(tline(1:seps(1)-1)); % get white space
    lastWhite = find(whitespace, 1, 'last'); % get last white space pos
    labelString = strtrim(tline(1:lastWhite)); % string with all labels separated by ,
  end
  labels = str2double(strsplit(labelString, ',')).'; % column vector of labels
  % labels now holds the labels of this row; write them to the other file here
  % do a similar construction for the COL:VALUE pairs, using the position of
  % the separator, :, and the fact that white space separates the pairs
  white = find(isspace(tline)); % positions of all white space
  cols = zeros(numel(seps),1);
  values = cols;
  for n1 = 1:numel(seps)
    colStart = max([0, white(white < seps(n1))]) + 1; % first char of COL
    valueEnd = min([white(white > seps(n1)), numel(tline)+1]) - 1; % last char of VALUE
    cols(n1,1) = str2double(tline(colStart:seps(n1)-1));
    values(n1,1) = str2double(tline(seps(n1)+1:valueEnd));
  end
  % The final size of the matrix is not known in advance, so it grows by
  % 1 row for each loop, and by new columns if any COL is outside the current matrix.
  % Indexed assignment beyond the current size pads the matrix with zeros.
  nC = max([size(resultMatrix,2); cols; 1]);
  resultMatrix(rowIdx, nC) = 0; % make the matrix large enough to hold this row
  if ~isempty(cols)
    resultMatrix(rowIdx, cols) = values.'; % populate the row with the wanted data
  end
end
fclose(fid);

Note that the code above has not been tested against your real file, so it may contain errors, but they should be easy to fix. I deliberately left out minor parts (such as writing the labels to the other file), which should not be too hard to fill in.
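
If the column indices run into the thousands, as they do in the sample rows, a full matrix will be mostly zeros. Below is a minimal sketch of a variant that collects (row, column, value) triplets and builds a sparse matrix at the end, and that also writes the labels to a second file. It assumes that every line starts with a label field containing no white space and that each pair has the exact form COL:VALUE. The file names inFile and labelFile are made up for illustration, and the two sample rows from the question are written to a temporary file only so that the sketch can run on its own.

```matlab
% Write the two sample rows to a temporary file (illustration only).
inFile = [tempname, '.txt'];           % hypothetical input file
labelFile = [tempname, '_labels.txt']; % hypothetical label output file
fid = fopen(inFile, 'wt');
fprintf(fid, '%s\n', '4,7,33  308:0.364759856031284 1156:0.273818346738286 1523:0.17279792082766 9306:0.243665855423149');
fprintf(fid, '%s\n', '7,4,33  1156:0.185729429759684 1681:0.104443202690279 5351:0.365670526234034 6212:0.0964006003127458');
fclose(fid);

fin = fopen(inFile, 'rt');
fout = fopen(labelFile, 'wt');
rowIdx = zeros(0,1); colIdx = zeros(0,1); vals = zeros(0,1); % triplets for sparse()
nRows = 0;
while true
  tline = fgetl(fin);
  if ~ischar(tline), break, end % end of file
  nRows = nRows + 1;
  % Split at the first run of white space: labels before it, pairs after it.
  tok = regexp(strtrim(tline), '^(\S*)\s*(.*)$', 'tokens', 'once');
  fprintf(fout, '%s\n', tok{1}); % one label line per input line, same order
  % sscanf reapplies '%d:%f' along the string; %d skips the white space between pairs.
  pairs = reshape(sscanf(tok{2}, '%d:%f'), 2, []); % row 1 = COL, row 2 = VALUE
  rowIdx = [rowIdx; repmat(nRows, size(pairs,2), 1)];
  colIdx = [colIdx; pairs(1,:).'];
  vals = [vals; pairs(2,:).'];
end
fclose(fin);
fclose(fout);

% Build the matrix in one step; entries that were never set are zero.
nCols = max([colIdx; 0]);
M = sparse(rowIdx, colIdx, vals, nRows, nCols);
```

After this runs, full(M(1,308)) gives the value stored for row 1, column 308, and labelFile contains one comma-separated label line per input line. Growing the triplet vectors inside the loop is fine for small files; for very large files it may be worth preallocating them.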