Reading a space-delimited text file into SAS

Question

I have the following .txt file:

Mark1[Country1]
type1=1 type2=5 
type1=1.50 EUR type2=21.00 EUR 
Mark2[Country2]
type1=2 type2=1 type3=1 
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
....

I am trying to read it into my SAS program so that it looks like this:

  Mark  Country   Type  Count   Price

1 Mark1 Country1  type1   1     1.50 
2 Mark1 Country1  type2   5     21.00 
3 Mark1 Country1  type3   NA     NA 
4 Mark2 Country2  type1   2     197.50 
5 Mark2 Country2  type2   1     201.00 
6 Mark2 Country2  type3   1     312.50

Or something else, but I need it to be able to print a two-way report:

       Country1   Country2 
Type1    ...        ...  
Type2    ...        ...   
Type3    ...        ...

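The question is how to read that kind of txt file:

Read Mark1[Country1] and split it into two columns, Mark and Country;
Keep the mark and country and read the information for each type (and somehow ignore the type1= prefix, perhaps with a format) and put it into the table. Maybe there is a way to do this with some kind of input template or nested queries.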

Answer 1

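You can give the DLM= option on the INFILE statement a variable name instead of a literal. That lets you change the delimiter depending on the type of line being read.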

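It looks like you have three lines per group. The first has the MARK and COUNTRY values, the second has a list of COUNT values, and the third has a list of PRICE values. So something like this should work.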

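data want ;
  /* DLM holds the current delimiter list; all lengths here are assumed, adjust to your data */
  length dlm $2 ;
  length Mark $8 Country $8 rectype $8 recno 8 type $8 value1 8 value2 $8 ;
  infile cards dlm=dlm truncover ;
  /* header line: split on the brackets */
  dlm='[]';
  input mark country ;
  /* detail lines: split on the equal sign and the space */
  dlm='= ';
  do rectype='Count','Price';
    /* read name/value pairs until the line is exhausted */
    do recno=1 by 1 until(type=' ');
      input type value1 @;
      if rectype='Price' then input value2 @;
      if type ne ' ' then output;
    end;
    input;  /* release the held line */
  end;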
cards;
Mark1[Country1]
type1=1 type2=5 
type1=1.50 EUR type2=21.00 EUR 
Mark2[Country2]
type1=2 type2=1 type3=1 
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
;

Result:

Obs    Mark     Country     rectype    recno    type     value1    value2

  1    Mark1    Country1     Count       1      type1       1.0
  2    Mark1    Country1     Count       2      type2       5.0
  3    Mark1    Country1     Price       1      type1       1.5     EUR
  4    Mark1    Country1     Price       2      type2      21.0     EUR
  5    Mark2    Country2     Count       1      type1       2.0
  6    Mark2    Country2     Count       2      type2       1.0
  7    Mark2    Country2     Count       3      type3       1.0
  8    Mark2    Country2     Price       1      type1     197.5     EUR
  9    Mark2    Country2     Price       2      type2     201.0     EUR
 10    Mark2    Country2     Price       3      type3     312.5     EUR
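
If the one-row-per-type layout from the question is needed, the Count and Price rows above could be merged back together. This is only a sketch: it assumes the `want` data set produced above, and the data set names `sorted` and `combined` are made up for illustration. A type that is absent from the input (such as type3 for Mark1) will not appear as a row.

```sas
/* Sort once so both halves of the merge share the same BY order */
proc sort data=want out=sorted;
  by mark country type;
run;

/* Merge the Count rows with the Price rows of the same mark, country and type */
data combined(keep=Mark Country Type Count Price);
  merge sorted(where=(rectype='Count') rename=(value1=Count))
        sorted(where=(rectype='Price') rename=(value1=Price));
  by mark country type;
run;
```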

Answer 2

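You have three name/value pairs, but each pair is split across two lines. This is an unusual text file that calls for creative input. The INPUT statement has a line pointer control, #, that reads related future lines within the implicit DATA step loop.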
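
As a minimal illustration of the # line pointer on its own, the sketch below reads two physical lines per observation. The data set name `pairs`, the variable names, and the sample data are all hypothetical and not part of the question's file.

```sas
data pairs;
  infile datalines truncover;
  /* #1 is the first line of each group, #2 the line after it */
  input #1 name $ #2 value;
datalines;
alpha
1
beta
2
;
run;
```

Each pass through the DATA step should consume one two-line group, so this input is expected to yield one observation per name.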

Example (Proc REPORT)

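Read mark and country from the current line (relative line #1), the counts from relative line #2, and the prices from relative line #3. After the name/value input is complete for a given mark and country, perform an array-based pivot that transposes two variables at a time (count and price) into a categorical (type) data form.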

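Proc REPORT produces a 'two-way' listing. The listing is actually a summary report (the cells under count and price are the default SUM aggregate), but each cell has only one contributing value, so the SUM is the original individual value.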

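data have(keep=Mark Country Type Count Price);
  /* $10 is an assumed length; widen it for longer values */
  attrib mark country length=$10;

  infile cards delimiter='[ ]' missover; 

  /* relative line #1: Mark[Country] */
  input mark country;

  /* relative lines #2 and #3: jump past each name with @'text', then read the value */
  input #2 @'type1=' count_1 @'type2=' count_2 @'type3=' count_3;
  input #3 @'type1=' price_1 @'type2=' price_2 @'type3=' price_3;

  array counts count_:;
  array prices price_:;

  /* pivot: one output row per type */
  do _i_ = 1 to dim(counts);
    Type = cats('type',_i_);
    Count = counts(_i_);
    Price = prices(_i_);
    output;
  end;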
datalines;
Mark1[Country1]
type1=1 type2=5 
type1=1.50 EUR type2=21.00 EUR
Mark2[Country2]
type1=2 type2=1 type3=1 
type1=197.50 EUR type2=201.00 EUR type3= 312.50 EUR
;

ods html file='twoway.html';

proc report data=have;
  column type country,(count price);
  define type / group;
  define country / ' ' across;
run;

ods html close;

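Combined aggregation

To show both values in one cell, aggregate with Proc MEANS, build a character cell of the form price(count), and transpose by type with country as the ID.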

proc means nway data=have noprint;
  class type country;
  var count price;
  output out=stats max(price)=price_max sum(count)=count_sum;
run;

data cells;
  set stats;
  if not missing(price_max) then 
    cell = cats(price_max,'(',count_sum,')');
run;

proc transpose data=cells out=twoway(drop=_name_);
  by type;
  id country;
  var cell;
run;

proc print noobs data=twoway;
run;
