解包第一个字节指示长度的数据结构
unpacking a data structure whose first byte indicates length
我正在尝试从 IBM AFP 格式文件中解压 TLE(标记逻辑元素)。
规范 (http://www.afpcinc.org/wp-content/uploads/2017/12/MODCA-Reference-09.pdf) 表示这是两个三元组(尽管有四个值),其结构如下(具有 字节偏移量 ):
0:长度 | 1:Tid | 2-n:参数(=2:类型+3:格式+ 4-n: EBCDIC 编码字符串)
示例(有两个三元组,一个表示名称,一个表示值):
0C 02 0B 00 C3 A4 99 99 85 95 83 A8 07 36 00 00 C5 E4 D9
12 KEY UID CHAR C u r r e n c y 7 VAL RESERVED E U R
我用Perl解析如下(并成功):
if ($key eq 'Data') {
my $tle = $member->{struct}->{$key};
my $k_length = hex(unpack('H2', substr($tle, 0, 1)));
my $key = decode('cp500', substr($tle, 4, $k_length - 4));
my $v_length = hex(unpack('H2', substr($tle, $k_length, 1)));
my $value = decode('cp500', substr($tle, $k_length + 4, $v_length - 4));
print("'$key' => '$value'\n");
}
结果:
'Currency' => 'EUR'
虽然上面是成功的,但我觉得我的方法有点太复杂了,还有更有效的方法来做到这一点。例如。 pack
模板是否支持读取第一个 n 字节以用作要解压的连续字节数的量词?我阅读了 Perl 包教程,但似乎无法找到相关内容。
如果长度字段不包含自身,您可以执行以下操作:
(my $record, $unparsed) = unpack("C/a a*", $unparsed);
my $key = decode("cp500", unpack("x3 a*", $record));
但是长度字段包括它自己。
(my $length, $unparsed) = unpack("C a*", $unparsed);
(my $record, $unparsed) = unpack("a".($length-1)." a*", $unparsed);
my $key = decode("cp500", unpack("x3 a*", $record));
请查看以下演示代码是否满足您的要求。
这个代码
defines hash decoder subroutines
reads hex representation of bytes provided by OP from DATA block
converts read data into binary representation $data utilizing pack
extracts length, key/tid, type by utilizing unpack
call decoder subroutine for this particular type
gets back hash consisting two arrays keys and vals
forms new hash %data with provided keys and vals
outputs keys and values (returned keys are used to preserve byte/field order)
注意:Encode 'from_to' is utilized to decode EBCDIC -- alternative
use strict;
use warnings;
use feature 'say';
use utf8;
use Encode 'from_to';
my $debug = 1;
my %decoder = (
1 => \&decode_type1,
2 => \&decode_currency,
3 => \&decode_type3,
4 => \&decode_type4,
5 => \&decode_type5
);
my $bytes = read_bytes();
my($len,$key,$type) = unpack('C3',$bytes);
my $data = $decoder{$type}($bytes);
my %data;
@data{@{$data->{keys}}} = @{$data->{vals}};
say '
Unpacked data
---------------';
printf "%-8s => %s\n", $_, $data{$_} for @{$data->{keys}};
sub read_bytes {
my $hex_bytes = <DATA>;
chomp $hex_bytes;
my $bytes = pack('H*',$hex_bytes);
return $bytes;
}
sub show_bytes {
my $data = shift;
print "Bytes: ";
printf "%02X ", $_ for unpack 'C*', $data;
print "\n";
}
sub decode_type1 {
my $bytes = shift;
return { keys => 'type1', vals => 'vals1' };
}
sub decode_currency {
my $bytes = shift;
show_bytes($bytes) if $debug;
my @keys = qw/length_1 key uid char data_1 length_2 val reserved data_2/;
my @vals = unpack('C4A8C2SA3',$bytes);
from_to($vals[4], 'cp37', 'latin1');
from_to($vals[8], 'cp37', 'latin1');
return { keys => \@keys, vals => \@vals};
}
sub decode_type3 {
my $bytes = shift;
return { keys => 'type3', vals => 'vals3' };
}
sub decode_type4 {
my $bytes = shift;
return { keys => 'type4', vals => 'vals4' };
}
sub decode_type5 {
my $bytes = shift;
return { keys => 'type5', vals => 'vals5' };
}
__DATA__
0C020B00C3A49999859583A807360000C5E4D9
输出
Bytes: 0C 02 0B 00 C3 A4 99 99 85 95 83 A8 07 36 00 00 C5 E4 D9
Unpacked data
---------------
length_1 => 12
key => 2
uid => 11
char => 0
data_1 => Currency
length_2 => 7
val => 54
reserved => 0
data_2 => EUR
注:
val 只占用一个字节,这看起来很可疑,它给出了欧元金额的 0..255 范围。也许 reserved 字节可能是 val 欧元数量的一部分。
我正在尝试从 IBM AFP 格式文件中解压 TLE(标记逻辑元素)。
规范 (http://www.afpcinc.org/wp-content/uploads/2017/12/MODCA-Reference-09.pdf) 表示这是两个三元组(尽管有四个值),其结构如下(具有 字节偏移量 ):
0:长度 | 1:Tid | 2-n:参数(=2:类型+3:格式+ 4-n: EBCDIC 编码字符串)
示例(有两个三元组,一个表示名称,一个表示值):
0C 02 0B 00 C3 A4 99 99 85 95 83 A8 07 36 00 00 C5 E4 D9
12 KEY UID CHAR C u r r e n c y 7 VAL RESERVED E U R
我用Perl解析如下(并成功):
if ($key eq 'Data') {
my $tle = $member->{struct}->{$key};
my $k_length = hex(unpack('H2', substr($tle, 0, 1)));
my $key = decode('cp500', substr($tle, 4, $k_length - 4));
my $v_length = hex(unpack('H2', substr($tle, $k_length, 1)));
my $value = decode('cp500', substr($tle, $k_length + 4, $v_length - 4));
print("'$key' => '$value'\n");
}
结果:
'Currency' => 'EUR'
虽然上面是成功的,但我觉得我的方法有点太复杂了,还有更有效的方法来做到这一点。例如。 pack
模板是否支持读取第一个 n 字节以用作要解压的连续字节数的量词?我阅读了 Perl 包教程,但似乎无法找到相关内容。
如果长度字段不包含自身,您可以执行以下操作:
(my $record, $unparsed) = unpack("C/a a*", $unparsed);
my $key = decode("cp500", unpack("x3 a*", $record));
但是长度字段包括它自己。
(my $length, $unparsed) = unpack("C a*", $unparsed);
(my $record, $unparsed) = unpack("a".($length-1)." a*", $unparsed);
my $key = decode("cp500", unpack("x3 a*", $record));
请查看以下演示代码是否满足您的要求。
这个代码
defines hash decoder subroutines
reads hex representation of bytes provided by OP from DATA block
converts read data into binary representation $data utilizing pack
extracts length, key/tid, type by utilizing unpack
call decoder subroutine for this particular type
gets back hash consisting two arrays keys and vals
forms new hash %data with provided keys and vals
outputs keys and values (returned keys are used to preserve byte/field order)
注意:Encode 'from_to' is utilized to decode EBCDIC -- alternative
use strict;
use warnings;
use feature 'say';
use utf8;
use Encode 'from_to';
my $debug = 1;
my %decoder = (
1 => \&decode_type1,
2 => \&decode_currency,
3 => \&decode_type3,
4 => \&decode_type4,
5 => \&decode_type5
);
my $bytes = read_bytes();
my($len,$key,$type) = unpack('C3',$bytes);
my $data = $decoder{$type}($bytes);
my %data;
@data{@{$data->{keys}}} = @{$data->{vals}};
say '
Unpacked data
---------------';
printf "%-8s => %s\n", $_, $data{$_} for @{$data->{keys}};
sub read_bytes {
my $hex_bytes = <DATA>;
chomp $hex_bytes;
my $bytes = pack('H*',$hex_bytes);
return $bytes;
}
sub show_bytes {
my $data = shift;
print "Bytes: ";
printf "%02X ", $_ for unpack 'C*', $data;
print "\n";
}
sub decode_type1 {
my $bytes = shift;
return { keys => 'type1', vals => 'vals1' };
}
sub decode_currency {
my $bytes = shift;
show_bytes($bytes) if $debug;
my @keys = qw/length_1 key uid char data_1 length_2 val reserved data_2/;
my @vals = unpack('C4A8C2SA3',$bytes);
from_to($vals[4], 'cp37', 'latin1');
from_to($vals[8], 'cp37', 'latin1');
return { keys => \@keys, vals => \@vals};
}
sub decode_type3 {
my $bytes = shift;
return { keys => 'type3', vals => 'vals3' };
}
sub decode_type4 {
my $bytes = shift;
return { keys => 'type4', vals => 'vals4' };
}
sub decode_type5 {
my $bytes = shift;
return { keys => 'type5', vals => 'vals5' };
}
__DATA__
0C020B00C3A49999859583A807360000C5E4D9
输出
Bytes: 0C 02 0B 00 C3 A4 99 99 85 95 83 A8 07 36 00 00 C5 E4 D9
Unpacked data
---------------
length_1 => 12
key => 2
uid => 11
char => 0
data_1 => Currency
length_2 => 7
val => 54
reserved => 0
data_2 => EUR
注:
val 只占用一个字节,这看起来很可疑,它给出了欧元金额的 0..255 范围。也许 reserved 字节可能是 val 欧元数量的一部分。