perl print hexadecimal as human-readable

The file forsojunk looks like this (more lines not shown).

    s/e\x27\x27\x27/é/g; 
    s/e\x27/é/g; 
    s/a\x5f/à/g; 

junk.pl is as follows:

#! /usr/bin/perl
use strict; use warnings;
while(<>) {
   $_ =~ s/s\x2f([^\x2f\x5c]+)([^\x2f]*)\x2f([^\x2f]*).*/1=$1; 2=$2; 3=$3/ ;
   print $1;
   print $2;
   print " -> ";
   print $3;
   print "\n";
}

This gives:

> junk.pl forsojunk
e\x27\x27\x27 -> é
e\x27 -> é
a\x5f -> à

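But I don't want to print hex codes such as \x27\x27\x27. I want to print the characters they stand for, in readable form. On the first line, \x27\x27\x27 should be printed as ''' and the whole "message" for the first line should be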

e''' -> é

How can I do this?

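You need to convert each two-digit hex code into the character it represents.

pack works, or you can loop over the codes, using hex to convert each base-16 string to a number and then chr, printf, etc. to turn that number into the corresponding character: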

#!/usr/bin/env perl
use strict;
use warnings;
use open qw/IO :locale :std/;

while(<>) {
    # Note the cleaned up regular expression
    if (my ($base, $rawaddons, $result) = m{s/([^/\\]+)([^/]*)/([^/]*)/}) {
        my @addons = split /\\x/, $rawaddons; # Split up the hexcodes and remove the \x parts
        shift @addons; # Drop the first empty element
        print $base;
        # Any of the below ways work
        print pack('(H2)*', @addons);
        # printf '%c', hex for @addons;
        # print map { chr hex } @addons;
        print " -> $result\n";
    }
}

Example:

$ perl junk.pl forsojunk
e''' -> é
e' -> é
a_ -> à
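
To see why the script drops the first element after the split, here is a small sketch of the intermediate values for one input. The variable names mirror the script above, the sample string is taken from forsojunk, and `$decoded` is a name introduced only for this illustration.

```perl
use strict;
use warnings;

my $rawaddons = '\x27\x27\x27';          # literal backslash-x sequences, as read from the file
my @addons = split /\\x/, $rawaddons;    # ('', '27', '27', '27'): the leading separator yields an empty field
shift @addons;                           # drop that leading empty string
my $decoded = pack '(H2)*', @addons;     # each 2-digit hex string becomes one byte
print "$decoded\n";                      # prints '''
```

Because the string starts with the separator, split returns an empty first field, and passing that to pack would add a stray NUL byte to the output.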

Here is a trimmed-down version of @Shawn's answer, now named bbb.pl:

#!/usr/bin/env perl
use strict; use warnings;
while(<>) {
   if (my ($base, $rawaddons, $result) = m{s/([^/\\]+)([^/]*)/([^/]*)/}) {
        my @addons = split /\\x/, $rawaddons; # Split up the hexcodes and remove the \x parts
        shift @addons; # first element is blank or null
        print $base;
        print pack('(H2)*', @addons);
        print " -> $result\n";
    }
}

Example run:

> cat forsojunk
    s/\x3c/\x5ctextless{}/g; # < becomes \textless{}
    s/a\x27\x27\x27/á/g; #2020v05v21vThuv17h26m54s
    s/e\x27\x27\x27/é/g; #2020v05v21vThuv17h26m54s
    s/i\x27\x27\x27/í/g; #2020v05v21vThuv17h26m54s
    s/o\x27\x27\x27/ó/g; #2020v05v21vThuv17h26m54s
    s/u\x27\x27\x27/ú/g; #2020v05v21vThuv17h26m54s
    s/A\x27\x27\x27/Á/g; #2020v05v21vThuv17h26m54s
    s/E\x27\x27\x27/É/g; #2020v05v21vThuv17h26m54s
    s/I\x27\x27\x27/Í/g; #2020v05v21vThuv17h26m54s
    s/O\x27\x27\x27/Ó/g; #2020v05v21vThuv17h26m54s
    s/U\x27\x27\x27/Ú/g; #2020v05v21vThuv17h26m54s
    s/e\x60/è/g;
    s/E\x60/È/g;
    s/a\x60/à/g;
    s/A\x60/À/g;
    s/i\x5e/î/g;
    s/I\x5e/Î/g;
    s/o\x5e/ô/g;
    s/O\x5e/Ô/g;
    s/u\x3a/ü/g;
    s/U\x3a/Ü/g;
    s/a\x3a/ä/g;
    s/A\x3a/Ä/g;
    s/o\x3a/ö/g;
    s/O\x3a/Ö/g;
> bbb.pl forsojunk
a''' -> á
e''' -> é
i''' -> í
o''' -> ó
u''' -> ú
A''' -> Á
E''' -> É
I''' -> Í
O''' -> Ó
U''' -> Ú
e` -> è
E` -> È
a` -> à
A` -> À
i^ -> î
I^ -> Î
o^ -> ô
O^ -> Ô
u: -> ü
U: -> Ü
a: -> ä
A: -> Ä
o: -> ö
O: -> Ö
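
As an alternative sketch to the split-and-pack approach used in both scripts, the same conversion can probably be done with a single substitution using the /e modifier. The helper name `decode_hex_escapes` is made up for illustration and is not part of either answer. The sketch assumes every escape is exactly two hex digits, as in forsojunk.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original answers): replace each literal
# \xHH sequence in a string with the character whose code is HH.
sub decode_hex_escapes {
    my ($text) = @_;
    # /e evaluates the replacement as Perl code; /g repeats it for every match.
    $text =~ s/\\x([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $text;
}

print decode_hex_escapes('e\x27\x27\x27'), "\n";   # prints e'''
print decode_hex_escapes('a\x5f'), "\n";           # prints a_
```

Inside the loop of bbb.pl this would presumably be applied to `$base . $rawaddons` before printing, which removes the need for `@addons` and the `shift`.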