How to get the number of the corresponding characters at the start of a string

Question

I have two strings, e.g. $must and $is. They should be identical from the start. But if there is an error, I want to know where it is.

Example:

my $must = "abc;def;ghi";
my $is   = "abc;deX;ghi";

The "X" at position 6 is not equal, so that is the result I need.

So I need something like:

my $count = count_equal_chars($is, $must);

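The result would be "6", because characters 0 to 5 are equal. (It does not matter if it is "5" instead, since adding a ++ is no problem.)

My code so far:

Edit: added a workaround.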

#!/usr/bin/perl

use strict;
use warnings;
use utf8;

my $head_spec = "company;customer;article;price"; # specified headline
my $count = 0;                                    # row counter

while (<DATA>) {
    s/[\r\n]+\z//;   # Data comes originally from Excel: strip CR/LF line endings
    if (!$count) {
        # Headline:

        ## -> error message without position:
        ##print "error in headline\n" unless ($head_spec eq $_);

        ## -> writeout error message with position:
        next if ($head_spec eq $_);
        # Initialize char arrays and counter
        my @spec = split //, $head_spec; # Specified headline as character array
        my @is   = split //, $_;         # Read headline as character array
        my $err_pos = 0;                 # counter - current position
        # Find out the position:
        for (@spec) {
             # guard against a read headline that is shorter than the specified one
             $err_pos++, next if defined $is[$err_pos] && $is[$err_pos] eq $_;
             last;
        }
        # Writeout error message
        print "error in headline at position: $err_pos\n";
    }
    else {
        # Values
        print "process line $count, values: $_\n";
    }
}
continue { $count++; }

__DATA__
company;custXomer;article;price
Ser;0815;4711;3.99
Ser;0816;4712;4.85

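Background:
There is a .csv file with a very long header (>1000 characters). This header is specified. If it contains an error, the file is faulty and must be edited by the user. So it is useful to tell the user where the error is, so that they do not need to compare the whole line.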

Answer 1

We can compare character by character, and also take the lengths of the strings into account to handle edge cases:

use strict;
use warnings;
use List::Util qw<min max>;

my $must = "company;customer;article;price";
my $is   = "company;custXomer;article;price";

# to-be reported position
my $pos = 0;

# get minimum and maximum of the lengths
my @lengths = map length, ($must, $is);
my $min_length = min @lengths;
my $max_length = max @lengths;

# increment till an inequality occurs or a string is consumed fully
++$pos until substr($must, $pos, 1) ne substr($is, $pos, 1) || $pos == $min_length;

# report the result
print $pos == $min_length ? ($pos < $max_length ? "missing cols" : "no diff") : $pos;

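If the final position equals the minimum length, there are two possibilities: either the strings are completely equal, or one of them is longer, so we check the maximum length. Otherwise, the position is reported as is.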
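The same loop can be wrapped in a function with the call shape the question asks for. This is only a minimal sketch, not part of the original answer: the name count_equal_chars is taken from the question, and returning the full common length for identical strings is an assumption.

```perl
use strict;
use warnings;
use List::Util qw<min>;

# Hypothetical helper (name taken from the question): returns the number
# of leading characters that the two strings have in common.
sub count_equal_chars {
    my ($is, $must) = @_;
    my $min_length = min(length($is), length($must));
    my $pos = 0;
    # Test the length bound first so substr never reads past the end.
    ++$pos while $pos < $min_length
        && substr($is, $pos, 1) eq substr($must, $pos, 1);
    return $pos;
}

print count_equal_chars("abc;deX;ghi", "abc;def;ghi"), "\n";   # prints 6
```

The caller can then compare the result with the two lengths to tell "no diff" apart from "missing cols", as in the code above.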

Answer 2

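The error position can be found with the XOR operator, ^. Wherever the characters of the two strings match, the XOR operation yields \0.

$-[0] holds the start offset of the last successful match (here, of the regex match ($must ^ $is) =~ /[^\0]/).

You can find the position like this: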

#!/usr/bin/perl
use strict;
use warnings;

my $must = "company;customer;article;price";
my $is   = "company;custXomer;article;price";


($must ^ $is) =~ /[^\0]/; # find first non-matching character

print "Error position is ", $-[0];  # position of first non-matching char

Prints: Error position is 12
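One caveat: when the strings are identical the regex does not match, and $-[0] then does not describe this comparison. A hedged sketch that checks the match result first; the helper name first_diff_pos is hypothetical, and the code assumes byte strings (no code points above 0xFF) and that the bitwise feature is not enabled, so that ^ acts as a string XOR.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the first differing character, or undef
# when the strings are identical. Assumes byte strings, since the string
# XOR operates on bytes.
sub first_diff_pos {
    my ($must, $is) = @_;
    # XOR yields "\0" wherever the characters agree; the shorter string
    # is treated as if it were padded with "\0".
    my $xor = $must ^ $is;
    return $xor =~ /[^\0]/ ? $-[0] : undef;
}

my $pos = first_diff_pos("company;customer", "company;custXomer");
print defined $pos ? "Error position is $pos\n" : "no diff\n";   # Error position is 12
```

Because the shorter string is padded, a missing or extra tail is reported at the length of the shorter string.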

Answer 3

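The following code:

splits the strings into two character arrays, @must and @is
compares the lengths of the arrays and warns if they differ
then compares the arrays up to the first mismatch
prints an error if a mismatch was found (including one in the last character)

use strict;
use warnings;
use feature 'say';

my $must = "abc;def;ghi";
my $is   = "abc;deX;ghi";

my @must = split //, $must;
my @is   = split //, $is;
my $pos;                      # stays undefined if no mismatch is found

warn "Warning: the lengths differ\n"
    unless $#must == $#is;

for ( 0..$#must ) {
    # skip matching characters; a missing character counts as a mismatch
    next if defined $is[$_] && $must[$_] eq $is[$_];
    $pos = $_;
    last;
}

say "Error: The strings differ at position $pos"
    if defined $pos;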

Output

Error: The strings differ at position 6
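The same array-based steps can also be written as a function that returns undef when nothing differs, so that a mismatch in the very last character is not confused with "no difference". This is a sketch only; the helper name first_mismatch_index is hypothetical, and it relies on first and max from the core module List::Util and on the // operator (Perl 5.10 or later).

```perl
use strict;
use warnings;
use List::Util qw<first max>;

# Hypothetical helper: first index at which the two character arrays
# differ (a missing character counts as a difference), or undef if the
# strings are identical.
sub first_mismatch_index {
    my ($must, $is) = @_;
    my @must = split //, $must;
    my @is   = split //, $is;
    my $last = max($#must, $#is);   # cover the longer of the two arrays
    return first { ($must[$_] // '') ne ($is[$_] // '') } 0 .. $last;
}

my $pos = first_mismatch_index("abc;def;ghi", "abc;deX;ghi");
print defined $pos
    ? "Error: The strings differ at position $pos\n"
    : "The strings are equal\n";
```

Index 0 is a valid result, so the caller should test the return value with defined rather than for truth.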
