How to avoid inserting a line into a file if the line is already present in the file?

Question

How should the check be done so that the file ends up with no duplicate lines?

open ( FILE, ">newfile");
for( $a = 1; $a < 20; $a = $a + 1 ) {
    my $random_number = 1+ int rand(10);;
    # check to avoid inserting the line if the line is already present in the file
    print FILE "Random number is $random_number \n";
}

close(FILE);

Answer 1

Also enter each line into a hash as you write it; that makes checking for it later simple and efficient

use warnings;
use strict;
use feature 'say';

my $filename = shift or die "Usage: $0 filename\n";

open my $fh, '>', $filename or die "Can't open $filename: $!";

my %existing_lines;

for my $i (1..19) 
{ 
    my $random_number = 1 + int rand(10);

    # Check to avoid inserting the line if it is already in the file
    if (not exists $existing_lines{$random_number}) { 
        say $fh "Random number is $random_number";
        $existing_lines{$random_number} = 1;
    }   
}
close $fh;

This assumes that the intent in the question is to not repeat the number (which stands for whatever content is to be stored).

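But if the goal is indeed to avoid repeating whole lines (sentences), with the random number there only to make each line different, then use the whole line as the hash key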

for my $i (1..19) 
{ 
    my $random_number = 1 + int rand(10);
    my $line = "Random number is $random_number";

    # Check to avoid inserting the line if it is already in the file
    if (not exists $existing_lines{$line}) { 
        say $fh $line;
        $existing_lines{$line} = 1;
    }   
}
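
Both loops above only track lines written during the current run. If the file may already hold lines from an earlier run, one possible approach (a minimal sketch, not part of the original answer) is to load the existing lines into the hash first and then open the file in append mode. The helper name append_unique_lines and the temporary file used for the demonstration are assumptions made for illustration.

```perl
use strict;
use warnings;
use feature 'say';
use File::Temp qw( tempfile );

# Hypothetical helper: append only the lines that the file does not hold yet
sub append_unique_lines {
    my ($filename, @candidates) = @_;

    # Load the lines already in the file (if it exists) into a hash
    my %existing_lines;
    if (-e $filename) {
        open my $in, '<', $filename or die "Can't open $filename: $!";
        while (my $line = <$in>) {
            chomp $line;
            $existing_lines{$line} = 1;
        }
        close $in;
    }

    # Append mode ('>>') keeps the current content of the file
    open my $out, '>>', $filename or die "Can't open $filename: $!";
    my $added = 0;
    for my $line (@candidates) {
        next if exists $existing_lines{$line};
        say $out $line;
        $existing_lines{$line} = 1;
        ++$added;
    }
    close $out or die "Can't close $filename: $!";

    return $added;    # number of lines actually written
}

# Demonstration on a temporary file, so that no real file is touched
my ($tmp_fh, $filename) = tempfile( UNLINK => 1 );
close $tmp_fh;

my @lines  = map { "Random number is $_" } 1, 2, 2, 3;
my $first  = append_unique_lines($filename, @lines);
my $second = append_unique_lines($filename, @lines, "Random number is 4");

say "First call added $first line(s), second call added $second";
```

The second call re-reads the file, so lines written by the first call are skipped even though the hash from that call is gone.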

Notes and references

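Lexical filehandles (my $fh) are much better than globs (FILE), and the three-argument open is better. See the guide perlopentut and the reference open
Always check the open call (the or die ... above). It can and does fail, quietly. In that check always print the error for the failure, $!
The C-style for loop is rarely needed, while the usual foreach (with synonym for) is much nicer to use; see it in perlsyn. The .. is the range operator
Always declare variables with my, and enforce that with the strict pragma; always use warnings
If a filehandle refers to a pipe open (not the case here), always check its close
See perlintro for a general overview and for hashes; for more about Perl's data types see perldata. Keep in mind for later the notion of complex data structures, perldsc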
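
The note about checking close on a pipe open can be sketched as follows. This is only an illustration under assumptions: the child process is another perl (found via $^X, the running interpreter) that prints a single line, which is not something the question's code does.

```perl
use strict;
use warnings;

# Read from a child process through a pipe. The child is just another
# perl printing one line (an assumption made for this illustration).
open my $pipe, '-|', $^X, '-e', 'print "Random number is 7\n"'
    or die "Can't start child process: $!";
my @output = <$pipe>;

# For a pipe, close waits for the child and returns false if it failed;
# $! is set for a system error, otherwise $? holds the child's status.
close $pipe
    or die $! ? "Error closing pipe: $!"
              : "Child exited with status " . ($? >> 8);

print @output;
```

With a plain file opened for reading, a failing close is rare; with a pipe it is the only place where the failure of the child process gets reported.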

Answer 2

Just return false.

Because you cannot generate 20 distinct numbers in the range [1, 10].
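
To illustrate the point (a small sketch, not part of the original answer): however many times the question's expression is evaluated, at most 10 distinct values can come out of it, so at most 10 distinct lines can ever be written. The draw count of 10_000 is an arbitrary choice.

```perl
use strict;
use warnings;

# Draw many times with the same expression as in the question
# and count how many distinct values ever show up.
my %seen;
$seen{ 1 + int rand(10) }++ for 1 .. 10_000;

my $distinct = keys %seen;    # number of distinct values drawn
print "Distinct values seen: $distinct\n";
```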

Answer 3

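It looks like you are asking how to randomize the order of the numbers 1 to 20, i.e. no repeats, in random order. That is easily done with a Schwartzian transform. For example: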

perl -le'print for map { $_->[0] } sort { $a->[1] <=> $b->[1] } map { [$_, rand()] } 1..20'
6
7
16
14
5
20
3
13
19
17
4
8
15
10
9
11
18
1
2
12

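In this case, reading backwards from the end, we create a list of numbers 1 .. 20, which we feed into a map statement that turns each number into an array reference containing the number and a random number. We then feed that list of array references to a sort, where we sort numerically on the second element of each array reference: the random number (which is what creates the random order). Then we use another map statement to turn each array ref back into a simple number. Finally we print the list with a for loop.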

So in your case, the code would look something like this:

print "Random number is: $_\n" for      # print each number
    map { $_->[0] }                     # restore to a number
    sort { $a->[1] <=> $b->[1] }        # sort the list on the random number
    map { [ $_, rand() ] }              # create array ref with random number as index
    1 .. 20;                            # create list of numbers to randomize order of

You can then redirect the output of the program to a file, like so:

$ perl numbers.pl > newfile.txt

Answer 4

!$seen{$_}++ is a common idiom for identifying duplicates.

use feature 'say';

my %seen;
for (1..19) { 
    my $random_number = 1+ int rand(10);

    say "Random number is $random_number" if !$seen{$random_number}++;
}

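But that does not guarantee that you will get all the numbers from 1 to 10 in random order. If that is what you are trying to achieve, the following is a better solution: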

use feature 'say';
use List::Util qw( shuffle );

say "Random number is $_" for shuffle 1..10;
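
Since the question writes to a file rather than to standard output, the shuffle version could be combined with a checked open along these lines. This is a sketch only; the file name newfile is taken from the question, and storing the shuffled list in @numbers is just for readability.

```perl
use strict;
use warnings;
use feature 'say';
use List::Util qw( shuffle );

my $filename = 'newfile';          # file name from the question
my @numbers  = shuffle 1 .. 10;    # each number exactly once, random order

open my $fh, '>', $filename or die "Can't open $filename: $!";
say $fh "Random number is $_" for @numbers;
close $fh or die "Can't close $filename: $!";
```

Because shuffle only reorders its input, no duplicate check is needed at all: the list has no repeats to begin with.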
