Randomizing text between delimiters
I have this simple input
I have {red;green;orange} fruit and cup of {tea;coffee;juice}
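I use Perl to find the patterns between the two outer brace delimiters { and }, and to randomize the fields inside them, which are separated by the inner delimiter ;.
I get this output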
I have green fruit and cup of coffee
This is my Perl script
perl -plE 's!\{(.*?)\}!@x=split/;/,$1;$x[rand@x]!ge' <<< 'I have {red;green;orange} fruit and cup of {tea;coffee;juice}'
My task is to handle this input format
I have { {red;green;orange} fruit ; cup of {tea;coffee;juice} } and {nice;fresh} {sandwich;burger}.
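As I understand it, the script should skip the outer braces { ... } in the first part of the text, which itself contains text with opening and closing braces: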
{ {red;green;orange} fruit ; cup of {tea;coffee;juice} }
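It should pick one part at random, like this
{red;green;orange} fruit
or
cup of {tea;coffee;juice}
Then go one level deeper:
green fruit
After all the text has been processed, the result could be any of the following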
I have red fruit and fresh burger.
I have cup of tea and nice sandwich
I have green fruit and nice burger.
I have cup of coffee and fresh burger.
The script should also parse and randomize the following text. For example
This {beautiful;perfect} {image;photography}, captured with the { {NASA;ESA} Hubble Telescope ; {NASA;ESA} Hubble Space Telescope} }, is the {largest;sharpest} image ever taken of the Andromeda galaxy { {— otherwise known as M31;— known as M31}; [empty here] }.
This is a cropped version of the full image and has 1.5 billion pixels. { You would need more than {600;700;800} HD television screens to display the whole image. ; If you want to display the whole image, you need to download more than {1;2} Tb. traffic and use 800 HD displays }
Sample output could be
This beautiful image, captured with the NASA Hubble Telescope, is the
sharpest image ever taken of the Andromeda galaxy — otherwise known as
M31.
This is a cropped version of the full image and has 1.5 billion
pixels. You would need more than 700 HD television screens to display
the whole image.
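Non-greedy matching is a good idea, but it doesn't quite work for nested groups. You can add a loop that resolves the innermost groups first:
perl -plE 'while(s!\{([^{}]*)\}!@x=split/;/,$1;$x[rand@x]!ge){}'
Note that your sample input has unmatched braces, so this appears to output a spurious "}".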
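The unmatched brace mentioned above is easy to miss by eye. Here is a minimal sketch of a balance check that could be run on the input first. The helper name brace_imbalance is hypothetical and not part of the original script, and the second test string is an excerpt of the question's sample text.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post).
# Returns 0 when the braces balance, -1 as soon as a '}' appears with no
# matching '{', and a positive count of unclosed '{' otherwise.
sub brace_imbalance {
    my ($text) = @_;
    my $depth = 0;
    for my $ch ($text =~ /[{}]/g) {    # visit every brace in order
        if ($ch eq '{') { $depth++ }
        else            { $depth-- }
        return -1 if $depth < 0;       # stray closing brace
    }
    return $depth;
}

print brace_imbalance('I have {red;green;orange} fruit'), "\n";   # prints 0
print brace_imbalance('the { {NASA;ESA} Hubble Telescope ; {NASA;ESA} Hubble Space Telescope} },'), "\n";   # prints -1
```

A nonzero result means the loop will leave a brace behind in the output, as it does for the second string.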
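Nice challenge. What you need to do is find a pair of braces that contains no inner braces and pick a random item from it. You need to do this globally (with the /g flag). A single pass only replaces the innermost ("level 1") braces, so you need to loop over the string until no more matches are found.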
use v5.18;
use strict;
use warnings;
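sub rand_sentence {
    my $copy = shift;

    # Resolve the innermost groups (braces containing no braces) again and
    # again until a pass makes no substitution.
    1 while $copy =~ s{ \{ ([^{}]+) \} }
                      { my @words = split /;/, $1; $words[rand @words] }xsge;
    return $copy;
}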
my $str = 'I have { {red;green;orange} fruit ; cup of {tea;coffee;juice} } and {nice;fresh} {sandwich;burger}.';
say rand_sentence($str);
say '';
$str = <<'END';
This {beautiful;perfect} {image;photography}, captured with the { {NASA;ESA}
Hubble Telescope ; {NASA;ESA} Hubble Space Telescope }, is the
{largest;sharpest} image ever taken of the Andromeda galaxy { {— otherwise
known as M31;— known as M31}; [empty here] }. This is a cropped version of the
full image and has 1.5 billion pixels. { You would need more than {600;700;800}
HD television screens to display the whole image. ; If you want to display the
whole image, you need to download more than {1;2} Tb. traffic and use 800 HD
displays }
END
say rand_sentence($str);
Sample output
I have orange fruit and fresh sandwich.
This beautiful photography, captured with the ESA Hubble Space Telescope , is the
largest image ever taken of the Andromeda galaxy — otherwise
known as M31. This is a cropped version of the
full image and has 1.5 billion pixels. If you want to display the
whole image, you need to download more than 1 Tb. traffic and use 800 HD
displays
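To see why the loop is needed, it can help to look at the string after every pass. The sketch below uses the same innermost-first substitution as the answer above, but records each intermediate stage. The function name trace_passes is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical tracing variant of the loop above: returns the input followed
# by the string as it looks after each substitution pass.
sub trace_passes {
    my ($text) = @_;
    my @stages = ($text);
    # Each pass replaces every brace group that contains no braces itself.
    while ($text =~ s{ \{ ([^{}]+) \} }
                     { my @alts = split /;/, $1; $alts[rand @alts] }xge) {
        push @stages, $text;
    }
    return @stages;
}

my $input = 'I have { {red;green;orange} fruit ; cup of {tea;coffee;juice} } and {nice;fresh} {sandwich;burger}.';
print "$_\n" for trace_passes($input);
```

For this input the first pass resolves the four innermost groups, the second pass resolves the outer group that is now free of braces, and the third pass finds nothing and ends the loop. The chosen alternatives keep their surrounding spaces, so the result may contain doubled spaces.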
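A TXR solution. There are many ways to approach this.
Suppose we are reading the data from standard input. How do we read it in records that are delimited not by the usual newlines but by the brace-choice patterns? We do that by creating a record adapter object over the standard input stream. The third argument of the record-adapter function is a Boolean indicating that we want to keep the terminating delimiter (the part that matches the record-delimiting regular expression).
So if the data looks like foo bar {bra;ces} xyzzy {a;b;c} d\n, it is turned into these records: foo bar {bra;ces}, xyzzy {a;b;c} and d\n.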
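For readers who follow the Perl answers more easily, the record splitting described above can be approximated in Perl. This is only a rough analogue written for illustration, not TXR's record-adapter, and the name brace_records is hypothetical. It assumes the same flat, non-nested groups matched by {[\w;]+}.

```perl
use strict;
use warnings;

# Rough Perl analogue (an assumption, not TXR): cut the text into records
# that each end with a brace group, keeping the group, plus one final record
# for whatever trails the last group.
sub brace_records {
    my ($text) = @_;
    return $text =~ / .*? \{ [\w;]+ \} | .+ /xsg;
}

my @records = brace_records("foo bar {bra;ces} xyzzy {a;b;c} d\n");
print "[$_]" for @records;
```

Whitespace between groups stays attached to the start of the following record, so concatenating the records reproduces the input exactly.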
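We then process these records with the extraction language as if they were lines of text. They fall into two patterns: lines that end with a brace pattern and lines that do not. The latter are simply echoed. The former are processed as the random brace substitution requires.
We also initialize *random-state* so that the PRNG is seeded to produce a different pseudo-random sequence on each run. If make-random-state is given no arguments, it creates a random state object that is initialized from system parameters such as the process ID and the system time: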
@(do (set *random-state* (make-random-state)))
@(next @(record-adapter #/{[\w;]+}/ *stdin* t))
@(repeat)
@ (cases)
@*text{@switch}
@ (do (put-string `@text@(first (shuffle (split-str switch ";")))`))
@ (or)
@text
@ (do (put-string text))
@ (end)
@(end)
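The Perl answers above need no explicit seeding step, because Perl calls srand implicitly the first time rand is used. Seeding by hand is still handy when a reproducible choice is wanted, for example in a test. A small sketch, where pick_sequence is a hypothetical helper and the seed 42 is arbitrary:

```perl
use strict;
use warnings;

# Hypothetical helper: reseed the PRNG, then draw five random alternatives.
sub pick_sequence {
    my ($seed, @alts) = @_;
    srand($seed);                      # fixed seed gives a repeatable sequence
    my @picked;
    push @picked, $alts[rand @alts] for 1 .. 5;
    return @picked;
}

my @first  = pick_sequence(42, qw(red green orange));
my @second = pick_sequence(42, qw(red green orange));
print "@first\n@second\n";             # both lines show the same choices
```

Which words appear depends on the Perl build, but within one run the two lines match.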
Test runs:
$ cat data
I have {red;green;orange} fruit and cup of {tea;coffee;juice}.
$ txr rndchoose.txr < data
I have red fruit and cup of tea.
$ txr rndchoose.txr < data
I have orange fruit and cup of tea.
$ txr rndchoose.txr < data
I have green fruit and cup of coffee.