用于在 Perl 中捕获 vcard 组的正则表达式
RegEx for capturing vcard groups in Perl
我大学这学期一直在学习句法和语义,正则表达式经常参与其中。作为一种锻炼方式,我发现了可以应用正则表达式的不同场景。考虑到 VCards 是其中之一,我一直无法指定一些东西来对 BEGIN:VCARD
和 END:VCARD
之间的所有内容进行分组
请注意,.vcf 文件使用换行符
我对此的最佳模式如下所示:(尽管我尝试了很多变体
BEGIN:VCARD\n([^(END:VCARD)\n]*END:VCARD
所以想法是:"From begin vcard read all that is not END:VCARD, and which ends with a linebreak, until end vcard is encountered"
我使用的是 perl 变体,但使用的是 vala 编程语言。
我知道问题出在我的模式上,但经过长时间的阅读和反复试验,我仍然不太确定为什么测试仪显示它不起作用。
测试数据:
BEGIN:VCARD
VERSION:3.0
N:Doe;John;;;
FN:John Doe
ORG:Example.com Inc.;
TITLE:Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1212
TEL;type=WORK:+1 (617) 555-1234
TEL;type=CELL:+1 781 555 1212
TEL;type=HOME:+1 202 555 1212
NOTE:John Doe has a long and varied history\, being documented on more police files that anyone else. Reports of his death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;;
FN:Jane Doe
ORG:Example.com Inc.;
TITLE:Another Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1213
TEL;type=WORK:+1 (617) 555-1233
TEL;type=CELL:+1 781 555 1213
TEL;type=HOME:+1 202 555 1213
NOTE:Jane Doe has a long and varied history\, being documented on more police files that anyone else. Reports of her death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD
在我最成功的测试中,它标记了从第一个 BEGIN:VCARD
到 END:VCARD
之前的所有内容
这个表达式可能会帮助你做到这一点:
(BEGIN:VCARD([\s\S]*?)END:VCARD)
Perl 测试:
use strict;
my $str = 'BEGIN:VCARD
VERSION:3.0
N:Doe;John;;;
FN:John Doe
ORG:Example.com Inc.;
TITLE:Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1212
TEL;type=WORK:+1 (617) 555-1234
TEL;type=CELL:+1 781 555 1212
TEL;type=HOME:+1 202 555 1212
NOTE:John Doe has a long and varied history\, being documented on more police files that anyone else. Reports of his death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;;
FN:Jane Doe
ORG:Example.com Inc.;
TITLE:Another Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1213
TEL;type=WORK:+1 (617) 555-1233
TEL;type=CELL:+1 781 555 1213
TEL;type=HOME:+1 202 555 1213
NOTE:Jane Doe has a long and varied history\, being documented on more police files that anyone else. Reports of her death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD';
my $regex = qr/(BEGIN:VCARD([\s\S]*?)END:VCARD)/mp;
if ( $str =~ /$regex/g ) {
print "Whole match is ${^MATCH} and its start/end positions can be obtained via $-[0] and $+[0]\n";
# print "Capture Group 1 is and its start/end positions can be obtained via $-[1] and $+[1]\n";
# print "Capture Group 2 is ... and so on\n";
}
# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}
正则表达式
如果这不是您想要的表达方式,您可以 modify/change 您的表达方式 regex101.com。
正则表达式电路
您还可以在 jex.im:
中可视化您的表情
我大学这学期一直在学习句法和语义,正则表达式经常参与其中。作为一种锻炼方式,我发现了可以应用正则表达式的不同场景。考虑到 VCards 是其中之一,我一直无法指定一些东西来对 BEGIN:VCARD
和 END:VCARD
请注意,.vcf 文件使用换行符
我对此的最佳模式如下所示:(尽管我尝试了很多变体
BEGIN:VCARD\n([^(END:VCARD)\n]*END:VCARD
所以想法是:"From begin vcard read all that is not END:VCARD, and which ends with a linebreak, until end vcard is encountered"
我使用的是 perl 变体,但使用的是 vala 编程语言。
我知道问题出在我的模式上,但经过长时间的阅读和反复试验,我仍然不太确定为什么测试仪显示它不起作用。
测试数据:
BEGIN:VCARD
VERSION:3.0
N:Doe;John;;;
FN:John Doe
ORG:Example.com Inc.;
TITLE:Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1212
TEL;type=WORK:+1 (617) 555-1234
TEL;type=CELL:+1 781 555 1212
TEL;type=HOME:+1 202 555 1212
NOTE:John Doe has a long and varied history\, being documented on more police files that anyone else. Reports of his death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;;
FN:Jane Doe
ORG:Example.com Inc.;
TITLE:Another Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1213
TEL;type=WORK:+1 (617) 555-1233
TEL;type=CELL:+1 781 555 1213
TEL;type=HOME:+1 202 555 1213
NOTE:Jane Doe has a long and varied history\, being documented on more police files that anyone else. Reports of her death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD
在我最成功的测试中,它标记了从第一个 BEGIN:VCARD
到 END:VCARD
这个表达式可能会帮助你做到这一点:
(BEGIN:VCARD([\s\S]*?)END:VCARD)
Perl 测试:
use strict;
my $str = 'BEGIN:VCARD
VERSION:3.0
N:Doe;John;;;
FN:John Doe
ORG:Example.com Inc.;
TITLE:Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1212
TEL;type=WORK:+1 (617) 555-1234
TEL;type=CELL:+1 781 555 1212
TEL;type=HOME:+1 202 555 1212
NOTE:John Doe has a long and varied history\, being documented on more police files that anyone else. Reports of his death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD
BEGIN:VCARD
VERSION:3.0
N:Doe;Jane;;;
FN:Jane Doe
ORG:Example.com Inc.;
TITLE:Another Imaginary test person
EMAIL;type=INTERNET;type=WORK;type=pref:johnDoe@example.org
TEL;type=WORK;type=pref:+1 617 555 1213
TEL;type=WORK:+1 (617) 555-1233
TEL;type=CELL:+1 781 555 1213
TEL;type=HOME:+1 202 555 1213
NOTE:Jane Doe has a long and varied history\, being documented on more police files that anyone else. Reports of her death are alas numerous.
CATEGORIES:Work,Test group
X-ABUID:5AD380FD-B2DE-4261-BA99-DE1D1DB52FBE\:ABPerson
END:VCARD';
my $regex = qr/(BEGIN:VCARD([\s\S]*?)END:VCARD)/mp;
if ( $str =~ /$regex/g ) {
print "Whole match is ${^MATCH} and its start/end positions can be obtained via $-[0] and $+[0]\n";
# print "Capture Group 1 is and its start/end positions can be obtained via $-[1] and $+[1]\n";
# print "Capture Group 2 is ... and so on\n";
}
# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}
正则表达式
如果这不是您想要的表达方式,您可以 modify/change 您的表达方式 regex101.com。
正则表达式电路
您还可以在 jex.im:
中可视化您的表情