Find German phone numbers in a text

I want to find phone numbers (written the German way) in a text. I have almost finished my JavaScript regular expression for finding them. Here is the regex:

(((((((00|\+)[0-9]{1,3}[ \-/]?)|0)[ ]?[1-9][0-9]{1,4})[ ]?[\-/]?[ ]?)|((((00|\+)[0-9]{1,3}[ ]?\()|\(0)(\)|[ ]?[1-9][0-9]{1,4}\))[ ]?[\-/]?[ ]?))[0-9]{1,7}([ \-/]?[0-9]{1,5}){0,4})

But now I want to add two things:

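  1. It should exclude phone numbers with fewer than 9 digits. I tried adding (?=(.*?\d){9,}) in front of the whole expression. That solution works for 01234, but not for +33 (33) 3) when more digits follow. So what is my mistake?
  2. It should exclude numbers that contain no whitespace (such as +49123123123). How can I achieve that?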

To make my intention clearer, I have prepared a demo: Regex101

Test cases:

+33 (33) 3)44444444444 //the found string has only 5 digits, but it shouldn't be found because of (?=(.*?\d){9,})
+49123123123 //how can I exclude that, because there is no white space in the middle

// this is the part where I test all the other phone numbers, if you are interested in:

//it should match these phone numbers:
testword +49 30 12345-67 testword 
testword+49 (0)30 12345-67
(0)30 12345-67
(0)30 123 234
(0123)30 12345-67
test (021)30 123 234
s030 12345-67 dsd
(030) 12345 55 99testword 
testword   (030) 12345 44
0351 4640-123
09623 12 3 33
09234 1233
+49 123 1 2 12 31
0049 2123 1231
+1 3519 1231
0 30 / 12 34 56
0 30 / 12 34 56
030 / 12 34 56
0123 / 12312 123
testword  0178 1232231
+490 178 1232231
testword +36 (351)4740-991 testword
testword +36(351) 4740-991 testword
09623 12333 testword

should NOT match (with the reason why it shouldn't match):
+49123123123 //because there is no white space
01781232231 //because there is no white space
123456 //because it doesn't have at least 9 digits and no white space
123.123 //because it has a dot
12-12-12 //because there is no white space and there is more than one dash
12-12 -12-12-12 //because there is more than one dash
1990 - 2000 //because it doesn't have at least 9 digits
1990-2000 //because it doesn't have at least 9 digits and no white space
1990-91 //because it doesn't have at least 9 digits and no white space
123 //because it doesn't have at least 9 digits and no white space
+36 (351) 47(40-991 //because it has more than one left bracket
+36 (33) 3)4444
)40-991 //because it has more than one right bracket
+23+234 +2346 // because it has more than one plus sign
234 234 234 234   234 // because it has more than one white space in a row
123   123123 // because it has more than one white space in a row
01712123123
01234

I suggest using two patterns.
The first one captures runs of allowed characters:

(?!\s)(\+?[0-9 .\/()-]{8,}\d)\b        # optional `+`, 8 or more allowed chr's, digit

Demo
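
Since the question targets JavaScript, here is a rough sketch of how this first step could be run there. The helper name `extractCandidates` and the sample text are made up for illustration; the pattern itself is copied unchanged from above.

```javascript
// Step 1: collect every run of allowed characters that could be a phone number.
// `extractCandidates` is a hypothetical helper name, not part of the original answer.
const CANDIDATE = /(?!\s)(\+?[0-9 .\/()-]{8,}\d)\b/g;

function extractCandidates(text) {
  // matchAll requires the global flag; m[1] is the captured candidate.
  return Array.from(text.matchAll(CANDIDATE), (m) => m[1]);
}

// Illustrative input built from three of the question's test cases.
const sample = "testword +49 30 12345-67 testword\n+49123123123\n01234";
console.log(extractCandidates(sample));
```

On this sample the step returns `+49 30 12345-67` and `+49123123123`, and drops `01234` because it is too short. The pattern is deliberately permissive, so a number without whitespace still comes through here and has to be rejected by the second pattern.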

The second one then checks your conditions:

^(?=.*\h)                              # has at least one whitespace
(?!.*\h\h)                             # does not see double whitespaces
(?!.*-.*-)                             # does not see more than one `-`
(?=(?:.*\d){9,})                       # has at least 9 digits
([^(\r\n]*\(?[^)\r\n]+\)?[^()\r\n]+)$  # sees optional proper parentheses

Demo
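
A minimal JavaScript sketch of this second step follows. One caveat: `\h` (horizontal whitespace) is a PCRE shorthand that JavaScript regular expressions do not support, so the sketch substitutes `[ \t]`. That substitution is my assumption and is narrower than `\h`. The helper name `meetsConditions` is also hypothetical.

```javascript
// Step 2: test ONE candidate string (as captured by the first pattern).
// Assumption: PCRE's \h is replaced by [ \t], because JavaScript has no \h.
// `meetsConditions` is a hypothetical helper name, not part of the original answer.
const CONDITIONS = new RegExp(
  String.raw`^(?=.*[ \t])` +          // has at least one space or tab
  String.raw`(?!.*[ \t][ \t])` +      // no two whitespace characters in a row
  String.raw`(?!.*-.*-)` +            // no more than one `-`
  String.raw`(?=(?:.*\d){9,})` +      // has at least 9 digits
  String.raw`([^(\r\n]*\(?[^)\r\n]+\)?[^()\r\n]+)$` // optional parentheses, as in the original pattern
);

function meetsConditions(candidate) {
  return CONDITIONS.test(candidate);
}

console.log(meetsConditions("+49 30 12345-67")); // passes all conditions
console.log(meetsConditions("+49123123123"));    // rejected: no whitespace
console.log(meetsConditions("1990 - 2000"));     // rejected: only 8 digits
```

Because each candidate is tested on its own between `^` and `$`, the digit-count lookahead can only see digits inside that candidate. In the question's attempt the lookahead sat in front of an unanchored pattern, so `.*?` could presumably reach digits further along the line. That would explain why `+33 (33) 3)44444444444` was still found.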