How do I find 10-digit numbers in a text by their first three digits? (Python)

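I need to find all 10-digit numbers in a text that start with one of a given set of digit prefixes. For example:

a_string = "Some text 6401104219 and 6401104202 and 2201104202"

matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]

Expected result: 6401104219, 6401104202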

You can use a regular expression together with str.startswith:

import re

result = [s for s in re.findall(r"\d{10}", a_string) if any(map(s.startswith, matches))]
# ['6401104219', '6401104202']
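
As a side note, str.startswith also accepts a tuple of prefixes, so the any(map(...)) construct can be dropped. A minimal sketch using the question's data (the name prefixes is only illustrative):

```python
import re

a_string = "Some text 6401104219 and 6401104202 and 2201104202"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]

# str.startswith takes a str or a tuple of str; passing a list raises TypeError.
prefixes = tuple(matches)

result = [s for s in re.findall(r"\d{10}", a_string) if s.startswith(prefixes)]
print(result)
# ['6401104219', '6401104202']
```

This works for prefixes of any length, not only three digits.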

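If you know that every prefix is exactly 3 digits long, you can do better by turning the prefixes into a set and testing the first three characters directly: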

matches = set(matches)

result = [s for s in re.findall(r"\d{10}", a_string) if s[:3] in matches]

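Note that r"\d{10}" also matches the first 10 digits of a longer number. If you want to exclude those, change the regular expression to r"\b(\d{10})\b".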
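
To illustrate the difference the word boundaries make, here is a small sketch. The sample text is made up for the demonstration; its second number has 12 digits:

```python
import re

# Hypothetical sample: one 10-digit number and one 12-digit number.
text = "6401104219 and 640110420212"
matches = {"640"}

# Without boundaries, the first 10 digits of the 12-digit number are picked up too.
without_boundary = [s for s in re.findall(r"\d{10}", text) if s[:3] in matches]

# With \b on both sides, only standalone 10-digit runs are matched.
with_boundary = [s for s in re.findall(r"\b\d{10}\b", text) if s[:3] in matches]

print(without_boundary)  # ['6401104219', '6401104202']
print(with_boundary)     # ['6401104219']
```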

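You can use a regular expression.

  • Find all the 10-digit numbers
  • Keep only the numbers that start with one of the elements of the matches list

Code: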

import re

a_string = "Some text 6401104219 and 6401104202 and 2201104202"

matches = ["240", "880", "898", "910", "920",
           "960", "209", "309", "409", "471", "640"]

match = re.findall(r'\d{10}', a_string)  # find all the 10-digit numbers

# keep only the numbers that start with one of the elements of matches
ans = [i for i in match if any(map(i.startswith, matches))]
# OR, if every prefix is exactly 3 characters long, simply check the first three characters:
# ans = [i for i in match if i[:3] in matches]
print(ans)
# ['6401104219', '6401104202'] 

You can do this directly with re, by building the prefixes into the pattern:

import re
a_string = "Some text 6401104219 and 6401104202 and 2201104202 and    640110420212"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]
result = re.findall(r"\b(?:" + r"|".join(matches)+r")\d{7}\b", a_string)

print(result)
# ['6401104219', '6401104202']
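
One caveat with \b: letters and underscores count as word characters, so a number glued to them (for example "id_6401104219") is not matched. If that matters, a possible variation is to use digit lookarounds instead. The sketch below is only one way to do it; the function name find_prefixed_numbers and its parameters are made up for this example, and it assumes no prefix is longer than total_len:

```python
import re

def find_prefixed_numbers(text, prefixes, total_len=10):
    """Return runs of exactly total_len digits that start with one of prefixes."""
    # Each alternative is an escaped prefix followed by the remaining digit count.
    alternatives = "|".join(
        rf"{re.escape(p)}\d{{{total_len - len(p)}}}" for p in prefixes
    )
    # (?<!\d) and (?!\d) require that no digit touches the match on either side.
    pattern = re.compile(rf"(?<!\d)(?:{alternatives})(?!\d)")
    return pattern.findall(text)

# Hypothetical sample text with numbers attached to letters and underscores.
sample = "id_6401104219, x6401104202y and 640110420212"
print(find_prefixed_numbers(sample, ["640", "471"]))
# ['6401104219', '6401104202']
```

The 12-digit number is still rejected, because the lookahead sees another digit after the tenth one.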
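
Without regular expressions, you can split the text on whitespace and check each token:

a_string = "Some text 6401104219 and 6401104202 and 2201104202 and    640110420212"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]

# split() with no argument collapses runs of whitespace
for token in a_string.split():
    # isdigit() rejects 10-character tokens that are not numbers
    if len(token) == 10 and token.isdigit() and token.startswith(tuple(matches)):
        print(token)
# 6401104219
# 6401104202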