How do I find 10-digit numbers in a text by their first three digits? Python
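I need to find all 10-digit numbers in a text that start with one of a given set of digit sequences. For example:
a_string = "Some text 6401104219 and 6401104202 and 2201104202"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]
The expected result is: 6401104219, 6401104202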
You can use a regular expression together with str.startswith:
import re
result = [s for s in re.findall(r"\d{10}", a_string) if any(map(s.startswith, matches))]
# ['6401104219', '6401104202']
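If you know that every prefix is exactly 3 digits long, you can do better by turning the prefixes into a set and checking the first three characters: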
matches = set(matches)
result = [s for s in re.findall(r"\d{10}", a_string) if s[:3] in matches]
If you want to exclude 10-digit prefixes of longer numbers, change the regular expression to r"\b(\d{10})\b".
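To make the effect of the word boundary concrete, here is a small sketch comparing the unanchored and anchored patterns. The sample text with a 12-digit number, the names text and prefixes, and the lookaround variant are illustrative assumptions, not part of the answer above. Note that \b treats letters and underscores as word characters, so digit lookarounds may be a safer choice when a number can be glued to such characters.

```python
import re

text = "Some text 6401104219 and 6401104202 and 2201104202 and 640110420212"
prefixes = {"240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"}

# Unanchored: the first 10 digits of the 12-digit run 640110420212 also match.
loose = [s for s in re.findall(r"\d{10}", text) if s[:3] in prefixes]

# Anchored with \b: only standalone 10-digit runs match.
strict = [s for s in re.findall(r"\b\d{10}\b", text) if s[:3] in prefixes]

# Digit lookarounds: reject neighbouring digits only, so a number glued to
# letters or underscores is still found (here \b would find nothing).
glued = re.findall(r"(?<!\d)\d{10}(?!\d)", "id_6401104219x")

print(loose)   # ['6401104219', '6401104202', '6401104202']
print(strict)  # ['6401104219', '6401104202']
print(glued)   # ['6401104219']
```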
You can use a regular expression.
- Find all the 10-digit numbers
- Keep only the numbers that start with one of the elements in matches
Code:
import re
a_string = "Some text 6401104219 and 6401104202 and 2201104202"
matches = ["240", "880", "898", "910", "920",
           "960", "209", "309", "409", "471", "640"]
match = re.findall(r'\d{10}', a_string)  # find all the 10-digit numbers
# keep only the numbers that start with one of the elements in matches
ans = [i for i in match if any(map(i.startswith, matches))]
# OR
# ans = [i for i in match if i[:3] in matches]  # if every prefix has length 3, a simple membership check is enough
print(ans)
# ['6401104219', '6401104202']
You can do this directly with re by building the prefixes into the pattern:
import re
a_string = "Some text 6401104219 and 6401104202 and 2201104202 and 640110420212"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]
result = re.findall(r"\b(?:" + r"|".join(matches)+r")\d{7}\b", a_string)
print(result)
# ['6401104219', '6401104202']
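The single-pattern approach above assumes every prefix is 3 digits long (hence the fixed \d{7}). A possible generalisation, sketched below, computes the number of trailing digits per prefix and escapes each prefix with re.escape. The helper name build_pattern and its total_length parameter are hypothetical, and the sketch assumes no prefix is longer than total_length.

```python
import re

def build_pattern(prefixes, total_length=10):
    """Compile a regex matching standalone digit runs of total_length
    that start with any of the given prefixes (hypothetical helper)."""
    alternatives = [
        # Each prefix is followed by exactly as many digits as it still needs.
        re.escape(p) + r"\d{" + str(total_length - len(p)) + "}"
        for p in prefixes
    ]
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b")

a_string = "Some text 6401104219 and 6401104202 and 2201104202 and 640110420212"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]

pattern = build_pattern(matches)
print(pattern.findall(a_string))
# ['6401104219', '6401104202']
```

Compiling the pattern once is mainly useful when the same prefix list is applied to many texts.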
a_string = "Some text 6401104219 and 6401104202 and 2201104202 and 640110420212"
matches = ["240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640"]
a_string_list = a_string.split(' ')
for i in a_string_list:
    for j in matches:
        if i.startswith(j) and len(i) == 10:
            print(i)
            break
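The split-based loop above only checks the length and the prefix of each token, so a token such as "640abcdefg" would also be printed. A more compact sketch under the same whitespace-splitting assumption adds a digit check and passes all prefixes to str.startswith as a tuple. The names text, prefixes and result are illustrative, and tokens with attached punctuation (for example "6401104219,") are still missed by this approach.

```python
text = "Some text 6401104219 and 6401104202 and 2201104202 and 640110420212 640abcdefg"
prefixes = ("240", "880", "898", "910", "920", "960", "209", "309", "409", "471", "640")

result = [
    token
    for token in text.split()
    # Exactly 10 ASCII digits, starting with any of the prefixes.
    if len(token) == 10
    and token.isascii()
    and token.isdigit()
    and token.startswith(prefixes)  # str.startswith accepts a tuple of prefixes
]

print(result)
# ['6401104219', '6401104202']
```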