Python re: How to match any number, even those with commas and decimals?
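I want to match any number, whether it contains a decimal point, commas, or is a plain integer. I tried the pattern below, but my regex fails to match the whole number when it has more than two commas.
Thanks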
import re
string1 = "6,111,123,999 5,450,900 10.32 OCT21 Dec 31, 2019"
num = re.findall(r'\b\d+[.,]*\d+[,]*\d*\b', string1)
Result:
['6,111,123', '999', '5,450,900', '10.32', '31', '2019']
Desired Outcome --> ['6,111,123,999', '5,450,900', '10.32', '31', '2019']
Try (?:\b[\d]+[.,]?\b)+
It matches any number, whether it contains a decimal point, commas, or is a plain integer.
Code:
import re
string1= "6,111,123,999 5,450,900 10.32 OCT21 Dec 31, 2019"
num = re.findall(r'(?:\b[\d]+[.,]?\b)+', string1)
print(num)
Output:
['6,111,123,999', '5,450,900', '10.32', '31', '2019']
Let me know if this works for you.
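For what it's worth, here is a small side-by-side sketch of why the original attempt stops early. It assumes the last term of the question's pattern was meant to be \d* (an assumption on my part), and the variable names are only for illustration. That pattern spells out a fixed number of digit runs, so it can absorb at most two separators, while the repeated group accepts any number of runs.

```python
import re

text = "6,111,123,999 5,450,900 10.32 OCT21 Dec 31, 2019"

# Question's attempt (assuming the last term was meant to be \d*):
# at most three digit runs, so at most two separators per match.
fixed_runs = re.compile(r'\b\d+[.,]*\d+[,]*\d*\b')

# Repeated group: any number of digit runs, each optionally
# followed by a single "." or ",".
repeated_runs = re.compile(r'(?:\b\d+[.,]?\b)+')

fixed_result = fixed_runs.findall(text)
repeated_result = repeated_runs.findall(text)

print(fixed_result)
print(repeated_result)
```

The first list splits 6,111,123,999 after the second comma; the second keeps it whole.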
Matching all numbers
You can use \d(?:[\d,.]*\d+)?
string1= "6,111,123,999 5,450,900 10.32 OCT21 Dec 31, 2019 1"
import re
re.findall(r'\d(?:[\d,.]*\d+)?', string1)
Output: ['6,111,123,999', '5,450,900', '10.32', '21', '31', '2019', '1']
Matching numbers only as standalone words
Use \b[\d,.]*\d+\b:
string1= "6,111,123,999 5,450,900 10.32 OCT21 Dec 31, 2019 1"
import re
re.findall(r'\b[\d,.]*\d+\b', string1)
Output: ['6,111,123,999', '5,450,900', '10.32', '31', '2019', '1']
Edit: matching only numbers delimited by whitespace, the start or end of the string, or a trailing comma
string1= "6,111,123,999 5,450,900 10.32 1a2 1-2 OCT21 Dec 31, 2019 1"
import re
re.findall(r'(?:(?<=^)|(?<=\s))[\d,.]*\d+(?=$|\s|,)', string1)
Output: ['6,111,123,999', '5,450,900', '10.32', '31', '2019', '1']
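To compare the three patterns above side by side, a rough sketch like this runs each one over the same test string. The dictionary labels are just names made up for illustration.

```python
import re

text = "6,111,123,999 5,450,900 10.32 1a2 1-2 OCT21 Dec 31, 2019 1"

# Labels are illustrative only; the patterns are the three shown above.
patterns = {
    "any digits": r'\d(?:[\d,.]*\d+)?',
    "word-bounded": r'\b[\d,.]*\d+\b',
    "delimited": r'(?:(?<=^)|(?<=\s))[\d,.]*\d+(?=$|\s|,)',
}

# Run every pattern over the same text and collect the matches.
results = {label: re.findall(pattern, text) for label, pattern in patterns.items()}

for label, found in results.items():
    print(f"{label:>12}: {found}")
```

The first pattern also picks up the digits inside 1a2, 1-2 and OCT21. The word-boundary version drops 1a2 and OCT21 but still splits 1-2, since - is not a word character. Only the lookaround version rejects all three.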
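Here is my quick fix for the result you want. It is a mix of regex (which I find produces a lot of code noise) and some simple list manipulation. Not very comprehensive, but it does the job: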
import re
string_one = "6,111,123,999 5,450,900 10.32 OCT21 Dec 31, 2019"
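# Split on whitespace, strip trailing commas, keep the leading run of digits, commas and dots
num = [re.findall(r'^[0-9,.]+', el.rstrip(',')) for el in string_one.split()]
# Flatten the per-token match lists into a single list
result = [item for sublist in num for item in sublist]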
print(result)
# ['6,111,123,999', '5,450,900', '10.32', '31', '2019']
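One limitation of the snippet above is that a token such as 1a2 would still yield its leading 1. As a hedged variation on the same split-then-match idea, the sketch below uses re.fullmatch so that a token is kept only when the whole token looks like a number. The names NUMBER_TOKEN and extract_number_tokens are hypothetical, and the extra 1a2 token in the test string is mine.

```python
import re

# Hypothetical helper pattern: a digit followed by any mix of digits, commas and dots.
NUMBER_TOKEN = re.compile(r'[0-9][0-9,.]*')

def extract_number_tokens(text):
    """Return whitespace-separated tokens that consist entirely of digits, commas and dots."""
    # Strip trailing commas first so that "31," is treated as "31".
    tokens = (token.rstrip(',') for token in text.split())
    # fullmatch rejects tokens with any other character, e.g. "1a2" or "OCT21".
    return [token for token in tokens if NUMBER_TOKEN.fullmatch(token)]

tokens_found = extract_number_tokens(
    "6,111,123,999 5,450,900 10.32 1a2 OCT21 Dec 31, 2019"
)
print(tokens_found)
```

This does not check that the commas sit at thousands positions; it only filters out tokens containing other characters.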