Extract All Numbers with Decimals from String

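I'm trying to parse lines from a file, specifically to extract any numbers that contain a decimal point, since those are the dollar values. So far I have the following: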

import re

sample_text = " JAN 01 19 SOME OTHER STRING         .25    1.56           12,345.67"
print(re.findall(r"\d+\.\d+", re.sub(",", "", sample_text)))  # Find numbers with decimals in them
print(len(re.findall(r"\d+\.\d+", re.sub(",", "", sample_text))))

The output of the above is:

['1.56', '12345.67']
2

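So ".25" is ignored, I assume because it has no leading zero. When I add a leading zero it seems to work, but the files I'm reading are very large and there are many of them, so I don't want to add a leading zero to every decimal that lacks one across all of the files: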

sample_text = " JAN 01 19 SOME OTHER STRING         0.25    1.56           12,345.67"
print(re.findall(r"\d+\.\d+", re.sub(",", "", sample_text)))  # Find numbers with decimals in them
print(len(re.findall(r"\d+\.\d+", re.sub(",", "", sample_text))))

Output:

['0.25', '1.56', '12345.67']
3

I did try the following to add a leading zero to the decimals that don't have one, but it didn't give me the result I wanted:

sample_text = re.sub(",", "", sample_text)
print(sample_text)
sample_text = re.sub(" .", "0.", sample_text)
print(sample_text)
print(re.findall(r"\d+\.\d+", re.sub(",", "", sample_text)))  # Find numbers with decimals in them
print(len(re.findall(r"\d+\.\d+", re.sub(",", "", sample_text))))

Output:

 JAN 01 19 SOME OTHER STRING         .25    1.56           12345.67
0.AN0.10.90.OME0.THER0.TRING0.0.0.0.0.250.0.1.560.0.0.0.0.0.2345.67
['0.10', '0.0', '0.0', '0.250', '0.1', '560.0', '0.0', '0.0', '2345.67']
9

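If the digits before the point may be missing, use * instead of + before the dot, so that zero or more leading digits are matched: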

out = re.findall(r"\d*\.\d+", re.sub(",", "", sample_text))
print(out)

Output:

['.25', '1.56', '12345.67']
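
For completeness, here is a minimal self-contained sketch of the same idea, assuming you want the matches as numeric values rather than strings. The names DECIMAL_RE and extract_decimals are hypothetical, and converting to Decimal is my assumption about what you would do with dollar amounts. The last two lines also show why the " ." substitution in the question misbehaved: an unescaped dot matches any character, so every space plus the character after it was replaced. Escaping the dot and requiring that no digit precedes it pads only the bare decimals.

```python
import re
from decimal import Decimal

# Hypothetical name: the answer's pattern, compiled once for reuse across many lines.
DECIMAL_RE = re.compile(r"\d*\.\d+")

def extract_decimals(line):
    """Return every decimal number in `line` as a Decimal (hypothetical helper)."""
    cleaned = line.replace(",", "")  # drop thousands separators first
    return [Decimal(match) for match in DECIMAL_RE.findall(cleaned)]

sample_text = " JAN 01 19 SOME OTHER STRING         .25    1.56           12,345.67"
amounts = extract_decimals(sample_text)
print(amounts)  # [Decimal('0.25'), Decimal('1.56'), Decimal('12345.67')]

# Alternative: pad bare decimals with a leading zero.
# \. is a literal dot; (?<!\d) means "not preceded by a digit", (?=\d) means "followed by a digit".
padded = re.sub(r"(?<!\d)\.(?=\d)", "0.", sample_text.replace(",", ""))
print(re.findall(r"\d+\.\d+", padded))  # ['0.25', '1.56', '12345.67']
```

Either approach works line by line, so nothing in the files themselves has to be edited.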