How to replace "$10" with "10 dollars" using regex?

I have some phrases like the following:

This is not my spending '$10', this is companys spending: '$250 million' and this is some other figure: '$200,000'.

I want to remove the dollar sign and add "dollars" at the end of each amount, like this:

This is not my spending '10 dollars', this is companys spending: '250 million dollars' and this is some other figure: '200000 dollars'.

I currently have a regex that matches, ([£$€][\s\d,\d]+(|million|billion|trillion)), but I can't get the replacement part right.

How can I do this?

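If all of your strings start with "$", you don't need a regex. Just slice from the second character onward with "[1:]" and append " dollars". For example, if your string is stored in a variable named a:

a[1:] + " dollars"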
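
As a minimal sketch of the slicing approach, assuming the input may or may not start with "$" (the helper name strip_dollar is hypothetical and not part of the answer):

```python
def strip_dollar(amount: str) -> str:
    """Drop a leading "$" and append " dollars"; leave other strings untouched."""
    # Hypothetical helper: str.startswith guards against inputs without "$",
    # which a bare amount[1:] would silently truncate.
    if not amount.startswith("$"):
        return amount
    return amount[1:] + " dollars"


print(strip_dollar("$10"))           # 10 dollars
print(strip_dollar("$250 million"))  # 250 million dollars
```

This only handles a string that is exactly one amount; it does not find amounts embedded in a longer sentence.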

Just an example using re.sub:

import re

t1 = "$10"
t2 = "$250 million"
t3 = "$200,000"

sub_pattern = r"\$|,"  # Look for dollar signs or commas
tail = " dollars"
print(re.sub(sub_pattern, "", t1) + tail)  # 10 dollars
print(re.sub(sub_pattern, "", t2) + tail)  # 250 million dollars
print(re.sub(sub_pattern, "", t3) + tail)  # 200000 dollars

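Since your regex also includes the pound and euro symbols, I assume that not all amounts start with $. In that case you can use re.sub with a callback function to determine which currency name to use. This also works when the amount appears in the middle of the text.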

import re
p = r"([£$€])\s?([,\d]+(?: million| billion| trillion|))"
d = {"$": "dollars", "£": "pounds", "€": "euros"}

text = "I have $10 and £3 million and €100,000 trillion"
print(re.sub(p, lambda m: f"{m.group(2)} {d[m.group(1)]}", text))
# I have 10 dollars and 3 million pounds and 100,000 trillion euros

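Also note some minor changes to the regex: I put the currency symbol in its own group so it can be accessed later, and moved the "empty" suffix to the end of the alternation, since otherwise it would match first and none of the other suffixes would ever be tried. Also, there is no need to put \d twice inside [...], and it is better to move the space into the suffix part.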
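
The output requested in the question also drops the thousands separators ('200000 dollars'), which the callback above keeps. A minimal sketch of that variant, applied to the sentence from the question, might look like this (the names PATTERN, NAMES and spell_out are illustrative assumptions, not from the answer):

```python
import re

# Same pattern as in the answer: currency symbol, optional space,
# digits/commas, then an optional scale word (empty alternative last).
PATTERN = re.compile(r"([£$€])\s?([,\d]+(?: million| billion| trillion|))")
NAMES = {"$": "dollars", "£": "pounds", "€": "euros"}


def spell_out(match: re.Match) -> str:
    # Strip thousands separators so "200,000" becomes "200000".
    amount = match.group(2).replace(",", "")
    return f"{amount} {NAMES[match.group(1)]}"


text = ("This is not my spending '$10', this is companys spending: "
        "'$250 million' and this is some other figure: '$200,000'.")
result = PATTERN.sub(spell_out, text)
print(result)
```

Using a named function instead of a lambda leaves room for the extra comma handling without crowding the re.sub call.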

You can use the following function to achieve what you describe.

import re

def adjust_dollars(text):
  text = re.sub(r'^\$', '', text)  # drop a leading dollar sign
  text = re.sub(r'(.$)', r'\1 dollars', text)  # keep the last character, then append " dollars"
  return text

Test run:

words = ['$10', '$250 million', '$200,000']
result = map(adjust_dollars, words)
print(list(result))

Output:

['10 dollars', '250 million dollars', '200,000 dollars']