How to replace "$10" with "10 dollars" using regex?
I have some phrases like the following:
This is not my spending '$10', this is companys spending: '$250 million' and this is some other figure: '$200,000'.
I want to remove the dollar signs and add "dollars" at the end of each amount, like this:
This is not my spending '10 dollars', this is companys spending: '250 million dollars' and this is some other figure: '200000 dollars'.
I currently have a regex that matches, ([£$€][\s\d,\d]+(|million|billion|trillion)), but I can't get the replacement part right.
How do I do this?
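If all of your strings start with "$", you don't need a regex. Just slice from the second character onward with [1:] and append " dollars". For example, if your string is stored in a variable named a:
a[1:] + " dollars"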
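A minimal sketch of that slicing approach, wrapped in a hypothetical helper (the name strip_dollar is an assumption for illustration, not part of the answer) that checks for the leading "$" first, so strings without it are left untouched:

```python
def strip_dollar(a: str) -> str:
    """Drop a leading "$" and append " dollars"; return other strings unchanged."""
    if a.startswith("$"):
        return a[1:] + " dollars"
    return a


print(strip_dollar("$10"))           # 10 dollars
print(strip_dollar("$250 million"))  # 250 million dollars
print(strip_dollar("no currency"))   # no currency
```

Note that slicing does nothing about thousands separators, so "$200,000" would come out as "200,000 dollars".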
Just an example using re.sub:
import re
t1 = "$10"
t2 = "$250 million"
t3 = "$200,000"
sub_pattern = r"\$|,"  # Look for dollar signs or commas
tail = " dollars"
re.sub(sub_pattern, "", t1) + tail  # -> '10 dollars'
re.sub(sub_pattern, "", t2) + tail  # -> '250 million dollars'
re.sub(sub_pattern, "", t3) + tail  # -> '200000 dollars'
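Since your regex also includes the symbols for pounds and euros, I assume that not all of the amounts start with $. In that case you can use re.sub with a callback function to decide which currency name to use. This also works if the amount appears in the middle of the text.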
import re
p = r"([£$€])\s?([,\d]+(?: million| billion| trillion|))"
d = {"$": "dollars", "£": "pounds", "€": "euros"}
text = "I have $10 and £3 million and €100,000 trillion"
print(re.sub(p, lambda m: f"{m.group(2)} {d[m.group(1)]}", text))
# I have 10 dollars and 3 million pounds and 100,000 trillion euros
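Also note a few small changes to the regex: I put the currency symbol in a group so it can be accessed later, and I moved the "empty" suffix to the end of the alternation, because otherwise the empty alternative would match first and the other suffixes would never be tried. There is also no need to put \d in the [...] twice, and it is better to move the space into the suffix part.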
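To see why the position of the empty alternative matters, here is a small sketch comparing the two orderings on one input. The variable names empty_first and empty_last are made up for this illustration. Python's re tries alternatives left to right and keeps the first one that lets the overall match succeed:

```python
import re

text = "$250 million"

# Empty alternative listed first: it succeeds immediately,
# so " million" is never tried and is left out of the match.
empty_first = r"([£$€])([,\d]+)(| million| billion| trillion)"

# Empty alternative listed last: the real suffixes are tried first.
empty_last = r"([£$€])([,\d]+)( million| billion| trillion|)"

first = re.match(empty_first, text).group(0)
second = re.match(empty_last, text).group(0)

print(repr(first))   # '$250'
print(repr(second))  # '$250 million'
```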
You can use the following function to achieve what you describe.
import re
def adjust_dollars(text):
    text = re.sub(r'^\$', '', text)  # drop a leading dollar sign
    text = re.sub(r'(.$)', r'\1 dollars', text)  # keep the last character and append " dollars"
    return text
Test run:
words = ['$10', '$250 million', '$200,000']
result = map(adjust_dollars, words)
print(list(result))
Output:
['10 dollars', '250 million dollars', '200,000 dollars']
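The question asks for the replacement inside a full sentence, with the thousands separators removed as well. One way to sketch that is to combine the callback idea with comma stripping. The names PATTERN, UNITS and to_words are hypothetical, and the sketch assumes amounts are written as a symbol, digits with optional commas, and an optional " million", " billion" or " trillion" suffix:

```python
import re

# Hypothetical names for illustration only.
PATTERN = re.compile(r"([£$€])\s?([\d,]+(?: million| billion| trillion|))")
UNITS = {"$": "dollars", "£": "pounds", "€": "euros"}


def to_words(match: re.Match) -> str:
    """Build '<amount> <currency name>' from one matched amount."""
    amount = match.group(2).replace(",", "")  # "200,000" -> "200000"
    return f"{amount} {UNITS[match.group(1)]}"


sentence = (
    "This is not my spending '$10', this is companys spending: "
    "'$250 million' and this is some other figure: '$200,000'."
)
result = PATTERN.sub(to_words, sentence)
print(result)
# This is not my spending '10 dollars', this is companys spending: '250 million dollars' and this is some other figure: '200000 dollars'.
```

Because [\d,]+ also accepts a trailing comma, an amount followed directly by a comma (for example "$10, and") would lose that comma. If that matters for your input, tighten the pattern.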