How can I extract part (a number) of a string that contains special characters?

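I hope you can help me extract a number from a string.

I receive the string in one of two forms:

  1. y = just the required number, "60"
  2. x = the required number inside a string with special characters, possibly preceded by another number

The example below lists the possible variants of x (x1 - x7).

I need the last number in each string:

=> "60" in this case (exception: x3 = 50)

I tried a regex split combined with strip, but unfortunately it does not work for all variants.

What do I have to change to make it work for all variants?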

import re

b=[]
y="60"

# x-x2: split and strip function is working => b="60"
x = "5: 60 USD"
x1= "5.  USD"
x2= "5- 60 USD"

# x3-x7: split and strip function is NOT working
x3 ="5: 50 USD"
x4 ="5 : 60 USD"
x5 ="5 . 60 USD"
x6 ="5 -  USD"
x7 ="5: USD"


a,b = re.split('5: |5. |5-',x)

b = b.upper().strip(' -§$%&€ABCDEFGHIJKLMNOPQRSTUVWXYZ:')

print(b)

# b should be 60 each time (exception: x3 = 50)
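
One likely reason the split fails: in a regular expression an unescaped `.` matches any character, and none of the three alternatives allows whitespace between the leading number and the separator. For a form like "5 : 60 USD", `re.split` therefore returns a single element and unpacking into `a, b` raises a ValueError. Below is a minimal sketch of a more tolerant prefix pattern. The names `PREFIX` and `amount_after_prefix` are hypothetical, and the sketch assumes the separator is always one of `:`, `.` or `-`.

```python
import re

# Hypothetical pattern: a leading number, then one separator (: . or -),
# with optional whitespace around it. Inside a character class the dot
# is matched literally.
PREFIX = re.compile(r"^\s*\d+\s*[:.\-]\s*")

def amount_after_prefix(x):
    """Drop the leading 'number + separator' part, then return the first run of digits."""
    rest = PREFIX.sub("", x, count=1)
    match = re.search(r"\d+", rest)
    return match.group() if match else None

for s in ["60", "5: 60 USD", "5- 60 USD", "5: 50 USD", "5 : 60 USD", "5 . 60 USD"]:
    print(repr(s), "->", amount_after_prefix(s))
```

A bare "60" has no separator, so the prefix pattern does not match and the string is searched as-is.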
import re


x = re.sub('[^0-9][.]{0,1}[^0-9]', " ", x)
x = re.sub('USD', "", x)

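try:
    b = x.split()[1]  # second whitespace-separated token
except IndexError:
    # fewer than two tokens: fall back to everything after the first dot
    b = ".".join(x.split(".")[1:])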

 

Full code:

import re

b=[]
y="60"

# x-x2: split and strip function is working => b="60"
x0 = "5: 60 USD"
x1= "5.  USD"
x2= "5- 60 USD"

# x3-x7: split and strip function is NOT working
x3 ="5: 50 USD"
x4 ="5 : 60 USD"
x5 ="5 . 60 USD"
x6 ="5 -  USD"
x7 ="5:.000 USD"

x_list = [x0,x1,x2,x3,x4,x5,x6,x7]

for x in x_list:

    print ("raw "+x)

    x = re.sub('[^0-9][.]{0,1}[^0-9]', " ", x)

    b = x.split()[1]

    print ("clean "+b)

Output:

raw 5: 60 USD
clean 60
raw 5.  USD
clean 
raw 5- 60 USD
clean 60
raw 5: 50 USD
clean 50
raw 5 : 60 USD
clean 60
raw 5 . 60 USD
clean 60
raw 5 -  USD
clean 60
raw 5:.000 USD
clean 60.000

Your examples may not be exhaustive, but this works for the ones given:

result = int(''.join([ch for ch in x[1:] if ch in '0123456789']))

Or:

result = int(''.join([ch for ch in x[1:] if ch.isdigit()]))
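
Both one-liners skip only the first character, so they assume the leading number is a single digit. With a hypothetical input such as "12: 60 USD" they would join the remaining digits into 260. Since the question asks for the last number in the string, another option is to collect every run of digits and keep the final one. A minimal sketch, where `last_number` is a hypothetical helper name:

```python
import re

def last_number(text):
    """Return the last run of digits in text as a string, or None if there is none."""
    matches = re.findall(r"\d+", text)
    return matches[-1] if matches else None

for s in ["60", "5: 60 USD", "5- 60 USD", "5: 50 USD", "5 : 60 USD", "12: 60 USD"]:
    print(repr(s), "->", last_number(s))
```

Note that `\d+` treats a value like 60.000 as two separate runs, so a pattern that allows a decimal part would be needed if such inputs matter.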