Can anyone help me remove numeric data from a string using re.sub in Python?

Question

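I am processing a text file that contains data like the following. I want to remove only 1 and 0.6271 from the data, not T123.

page_data = "1 0.6271 bacs T123 Biologically Active Substance"
page_data = re.sub(r"", '  ', page_data)  # what pattern should go here?

Desired output: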

bacs T123 Biologically Active Substance

Answer 1

As the comments point out, using re may over-complicate things for you; it isn't really necessary here. If you don't have to use re, a simple try/except statement handles this kind of task.

def removenumeric(string):
    newstr = []
    for word in string.split():
        try:
            float(word)
        except ValueError:
            newstr.append(word)
    return ' '.join(newstr)

Output:

bacs T123 Biologically Active Substance

You can't use .isnumeric() here because it returns False for strings that represent floats. That is why float(word) is needed to get the correct output.
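To make the difference concrete, here is a minimal sketch comparing str.isnumeric() with a float()-based check on tokens from the question. The helper name is_float_like is hypothetical and exists only for illustration. One caveat: float() also accepts strings such as "nan", "inf", and "1e5", so words like those would be dropped by this approach as well.

```python
def is_float_like(token):
    """Return True if float() can parse the token (hypothetical helper)."""
    try:
        float(token)
        return True
    except ValueError:
        return False

tokens = ["1", "0.6271", "bacs", "T123"]

# isnumeric() only accepts strings made up entirely of numeric characters,
# so the decimal point makes "0.6271" fail.
isnumeric_results = [t.isnumeric() for t in tokens]
float_results = [is_float_like(t) for t in tokens]

print(isnumeric_results)  # [True, False, False, False]
print(float_results)      # [True, True, False, False]

# float() also parses special values and scientific notation.
print(is_float_like("nan"), is_float_like("1e5"))  # True True
```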

Answer 2

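I read @gmdev's answer, but I also want to post a regex answer in case you need one.

This regex matches only the standalone floats and integers in the string, so substituting an empty string for the matches removes them: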

(^|\s)([-+]?\d*\.\d+|\d+)

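Python usage:

import re

# Raw string so that \s and \d reach the regex engine unchanged;
# strip() removes the space left behind at the start of the result.
re.sub(r"(^|\s)([-+]?\d*\.\d+|\d+)", '', "1 0.6271 bacs T123 Biologically Active Substance").strip()

Input: 1 0.6271 bacs T123 Biologically Active Substance

Output: bacs T123 Biologically Active Substance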
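The pattern above has a few edge cases: the optional sign applies only to the float branch, so -3 is not matched; a token such as 12abc at the start of the string or after a space loses its leading digits; and removing a match can leave stray whitespace. A stricter variant, offered as a sketch rather than the answer's own method, uses lookarounds so that only whole whitespace-delimited tokens are removed. The names NUMBER_TOKEN and strip_numbers are hypothetical.

```python
import re

# Match a number only when it is a whole token: no non-space character
# directly before it or directly after it. Trailing spaces are consumed too.
NUMBER_TOKEN = re.compile(r"(?<!\S)[-+]?(?:\d*\.\d+|\d+)(?!\S)\s*")

def strip_numbers(text):
    """Remove standalone integers and floats from text (hypothetical helper)."""
    return NUMBER_TOKEN.sub("", text).strip()

result = strip_numbers("1 0.6271 bacs T123 Biologically Active Substance")
print(result)  # bacs T123 Biologically Active Substance

# Signed integers are removed; tokens that merely contain digits are kept.
mixed = strip_numbers("-3 12abc x 5")
print(mixed)  # 12abc x
```

Precompiling the pattern is optional. It mainly helps when the same substitution runs over many lines of a file.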


python

python-re