How to extract data from a list with additional spaces in between them in Python

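The code is trying to extract data from a file (format: group, team, val1, val2). Lines without extra spaces give correct results, but lines that have extra spaces in between give wrong results.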

data = {}
with open('source.txt') as f:
    for line in f:
        print ("this is the line data: ", line)
        
        needed = line.split()[0:2]
        print ("this is what i need: ", needed)

source.txt #-- format: group, team, val1, val2

alpha diehard group 1 54,00.01
bravo nevermindteam 3 500,000.00
charlie team ultimatum 1 27,722.29 (0.45)
charlie team ultimatum 10 252,336,733.383 (2.06)
delta beyond-imagination 2 11 ()
echo double doubt 5 143,299.00 (1)
echo double doubt 8 145,300 (5.01)
falcon revengers 3 0.1234
falcon revengers 5 9.19
lima almost done 6 45.00181 (.9)
romeo ontheway home 12 980

I'm trying to extract only the values that come before val1. #-- group, team

alpha diehard group
bravo nevermindteam
charlie team ultimatum
delta beyond-imagination
echo double doubt
falcon revengers
lima almost done
romeo ontheway home

Try this:

with open('source.txt') as f:
    for line in f:
        # Keep only the purely alphabetic words; numbers and empty strings are dropped.
        new_line = ' '.join(filter(lambda s: s.isalpha(), line.split(' ')))
        print(new_line)

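This code is insensitive to the number of spaces: the empty strings that split(' ') yields for repeated spaces fail isalpha() and are filtered out.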
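One caveat with the isalpha() filter: str.isalpha() returns False for a word containing a hyphen, so beyond-imagination would be dropped from the delta line. A minimal sketch of a variant that tolerates hyphens, assuming hyphenated team names should be kept (alphabetic_words is a made-up helper name, not part of the answer above):

```python
def alphabetic_words(line):
    """Join the words of `line` that are alphabetic once hyphens are ignored."""
    words = line.split()  # split() with no argument collapses runs of whitespace
    return " ".join(w for w in words if w.replace("-", "").isalpha())


print(alphabetic_words("delta beyond-imagination 2 11 ()"))
print(alphabetic_words("echo  double   doubt 5 143,299.00 (1)"))
```

A lone "-" is still rejected, because the empty string left after removing the hyphen is not alphabetic.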

Using a regular expression:

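import re

with open('source.txt', 'r') as f:
    # Strip the trailing run of digits, commas, dots, parentheses, "$" and
    # whitespace from every line. No flags are needed; note that the fourth
    # positional argument of re.sub is count, so re.M must not be passed there.
    text = re.sub(r'[0-9,.()$\s]+\n', '\n', f.read() + '\n')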
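To try the substitution idea without source.txt, here is a small sketch on an in-memory sample. It is a line-anchored variant rather than the exact pattern above: it uses `$` with `re.M` so that each line end is matched, and the names TRAILING_VALUES and strip_values are made up for illustration.

```python
import re

# A few lines copied from source.txt in the question.
sample = """alpha diehard group 1 54,00.01
charlie team ultimatum 1 27,722.29 (0.45)
delta beyond-imagination 2 11 ()
romeo ontheway home 12 980"""

# Trailing run of digits, commas, dots, parentheses, spaces and tabs
# that reaches the end of a line (re.M makes "$" match at every line end).
TRAILING_VALUES = re.compile(r'[0-9,.() \t]+$', flags=re.M)


def strip_values(text):
    """Remove the trailing numeric columns from every line of `text`."""
    return TRAILING_VALUES.sub('', text)


names = strip_values(sample).splitlines()
for name in names:
    print(name)
```

This relies on the group and team words never ending in a digit or a parenthesis, which holds for the sample data but is an assumption in general.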

Here is how I did it: basically iterate over all the words and stop when you hit a number:

with open('source.txt') as f:
    for line in f:
        print("this is the line data: ", line)

        split_line = line.split()
        # Default to the whole line in case no purely numeric word is found.
        stop = len(split_line)
        for i, word in enumerate(split_line):
            if word.isnumeric():
                stop = i
                break

        needed = split_line[:stop]

        print("this is what i need: ", needed)
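The same "stop at the first number" idea can also be written with itertools.takewhile from the standard library. This is only a sketch of an alternative, and leading_words is a hypothetical helper name:

```python
from itertools import takewhile


def leading_words(line):
    """Return the words of `line` that come before the first purely numeric word."""
    return list(takewhile(lambda word: not word.isnumeric(), line.split()))


print(leading_words("alpha diehard group 1 54,00.01"))
print(leading_words("delta beyond-imagination 2 11 ()"))
```

If a line has no purely numeric word, takewhile simply returns every word, and a blank line gives an empty list.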

Use a regular expression.

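import re

with open('source.txt') as f:
    for line in f:
        # Lazily capture everything before the first digit on the line.
        found = re.search(r"(.*?)\d", line)
        if found is None:
            continue  # skip blank lines or lines without any digit
        needed = found.group(1).split()[0:3]
        print(needed)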

Output:

['alpha', 'diehard', 'group']
['bravo', 'nevermindteam']
['charlie', 'team', 'ultimatum']
['charlie', 'team', 'ultimatum']
['delta', 'beyond-imagination']
['echo', 'double', 'doubt']
['echo', 'double', 'doubt']
['falcon', 'revengers']
['falcon', 'revengers']
['lima', 'almost', 'done']
['romeo', 'ontheway', 'home']
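The desired result in the question lists each group/team pair only once, while the output above repeats charlie, echo and falcon. A possible sketch for collapsing the repeats while keeping first-seen order, assuming that is what is wanted (unique_names is a made-up helper name):

```python
import re


def unique_names(lines):
    """Return each distinct 'group team' prefix once, in first-seen order."""
    seen = {}  # dicts preserve insertion order, so this doubles as an ordered set
    for line in lines:
        found = re.search(r"(.*?)\d", line)
        if found is None:
            continue  # no digit on this line, nothing to extract
        name = " ".join(found.group(1).split())
        seen.setdefault(name, None)
    return list(seen)


sample = [
    "charlie team ultimatum 1 27,722.29 (0.45)",
    "charlie team ultimatum 10 252,336,733.383 (2.06)",
    "delta beyond-imagination 2 11 ()",
    "echo double doubt 5 143,299.00 (1)",
    "echo double doubt 8 145,300 (5.01)",
]
for name in unique_names(sample):
    print(name)
```

With a file, the same function could be called as unique_names(f) inside the with block, since a file object iterates over its lines.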