How do I split a list using two different methods in a for loop?

Sorry if I am not asking this correctly, but here is my code:

import re

data1 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG']
data2 = ['TOOK2231515100HG','BOGGOK2221643200GH']

for i in data1:
  splt_1 = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)', i)
  print('data1:', splt_1)

for i in data2:
  splt_2 = re.split(r'(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
  print('data2:', splt_2)

Output:

data1: ['', 'TOOK', '22', 'JAN', '15', '15100', 'HG', '']
data1: ['', 'BOGGOK', '22', 'MAR', '17', '42200', 'HG', '']

data2: ['', 'TOOK', '22315', '15100', 'HG', '']
data2: ['', 'BOGGOK', '22216', '43200', 'GH', '']

What do I want to do?

Given

data = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG', 'TOOK2231515100HG','BOGGOK2221643200GH']

I would like to loop over the data list and split each item using whichever of these two patterns fits:

re.split(r'(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i) or
re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]

P.S.: The output can be in the same or a similar format.

I tried this code:

data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']

for i in data5:
  dk = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
  print(dk)

Result:

['', 'TOOK', '22', 'JAN', '15', '15100', 'HG', None, None, None, None, '']
['', 'BOGGOK', '22', 'MAR', '17', '42200', 'HG', None, None, None, None, '']
['', None, None, None, None, None, None, 'TOOK', '22315', '15100', 'HG', '']
['', None, None, None, None, None, None, 'BOGGOK', '22216', '43200', 'GH', '']

The result I want:

['', 'TOOK', '22', 'JAN', '15', '15100', 'HG', '']
['', 'BOGGOK', '22', 'MAR', '17', '42200', 'HG', '']
['', 'TOOK', '22315', '15100', 'HG', '']
['', 'BOGGOK', '22216', '43200', 'GH', '']

or

['TOOK', '22', 'JAN', '15', '15100', 'HG']
['BOGGOK', '22', 'MAR', '17', '42200', 'HG']
['TOOK', '22315', '15100', 'HG']
['BOGGOK', '22216', '43200', 'GH']

Thank you for taking the time to answer my question. I really appreciate it.

You can remove all the None values in your code:

data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']

for i in data5:
  dk = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
  dk = list(filter(lambda a: a is not None, dk))  # This removes all the None values from your list
  print(dk)
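
If you also want to drop the empty strings at both ends (your second desired format), one option is to pass `None` as the function to `filter`, which keeps only truthy items. This is a small sketch of that variation; the names `PATTERN` and `cleaned` are only illustrative and not from the question.

```python
import re

# Same combined pattern as in the question, stored under an illustrative name.
PATTERN = (
    r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)'
    r'([0-9]{2})([0-9]{5})(HG|GH)'
    r'|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)'
)

data5 = ['TOOK22JAN1515100HG', 'BOGGOK22MAR1742200HG',
         'TOOK2231515100HG', 'BOGGOK2221643200GH']

# filter(None, ...) discards every falsy item, i.e. both None and ''.
cleaned = [list(filter(None, re.split(PATTERN, item))) for item in data5]

for row in cleaned:
    print(row)
# ['TOOK', '22', 'JAN', '15', '15100', 'HG']
# ['BOGGOK', '22', 'MAR', '17', '42200', 'HG']
# ['TOOK', '22315', '15100', 'HG']
# ['BOGGOK', '22216', '43200', 'GH']
```

Note that this would also drop any other empty field, which is fine here because every group in the pattern must match at least one character.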

I have two ideas for you:

import re

data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']

# first approach (ugly): keep the current, simple code, then get rid of Nones
for i in data5:
    dk = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
    dk = [s for s in dk if s is not None]
    print(dk)

# second approach: condition to find out which case holds
for i in data5:
    dk = None
    if re.search(r'JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC', i):
        dk = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)', i)
    else:
        dk = re.split(r'(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
    print(dk)
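
A possible variation on the second approach, in case it helps: instead of testing for a month name first, try each pattern in turn with `re.fullmatch` and take the groups of the first one that matches. This is only a sketch; `PATTERNS` and `split_code` are names I made up for illustration, and it assumes each string should match one pattern in full.

```python
import re

# The two patterns from the question, compiled once (illustrative name).
PATTERNS = [
    re.compile(r'(TOOK|BOGGOK)([0-9]{2})'
               r'(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)'
               r'([0-9]{2})([0-9]{5})(HG|GH)'),
    re.compile(r'(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)'),
]

def split_code(code):
    """Return the captured parts of the first pattern that matches, else None."""
    for pattern in PATTERNS:
        match = pattern.fullmatch(code)
        if match:
            return list(match.groups())
    return None  # no pattern matched the whole string

data5 = ['TOOK22JAN1515100HG', 'BOGGOK22MAR1742200HG',
         'TOOK2231515100HG', 'BOGGOK2221643200GH']

for code in data5:
    print(split_code(code))
# ['TOOK', '22', 'JAN', '15', '15100', 'HG']
# ['BOGGOK', '22', 'MAR', '17', '42200', 'HG']
# ['TOOK', '22315', '15100', 'HG']
# ['BOGGOK', '22216', '43200', 'GH']
```

Because `groups()` only returns the groups of the pattern that matched, there are no None values or empty strings to clean up afterwards.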

Try using the filter function to remove None from the list:

dk = list(filter(lambda x: x is not None, re.split(r'your regular expression here', i)))

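You get None because the pattern has two alternatives, each with its own capturing groups. For any given string only one alternative matches, so the groups of the other alternative do not take part in the match and show up in the list as None.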
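
To see that behaviour in isolation, here is a tiny sketch with a deliberately simplified pattern (`(a)|(b)` is just a stand-in for the two long alternatives, not the pattern from the question):

```python
import re

# Two alternatives, each with its own capturing group.
pattern = re.compile(r'(a)|(b)')

# Only the alternative that matched fills its group; the other one is None.
print(pattern.fullmatch('a').groups())  # ('a', None)
print(pattern.fullmatch('b').groups())  # (None, 'b')

# re.split puts every group of the separator match into the result,
# including groups that did not participate in the match.
parts = re.split(pattern, 'xay')
print(parts)  # ['x', 'a', None, 'y']
```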

First solution:

import re

data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']
for i in range(len(data5)):
    data5[i] = [item for item in (re.split(r'(TOOK|BOGGOK)([0-9]{2})([A-Z]{3})([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', data5[i])) if (item is not None) and len(item)>0]
print(data5)

Second solution:

def resultant_string_list(data5):
    return [([item for item in (re.split(r'(TOOK|BOGGOK)([0-9]{2})([A-Z]{3})([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', data5[i])) if (item is not None) and len(item)>0]) for i in range(len(data5))]

data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']
print(resultant_string_list(data5))

Output of the above code:

[['TOOK', '22', 'JAN', '15', '15100', 'HG'], ['BOGGOK', '22', 'MAR', '17', '42200', 'HG'], ['TOOK', '22315', '15100', 'HG'], ['BOGGOK', '22216', '43200', 'GH']]

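I wrote the code with memory usage in mind, so in the first solution your existing variable is overwritten with the resulting list. For example: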

Before: data5[0] = "TOOK22JAN1515100HG"
After: data5[0] = ['TOOK', '22', 'JAN', '15', '15100', 'HG']