How do I split a list using 2 different methods in a for loop?
Sorry if I'm not asking the question properly. Here is my code:
import re

data1 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG']
data2 = ['TOOK2231515100HG','BOGGOK2221643200GH']
for i in data1:
    splt_1 = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)', i)
    print('data1:', splt_1)
for i in data2:
    splt_2 = re.split(r'(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
    print('data2:', splt_2)
Output:
data1: ['', 'TOOK', '22', 'JAN', '15', '15100', 'HG', '']
data1: ['', 'BOGGOK', '22', 'MAR', '17', '42200', 'HG', '']
data2: ['', 'TOOK', '22315', '15100', 'HG', '']
data2: ['', 'BOGGOK', '22216', '43200', 'GH', '']
What do I want to do?
Given
data = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG', 'TOOK2231515100HG','BOGGOK2221643200GH']
I want to loop over the data list and split each item with whichever of these 2 methods fits it:
re.split(r'(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i) or
re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]
P.S. The output can be in the same or a similar format.
I tried this code:
data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']
for i in data5:
    dk = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
    print(dk)
Result:
['', 'TOOK', '22', 'JAN', '15', '15100', 'HG', None, None, None, None, '']
['', 'BOGGOK', '22', 'MAR', '17', '42200', 'HG', None, None, None, None, '']
['', None, None, None, None, None, None, 'TOOK', '22315', '15100', 'HG', '']
['', None, None, None, None, None, None, 'BOGGOK', '22216', '43200', 'GH', '']
The result I want:
['', 'TOOK', '22', 'JAN', '15', '15100', 'HG', '']
['', 'BOGGOK', '22', 'MAR', '17', '42200', 'HG', '']
['', 'TOOK', '22315', '15100', 'HG', '']
['', 'BOGGOK', '22216', '43200', 'GH', '']
or
['TOOK', '22', 'JAN', '15', '15100', 'HG']
['BOGGOK', '22', 'MAR', '17', '42200', 'HG']
['TOOK', '22315', '15100', 'HG']
['BOGGOK', '22216', '43200', 'GH']
Thank you for taking the time to answer my question. Much appreciated.
You can remove all the None values in your code:
data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']
for i in data5:
    dk = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
    dk = list(filter(lambda a: a is not None, dk))  # This removes all the None values from your list
    print(dk)
I have two ideas for you:
import re
data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']
# first approach (ugly): keep the current, simple code, then get rid of Nones
for i in data5:
    dk = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
    dk = [s for s in dk if s is not None]
    print(dk)
# second approach: condition to find out which case holds
for i in data5:
    dk = None
    if re.search(r'JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC', i):
        dk = re.split(r'(TOOK|BOGGOK)([0-9]{2})(JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC)([0-9]{2})([0-9]{5})(HG|GH)', i)
    else:
        dk = re.split(r'(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', i)
    print(dk)
Try using the filter function to remove None from the list:
dk = list(filter(lambda x: x is not None, re.split(r'your regular expression here', i)))
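You get None because the pattern has two alternatives, each with its own capture groups. re.split returns every capture group, and the groups of the alternative that did not take part in the match go into the list as None.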
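The effect is easier to see on a reduced pattern. The sketch below is only an illustration: the pattern `(A)|(B)` and the input strings are made up for the example and are not from the question.

```python
import re

# Hypothetical reduced pattern: two alternatives, one capture group each.
pattern = r'(A)|(B)'

# re.split returns every capture group for each match; the group belonging
# to the alternative that did not match is reported as None.
first = re.split(pattern, 'xAy')
second = re.split(pattern, 'xBy')

print(first)   # ['x', 'A', None, 'y']
print(second)  # ['x', None, 'B', 'y']
```

The same thing happens in the question's pattern, just with six groups in one alternative and four in the other.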
First solution:
data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']
for i in range(len(data5)):
    data5[i] = [item for item in (re.split(r'(TOOK|BOGGOK)([0-9]{2})([A-Z]{3})([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', data5[i])) if (item is not None) and len(item)>0]
print(data5)
Second solution:
def resultant_string_list(data5):
    return [([item for item in (re.split(r'(TOOK|BOGGOK)([0-9]{2})([A-Z]{3})([0-9]{2})([0-9]{5})(HG|GH)|(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)', data5[i])) if (item is not None) and len(item)>0]) for i in range(len(data5))]
data5 = ['TOOK22JAN1515100HG','BOGGOK22MAR1742200HG','TOOK2231515100HG','BOGGOK2221643200GH']
print(resultant_string_list(data5))
Output of the code above:
[['TOOK', '22', 'JAN', '15', '15100', 'HG'], ['BOGGOK', '22', 'MAR', '17', '42200', 'HG'], ['TOOK', '22315', '15100', 'HG'], ['BOGGOK', '22216', '43200', 'GH']]
I wrote the first solution with memory usage in mind, so each item of your current variable is replaced by its result list.
Example:
Before: data5[0] = "TOOK22JAN1515100HG"
After: data5[0] = ['TOOK', '22', 'JAN', '15', '15100', 'HG']
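Another way to avoid the None entries altogether, sketched here as an alternative to the combined pattern: keep the two patterns separate, try them in order with fullmatch, and return the groups of the first one that matches. The names MONTHS, PATTERNS and split_code are invented for this example, and returning None for an item that fits neither pattern is an assumption about what you would want.

```python
import re

MONTHS = 'JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC'

# The two patterns from the question, kept separate and tried in this order.
PATTERNS = [
    re.compile(r'(TOOK|BOGGOK)([0-9]{2})(' + MONTHS + r')([0-9]{2})([0-9]{5})(HG|GH)'),
    re.compile(r'(TOOK|BOGGOK)([0-9]{5})([0-9]{5})(HG|GH)'),
]


def split_code(code):
    """Return the captured parts of code, or None if no pattern fits (assumed behaviour)."""
    for pattern in PATTERNS:
        match = pattern.fullmatch(code)
        if match:
            # Only the groups of the pattern that matched are returned,
            # so there are no None or empty-string entries to filter out.
            return list(match.groups())
    return None


data = ['TOOK22JAN1515100HG', 'BOGGOK22MAR1742200HG', 'TOOK2231515100HG', 'BOGGOK2221643200GH']
for item in data:
    print(split_code(item))
```

Because fullmatch requires the whole string to fit, an item with extra characters before or after the code is reported as None instead of being partially split.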