Split a list on a keyword which may appear multiple times

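I have read examples that look similar, but I am not yet at the level where I understand the answers. I want to take the list output and write each interface as its own line (i.e. each one becomes a list I write to a CSV). To do that, I need to split the initially returned list on the keyword 'interface Vlan*'.

Specifically, I want to split the returned list vlanlist into separate lists, one per 'interface Vlan*' entry.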

from ciscoconfparse import CiscoConfParse
import os

for filename in os.listdir():
    if filename.endswith(".cfg"):
        p = CiscoConfParse(filename)
        vlanlist=(p.find_all_children('^interface Vlan'))
        vlanlist.insert(0,filename)

        print(vlanlist) 

This prints the output as a single line. I need to split the list into separate lines on the keyword "interface VlanXXX".

[ 'interface Vlan1', ' no ip address', ' shutdown', 'interface Vlan2003', ' description XXXXXX', ' ip address 10.224.6.130 255.255.255.224', ' no ip redirects', ' no ip unreachables', ' no ip proxy-arp', ' load-interval 60', ' arp timeout 420']

Desired output (a configuration file may contain anywhere from 2 to 20 different interfaces, and I want one list per interface):

['interface Vlan1', ' no ip address', ' shutdown']
['interface Vlan2003', ' description XXXXXX', ' ip address 10.224.6.130 255.255.255.224', ' no ip redirects', ' no ip unreachables', ' no ip proxy-arp', ' load-interval 60', ' arp timeout 420']

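Here is a solution that is tightly coupled to your single test case. If that test case does not represent your full data set, you will have to refine the solution with more tests.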

def extract(items):
  result, filename, idx = [], items[0], -1

  for x in items[1:]:
    if x.startswith('interface Vlan'):
      idx += 1
      result.append([filename])
    result[idx].append(x)

  return result

# given & expected are your example and output 
assert expected == extract(given)

Edit:
...and you have changed the input and output.

def extract(items):
  result, idx = [], -1

  for x in items:
    if x.startswith('interface Vlan'):
      idx += 1
      result.append([])

    if not result: continue  # assuming possible unwanted items before 'interface Vlan'
    result[idx].append(x)

  return result

assert expected == extract(given)
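
For reference, here is a minimal self-contained sketch of the same grouping idea with the question's sample data filled in, so the assertion can actually be run. The names given and expected simply hold the example input and desired output from the question; using result[-1] instead of a separate index counter is a simplification on my part, not a change in behavior.

```python
def extract(items):
    """Group config lines into one list per 'interface Vlan' block."""
    result = []
    for line in items:
        if line.startswith('interface Vlan'):
            result.append([])        # a new interface block starts here
        if result:                   # ignore anything before the first block
            result[-1].append(line)
    return result


# Sample input and desired output, copied from the question.
given = ['interface Vlan1', ' no ip address', ' shutdown',
         'interface Vlan2003', ' description XXXXXX',
         ' ip address 10.224.6.130 255.255.255.224', ' no ip redirects',
         ' no ip unreachables', ' no ip proxy-arp', ' load-interval 60',
         ' arp timeout 420']

expected = [
    ['interface Vlan1', ' no ip address', ' shutdown'],
    ['interface Vlan2003', ' description XXXXXX',
     ' ip address 10.224.6.130 255.255.255.224', ' no ip redirects',
     ' no ip unreachables', ' no ip proxy-arp', ' load-interval 60',
     ' arp timeout 420'],
]

assert extract(given) == expected
```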

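This identifies a split point and divides your list into the two lists you specified. The list comprehension finds every split position, and the trailing [0] keeps only the first one; if there are several split points, drop the [0] and iterate over the positions. The split condition looks for strings that start with the given text followed by at least three more characters, i.e. the "xxx" you posted.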

vlanlist = ['sw01.cfg', 'interface Vlan1', ' no ip address', ' shutdown', 'interface Vlan2003', ' description XXXXXX', ' ip address 10.224.6.130 255.255.255.224', ' no ip redirects', ' no ip unreachables', ' no ip proxy-arp', ' load-interval 60', ' arp timeout 420']
target = "interface Vlan"

# 's' is used instead of 'str' so the built-in type is not shadowed
split_pos = [idx for idx, s in enumerate(vlanlist)
             if s.startswith(target) and len(s) >= len(target) + 3][0]

out1 = [vlanlist[0]] + vlanlist[1:split_pos]
out2 = [vlanlist[0]] + vlanlist[split_pos:]

print(out1)
print(out2)

Output:

['sw01.cfg', 'interface Vlan1', ' no ip address', ' shutdown']
['sw01.cfg', 'interface Vlan2003', ' description XXXXXX', 
 ' ip address 10.224.6.130 255.255.255.224', ' no ip redirects',
 ' no ip unreachables', ' no ip proxy-arp', ' load-interval 60', ' arp timeout 420']
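
If the list contains more than two interfaces, the same slicing can be repeated for every split position. The sketch below is one way to do that, under two assumptions that are not part of the code above: the length check is dropped so that short names such as Vlan1 also count as split points, and the sample list is made up for illustration.

```python
# Hypothetical sample: filename first, then three interface blocks.
vlanlist = ['sw01.cfg',
            'interface Vlan1', ' shutdown',
            'interface Vlan20', ' no ip redirects',
            'interface Vlan2003', ' load-interval 60']
target = "interface Vlan"

# Every position where a new interface block begins.
split_positions = [idx for idx, line in enumerate(vlanlist)
                   if line.startswith(target)]

rows = []
for n, start in enumerate(split_positions):
    # A block ends where the next one starts, or at the end of the list.
    if n + 1 < len(split_positions):
        end = split_positions[n + 1]
    else:
        end = len(vlanlist)
    rows.append([vlanlist[0]] + vlanlist[start:end])

for row in rows:
    print(row)
```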

You can split the returned vlanlist further before prepending the filename.

# First, find the index in the list where "interface Vlan" exists:
# Also, append None at the end to signify index for end of list
indices = [i for i, v in enumerate(vlanlist) if v.startswith('interface Vlan')] + [None]

# [0, 3, None]

# Then, create the list of lists based on the extracted indices and prepend with filename
newlist = [[filename] + vlanlist[indices[i]:indices[i+1]] for i in range(len(indices)-1)]

for l in newlist: print(l)

# ['test.cfg', 'interface Vlan1', ' no ip address', ' shutdown']
# ['test.cfg', 'interface Vlan2003', ' description XXXXXX', ' ip address 10.224.6.130 255.255.255.224', ' no ip redirects', ' no ip unreachables', ' no ip proxy-arp', ' load-interval 60', ' arp timeout 420']

Explanation of the second list comprehension:

newlist = [
    [filename] +                   # prepend single-item list of filename
    vlanlist[                      # slice vlanlist
        indices[i]:                # starting at the current index
        indices[i+1]               # up to the next index
    ] 
    for i in range(len(indices)-1) # iterate up to the second last index so i+1 doesn't become IndexError
]

If you don't like the indexing approach, you can try zip:

lists = [[filename] + vlanlist[start:end] for start, end in zip(indices[:-1], indices[1:])]
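
To see why the zip version works, it can help to look at the (start, end) pairs it produces. A small sketch, assuming a shortened sample list and the filename test.cfg purely for illustration:

```python
filename = 'test.cfg'  # illustrative filename
vlanlist = ['interface Vlan1', ' no ip address', ' shutdown',
            'interface Vlan2003', ' description XXXXXX']

# Start index of each block, plus None to mean "slice to the end of the list".
indices = [i for i, v in enumerate(vlanlist)
           if v.startswith('interface Vlan')] + [None]

# Pair each start index with the index that follows it.
pairs = list(zip(indices[:-1], indices[1:]))
print(pairs)  # [(0, 3), (3, None)]

lists = [[filename] + vlanlist[start:end] for start, end in pairs]
for row in lists:
    print(row)
```

A slice ending in None runs to the end of the list, which is why the final pair (3, None) picks up the last interface block.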

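A quick and straightforward solution. It checks whether an item contains 'interface Vlan'; if so, it starts a new list, otherwise it appends to the most recent list. The .strip() calls are there for good measure, to remove the leading whitespace.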

output = ['interface Vlan1', ' no ip address', ' shutdown', 'interface Vlan2003', ' description XXXXXX', ' ip address 10.224.6.130 255.255.255.224', ' no ip redirects', ' no ip unreachables', ' no ip proxy-arp', ' load-interval 60', ' arp timeout 420']

results = []

for i in output:
    if 'interface Vlan' in i:
        results.append([i.strip()])
    else:
        results[-1].append(i.strip())

>>> results
[['interface Vlan1', 'no ip address', 'shutdown'],
 ['interface Vlan2003',
  'description XXXXXX',
  'ip address 10.224.6.130 255.255.255.224',
  'no ip redirects',
  'no ip unreachables',
  'no ip proxy-arp',
  'load-interval 60',
  'arp timeout 420']]
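
Since the question mentions writing each interface to a CSV, here is a hedged sketch of how the grouped lists could be fed to the standard csv module. The helper names group_interfaces and write_rows, the filename sw01.cfg, and the in-memory buffer are all assumptions made for illustration; in practice you would pass a file opened with newline='' instead of the StringIO object.

```python
import csv
import io


def group_interfaces(lines):
    """Return one stripped list of lines per 'interface Vlan' block."""
    groups = []
    for line in lines:
        if line.startswith('interface Vlan'):
            groups.append([line.strip()])
        elif groups:                      # skip lines before the first block
            groups[-1].append(line.strip())
    return groups


def write_rows(filename, lines, out):
    """Write one CSV row per interface, with the filename in the first column."""
    writer = csv.writer(out)
    for group in group_interfaces(lines):
        writer.writerow([filename] + group)


output = ['interface Vlan1', ' no ip address', ' shutdown',
          'interface Vlan2003', ' description XXXXXX', ' load-interval 60']

buffer = io.StringIO()                    # stands in for a real CSV file
write_rows('sw01.cfg', output, buffer)
print(buffer.getvalue())
```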