Convert tabular CLI output to JSON format in Python

Question

I need to convert the following output to JSON format in Python.

How can I do this?

switch# sh mod
Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
1    48     1/2/4/8 Gbps FC/Supervisor-3        DS-C9148-K9-SUP    active *

Mod  Sw              Hw      World-Wide-Name(s) (WWN)
---  --------------  ------  --------------------------------------------------
1    6.2(17)         1.1     20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8


Mod  MAC-Address(es)                         Serial-Num
---  --------------------------------------  ----------
1    c0-8c-60-65-82-dc to c0-8c-60-65-82-df  JAF1736ALLM

Input 1: https://i.stack.imgur.com/EGsY4.jpg

Input 2: https://i.stack.imgur.com/aDGcB.jpg

Answer 1

I have a solution, though it is not a very good one. It assumes your entire output is in the variable text.

import re
lines = text.split("\n")
keylines = [line for i, line in enumerate(lines) if len(lines)>(i+1) and "---" in lines[i+1]]
vallines = [line for i, line in enumerate(lines) if i!=0 and "---" in lines[i-1]]
keys = re.split("  +", "  ".join(keylines))
vals = re.split("  +", "  ".join(vallines))
result = dict(zip(keys, vals))

Output:

{
  "Mod": "1",
  "Ports": "48",
  "Module-Type": "1/2/4/8 Gbps FC/Supervisor-3",
  "Model": "DS-C9148-K9-SUP",
  "Status": "active *",
  "Sw": "6.2(17)",
  "Hw": "1.1",
  "World-Wide-Name(s) (WWN)": "20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8",
  "MAC-Address(es)": "c0-8c-60-65-82-dc to c0-8c-60-65-82-df",
  "Serial-Num": "JAF1736ALLM"
}

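It makes the following assumptions and breaks when any of them does not hold:

No value contains two or more consecutive spaces.
There are at least two spaces between "fields".
In each line of dashes, at least one segment is 3 or more dashes long.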
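
To make the first assumption concrete, here is a minimal sketch that wraps the code above in a function and feeds it a made-up row whose value contains two consecutive spaces. The name parse_flat and both sample tables are hypothetical, introduced only for illustration.

```python
import json
import re


def parse_flat(text):
    """The approach above, wrapped in a function (hypothetical helper name)."""
    lines = text.split("\n")
    # A header line is any line directly above a line containing '---'.
    keylines = [line for i, line in enumerate(lines)
                if len(lines) > i + 1 and "---" in lines[i + 1]]
    # A value line is any line directly below a line containing '---'.
    vallines = [line for i, line in enumerate(lines)
                if i != 0 and "---" in lines[i - 1]]
    keys = re.split("  +", "  ".join(keylines))
    vals = re.split("  +", "  ".join(vallines))
    return dict(zip(keys, vals))


good = (
    "Mod  Ports  Model\n"
    "---  -----  ---------------\n"
    "1    48     DS-C9148-K9-SUP\n"
)
# Made-up row: the Model value contains two consecutive spaces.
bad = (
    "Mod  Ports  Model            Status\n"
    "---  -----  ---------------  ------\n"
    "1    48     DS  C9148        active\n"
)

good_result = parse_flat(good)
bad_result = parse_flat(bad)
print(json.dumps(good_result, indent=2))
print(json.dumps(bad_result, indent=2))
```

In the second table the value is split into two fields, so "Status" receives part of the Model value and the real status is silently dropped, because zip stops at the shorter sequence.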

Answer 2

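You can use the '---' separator lines to define slices for each header and value row, and build each key-value pair from those slices. (From your sample, I am guessing there can be several "Mod" entries, each with a unique Mod value, so I use this field as the key for the overall accumulator.)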

from collections import defaultdict
import re
from itertools import groupby

sample = """\
Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
1    48     1/2/4/8 Gbps FC/Supervisor-3        DS-C9148-K9-SUP    active *
2    48     1/2/4/8 Gbps FC/Supervisor-3        DS-C9148-K9-SUP    active *

Mod  Sw              Hw      World-Wide-Name(s) (WWN)
---  --------------  ------  --------------------------------------------------
1    6.2(17)         1.1     20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8
2    6.2(17)         1.1     20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8

Mod  MAC-Address(es)                         Serial-Num
---  --------------------------------------  ----------
1    c0-8c-60-65-82-dc to c0-8c-60-65-82-df  JAF1736ALLM
2    c0-8c-60-65-82-ec to c0-8c-60-65-82-ef  JAF1736AXXX

Xbar Ports Module-Type Model Status
---- ----- ----------- ----- ------
1    0     Fabric 1    ABC   ok

Xbar Sw Hw
---- -- ---
1    NA 1.0

"""

all_input_lines = sample.splitlines()
mod_accum = defaultdict(dict)
xbar_accum = defaultdict(dict)

for is_blank, input_lines_iter in groupby(all_input_lines, 
                                          key=lambda s: not bool(s.strip())):
    input_lines = list(input_lines_iter)
    if is_blank:
        continue

    # assume first two lines are field names and separator dashes
    names, dashes = input_lines[:2]

    # make sure dashes line is all '---' separators
    if not all(ss == set('-') for ss in map(set, dashes.split())):
        print("invalid line group found, skipping...")
        print('-'*40)
        print('\n'.join(input_lines))
        print('-'*40)
        continue

    # use regex to get start/end of each '---' divider, and make slices
    spans = (match.span() for match in re.finditer('-+', dashes))
    slices = [slice(sp[0], sp[1]+1) for sp in spans]

    names = [names[sl].rstrip() for sl in slices]

    # is this a module or an xbar?
    if 'Mod' in names:
        key = 'Mod'
        accum = mod_accum
    elif 'Xbar' in names:
        key = 'Xbar'
        accum = xbar_accum
    else:
        raise ValueError("no Mod or Xbar name in row names ({})".format(
                            ",".join(names)))

    # skip the header and dashes lines; only the remaining lines hold values
    for line in input_lines[2:]:
        # use slices to extract data from values, make into a dict
        row_dict = dict(zip(names, (line[sl].rstrip() for sl in slices)))

        # accumulate these values into any previous ones collected for this Mod
        accum[row_dict[key]].update(row_dict)

# print out what we got
import json
all_data = {"Modules": mod_accum, "Xbars": xbar_accum}
print(json.dumps(all_data, indent=2))

Prints:

{
  "Modules": {
    "2": {
      "World-Wide-Name(s) (WWN)": "20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8",
      "Module-Type": "1/2/4/8 Gbps FC/Supervisor-3",
      "Ports": "48",
      "Sw": "6.2(17)",
      "Hw": "1.1",
      "Model": "DS-C9148-K9-SUP",
      "Status": "active *",
      "Serial-Num": "JAF1736AXXX",
      "MAC-Address(es)": "c0-8c-60-65-82-ec to c0-8c-60-65-82-ef",
      "Mod": "2"
    },
    "1": {
      "World-Wide-Name(s) (WWN)": "20:01:54:7f:ee:df:88:f8 to 20:30:54:7f:ee:df:88:f8",
      "Module-Type": "1/2/4/8 Gbps FC/Supervisor-3",
      "Ports": "48",
      "Sw": "6.2(17)",
      "Hw": "1.1",
      "Model": "DS-C9148-K9-SUP",
      "Status": "active *",
      "Serial-Num": "JAF1736ALLM",
      "MAC-Address(es)": "c0-8c-60-65-82-dc to c0-8c-60-65-82-df",
      "Mod": "1"
    }
  },
  "Xbars": {
    "1": {
      "Module-Type": "Fabric 1",
      "Ports": "0",
      "Sw": "NA",
      "Hw": "1.0",
      "Model": "ABC",
      "Status": "ok",
      "Xbar": "1"
    }
  }
}
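
The column-slicing step can also be pulled out into a small reusable function. This is a minimal sketch, assuming each table is a header line, a dashes line, and then data rows. The name parse_table and the shortened sample table are hypothetical, and letting the last column run to the end of the line is my own choice rather than part of the answer above.

```python
import json
import re


def parse_table(block):
    """Parse one header/dashes/rows block into a list of row dicts."""
    header, dashes, *rows = block.strip("\n").splitlines()
    # Each run of dashes marks the start and end column of one field.
    spans = [m.span() for m in re.finditer(r"-+", dashes)]
    slices = [slice(start, end) for start, end in spans]
    # Assumption: the last field runs to the end of the line, so a value
    # wider than its dashes is not cut off.
    slices[-1] = slice(spans[-1][0], None)
    names = [header[sl].strip() for sl in slices]
    return [dict(zip(names, (row[sl].strip() for sl in slices)))
            for row in rows]


table = """\
Mod  Model            Serial-Num
---  ---------------  ----------
1    DS-C9148-K9-SUP  JAF1736ALLM
2    DS-C9148-K9-SUP  JAF1736AXXX
"""

rows = parse_table(table)
print(json.dumps(rows, indent=2))
```

The serial numbers here are 11 characters wide while their dashes are only 10, which is why the last slice is left open-ended in this sketch.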
