JSON to Markdown table formatting

Question

I'm trying to build a function that converts JSON data into a list, which then serves as the basis for building Markdown tables.

I have a first prototype:

#!/usr/bin/env python3
import json

data = {
  "statistics": {
    "map": [
      {
        "map_name": "Location1",
        "nan": "loc1",
        "dont": "ignore this",
        "packets": "878607764338"
      },
      {
        "map_name": "Location2",
        "nan": "loc2",
        "dont": "ignore this",
        "packets": "67989088698"
      },
    ],
    "map-reset-time": "Thu Jan  6 05:59:47 2022\n"
  }
}
headers = ['Name', 'NaN', 'Packages']

def jsonToList(data):
    """adds the desired json fields"""
    # Will be rewritten to accept different data fields.
    json_obj = data

    ips = []
    for piece in json_obj['statistics']['map']:
        this_ip = [piece['map_name'], piece['nan'], piece['packets']]
        ips.append(this_ip)

    return ips 

def markdownTable(data, headers):
    # Find the maximum length of all elements in the list
    n = max(len(x) for l in data for x in l)
    # Print the rows
    headerLength = len(headers)
  
    # expected "|        Name|         NaN|    Packages|"
    for i in range(len(headers)):
      # Takes the max number of characters and subtracts the length of the header word
      hn = n - len(headers[i])
      # Prints | [space based on row above][header word]
      print("|" + " " * hn + f"{headers[i]}", end='')
      # On the last iteration, add the closing pipe
      if i == headerLength-1:
        print("|") # End pipe for headers

        # expected |--------|--------|--------|
        print("|", end='') # Start pipe for sep row
        for i in   range(len(headers)):
          print ("-" *n + "|", end='')

        # Seems to add an extra line; however, if it's not there,
        # the Location1 row ends up on the separator line
        print("\n", end='') 
        
    dataLength = len(data)
    for row in data:
      for x in row:
        hn = n - len(x)
        print(f"|" + " " * hn + x, end='')
      print("|")
 

if __name__ == "__main__":
    da = jsonToList(data)
    markdownTable(da, headers)

This code outputs a table, as expected, that can be used as Markdown.

|        Name|         NaN|    Packages|
|------------|------------|------------|
|   Location1|        loc1|878607764338|
|   Location2|        loc2| 67989088698|

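I was wondering if anyone has good ideas about positioning the words, in particular centering them. Currently I compute n = max(len(x) for l in data for x in l), subtract the length of the current string, and pad the output with that many spaces. This works well for aligning text to one side (the output above is right-aligned), but it becomes a problem if I want the text centered.

Additionally, this is my first attempt, so if anyone has built a similar function before, or knows a way to go directly from JSON, general feedback on ways to optimize the code is much appreciated.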

Answer 1

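If you're free to use pandas, this is quite simple, because Markdown output is readily available (DataFrame.to_markdown relies on the tabulate package being installed). See the example below.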

import pandas
df = pandas.DataFrame.from_dict(data['statistics']['map']).rename(columns={'map_name':'Name', 'nan':'NaN', 'packets':'Packages'})
df.drop(['dont'], axis=1, inplace=True)
print(df.to_markdown(index=False,tablefmt='fancy_grid'))

This gives output like the following:

╒═══════════╤═══════╤══════════════╕
│ Name      │ NaN   │     Packages │
╞═══════════╪═══════╪══════════════╡
│ Location1 │ loc1  │ 878607764338 │
├───────────┼───────┼──────────────┤
│ Location2 │ loc2  │  67989088698 │
╘═══════════╧═══════╧══════════════╛

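You can use the tablefmt argument to apply different styles, such as psql or pipe. The fancy_grid style shown above is a box-drawing layout rather than Markdown; pipe produces a Markdown pipe table.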

Answer 2

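If you want to align text in Python, there are a few built-in methods you can use.

ljust, center, and rjust

The ljust, center, and rjust methods can be called on a str instance and return the string padded to the given length with the given fill character (a space by default).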

>>> s = 'foo'
>>> s.ljust(10)
'foo       '
>>> s.center(10)
'   foo    '
>>> s.rjust(10)
'       foo'
>>> # Use a different fill character
>>> s.center(11, '*')
'****foo****'

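Format String Syntax

Alternatively, you can use the Format String Syntax (demonstrated below with f-strings). This is especially useful when you need to combine the padded string with other text, since everything can go in the same string.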

>>> f'{s:<10}'
'foo       '
>>> f'{s:^10}'
'   foo    '
>>> f'{s:>10}'
'       foo'
>>> # Pass in a length
>>> length = 11
>>> f'{s:^{length}}'
'    foo    '
>>> # Specify a fill character
>>> f'{s:*^11}'
'****foo****'
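
Applied to the table in the question, the alignment character can itself be a nested replacement field, so a single helper can cover left, centered, and right alignment. This is only a minimal sketch; pad_cell and ALIGN_CODES are hypothetical names introduced here for illustration, not part of any library:

```python
# Hypothetical mapping from a readable alignment name to its format-spec character.
ALIGN_CODES = {"left": "<", "center": "^", "right": ">"}

def pad_cell(text, width, align="left", fill=" "):
    """Pad text to width using the Format String Syntax alignment codes."""
    code = ALIGN_CODES[align]
    # fill, alignment code and width are all nested replacement fields
    return f"{text:{fill}{code}{width}}"

row = ["Location1", "loc1", "878607764338"]
# Center every cell in a 12-character column and wrap the row in pipes
print("|" + "|".join(pad_cell(cell, 12, "center") for cell in row) + "|")
```

Switching the whole table to right or left alignment then only means changing the align argument.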

Answer 3

general feedback on ways to optimize the code is much appreciated

I'd probably do it like this:

data = {
  "statistics": {
    "map": [
      {
        "map_name": "Location1",
        "nan": "loc1",
        "dont": "ignore this",
        "packets": "878607764338"
      },
      {
        "map_name": "Location2",
        "nan": "loc2",
        "dont": "ignore this",
        "packets": "67989088698"
      },
    ],
    "map-reset-time": "Thu Jan  6 05:59:47 2022\n"
  }
}

header_map = {
    'Name': 'map_name',
    'NaN': 'nan',
    'Packages': 'packets'
}

def markdownTable(data):
    rows = []
    length = max(len(v) for d in data['statistics']['map'] for k, v in d.items() if k in header_map.values()) + 2
    # build header
    rows.append('|'.join(s.center(length) for s in header_map.keys()))
    rows.append('|'.join('-' * length for x in range(len(header_map))))
    # build body
    for item in data['statistics']['map']:
        rows.append('|'.join(v.center(length) for k, v in item.items() if k in header_map.values()))
    # Print rows
    for row in rows:
        print(f'|{row}|')

if __name__ == "__main__":
    markdownTable(data)

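Note that there's no need to restructure the data with a jsonToList function. Just iterate over the existing data structure. In addition, I created a header_map dictionary that maps the table headers to the keys in the source data. In the code, just call header_map.keys() or header_map.values() as needed.

As a general rule, I try to avoid using + for string composition unless absolutely necessary. Hence my use of '|'.join(). Each call to join is passed a generator expression that produces the cells for that row.

Finally, I first build a collection of rows, then iterate over them and print each row (adding the opening and closing pipes). I could have printed in the first pass, but embedding the whole join expression in an f-string isn't very readable. For example: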

print(f"|{'|'.join(s.center(length) for s in header_map.keys())}|")

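Also, with the second loop I only need to define the format of the opening and closing pipes once, which is more DRY (Don't Repeat Yourself). In retrospect, I could have done the join in the second iteration too, for the same reason. An added benefit of doing so is that rows would hold the raw data, which could then be processed further if needed.

The output looks like this: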
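
The idea of deferring the join so that rows keeps the raw data could look roughly like the sketch below. The names build_rows, render, and records are hypothetical, and records is assumed to be the list stored under data['statistics']['map']:

```python
header_map = {"Name": "map_name", "NaN": "nan", "Packages": "packets"}
# Assumed to be the list found at data['statistics']['map']
records = [
    {"map_name": "Location1", "nan": "loc1", "dont": "ignore this", "packets": "878607764338"},
    {"map_name": "Location2", "nan": "loc2", "dont": "ignore this", "packets": "67989088698"},
]

def build_rows(records, header_map):
    """Return the table as lists of raw, unpadded cell strings (header row first)."""
    rows = [list(header_map)]
    for item in records:
        # Look cells up by key so the column order follows header_map
        rows.append([item[key] for key in header_map.values()])
    return rows

def render(rows):
    """Pad, join and wrap every row in pipes in a single place."""
    width = max(len(cell) for row in rows for cell in row) + 2
    lines = ["|" + "|".join(cell.center(width) for cell in row) + "|" for row in rows]
    # The separator row goes right after the header row
    lines.insert(1, "|" + "|".join("-" * width for _ in rows[0]) + "|")
    return lines

rows = build_rows(records, header_map)
print("\n".join(render(rows)))
```

Because rows still holds plain strings after build_rows, it can be sorted or filtered before render is called.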

|     Name     |     NaN      |   Packages   |
|--------------|--------------|--------------|
|  Location1   |     loc1     | 878607764338 |
|  Location2   |     loc2     | 67989088698  |

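Note that with center, when the padding can't be split evenly, the extra space ends up on one side (here, the right). As a result, the values in the Name column and the last value in the Packages column appear half a character off-center. The numbers should probably be right-justified anyway, which would complicate the code.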
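
One possible way to handle per-column alignment, including right-justified numbers, is sketched below. It is an illustration under stated assumptions rather than a definitive implementation: markdown_table and its (header, key, align) column tuples are hypothetical, and the colons in the separator row rely on the table alignment syntax of GitHub Flavored Markdown, which not every Markdown renderer supports.

```python
def markdown_table(records, columns):
    """Build a pipe table; columns is a list of (header, key, align), align in '<', '^', '>'."""
    # Each column is as wide as its longest header or value (at least 3 for the separator)
    widths = [
        max(3, len(header), *(len(str(r[key])) for r in records))
        for header, key, _ in columns
    ]

    def line(cells):
        return "| " + " | ".join(cells) + " |"

    def separator(width, align):
        # Colons tell the Markdown renderer how to align the column
        if align == "^":
            return ":" + "-" * (width - 2) + ":"
        if align == ">":
            return "-" * (width - 1) + ":"
        return "-" * width

    out = [line(f"{h:{a}{w}}" for (h, _, a), w in zip(columns, widths))]
    out.append(line(separator(w, a) for (_, _, a), w in zip(columns, widths)))
    for r in records:
        out.append(line(f"{str(r[k]):{a}{w}}" for (_, k, a), w in zip(columns, widths)))
    return "\n".join(out)

records = [
    {"map_name": "Location1", "nan": "loc1", "dont": "ignore this", "packets": "878607764338"},
    {"map_name": "Location2", "nan": "loc2", "dont": "ignore this", "packets": "67989088698"},
]
columns = [("Name", "map_name", "<"), ("NaN", "nan", "^"), ("Packages", "packets", ">")]
print(markdown_table(records, columns))
```

With the alignment markers in place, the padding only affects how the source text reads; a renderer that honors the markers aligns the rendered columns on its own.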
