如何使用 Python 将 CSV 转换为 JSON,其中某些 JSON 键具有多个值

How to Convert CSV to JSON using Python where some JSON keys have multiple values

我正在尝试将 csv 转换为 json 但无法获得预期的结果

Pyton 代码:

import csv
import json
  
def csv_to_json(csvFile, jsonFile):    
    jsondict = {}  
    with open(csvFile, encoding='utf_8_sig') as csvfile:  
        csv_data = csv.DictReader(csvfile)
        jsondict["data"]=[]
        for rows in csv_data:  
            print(rows)
            jsondict["data"].append(rows)
  
    with open(jsonFile, 'w') as jsonfile:  
        jsonfile.write(json.dumps(jsondict, indent = 4))
csvfile = 'sample.csv'  
jsonfile = 'sampleoutput.json'  
  
csv_to_json(csvfile, jsonfile)

示例 CSV 数据:

Keys Value
100 value100-1
200 value200-1
200 value200-2
300 value300-1
300 value300-2
400 value400-1
400 value400-2
400 value400-3

预期结果:

    [
      {
        "Keys": "100",
        "Value": "value100-1"
        },
        {
        "Keys": "200",
        "Value": ["value200-1", "value200-2"]
      },
      {
        "Keys": "300",
        "Value": ["value300-1", "value300-2"]
        },
        {
        "Keys": "400",
        "Value": ["value400-1", "value400-2", "value400-3"]
       }
    ]

当前结果:

{
    "data": [{
            "Keys": "100",
            "Value": "value100-1"
        },
        {
            "Keys": "200",
            "Value": "value200-1"
        },
        {
            "Keys": "200",
            "Value": "value200-2"
        },
        {
            "Keys": "300",
            "Value": "value300-1"
        },
        {
            "Keys": "300",
            "Value": "value300-2"
        },
        {
            "Keys": "400",
            "Value": "value400-1"
        },
        {
            "Keys": "400",
            "Value": "value400-2"
        },
        {
            "Keys": "400",
            "Value": "value400-3"
        }
    ]
}

TIA 为您提供诚实和即时的回复:)

上述预期结果需要在 csv 文件中对数据进行分组 请改用下面的代码

import json
import io
import pandas as pd

raw_string = open("sample.csv", "r")
raw_string = ''.join([i for i in raw_string])  
raw_data = [line.split(',') for line in raw_string.split()]

df = pd.read_csv(io.StringIO(raw_string))

for item in list(df.groupby(by=[i for i in df.columns if not i.endswith("Value")])):
    df_temp = item[1]
    # messy line that matches columns to their values. The list(set(x) makes it so values are unique but also json serializable
    a = {k : (list(set(v)) if len(set(v)) > 1 else list(set(v))[0]) for k, v in df_temp.to_dict("list").items()}
    print(json.dumps(a, indent=4))