Convert CSV to a nested JSON while formatting values for specific keys to numeric/int/float
I am trying to convert a CSV file to nested JSON. Here is my CSV, with the first row as the column headers.
CLID,District, attribute,value
C001,Tebuslik, Name,Philip
C001,Tebuslik,Age,34
C002,Hontenlo,Name,Jane
C002,Hontenlo,Age,23
My desired output is nested JSON in which the value for the Age key is a number rather than a string.
[
    {
        "CLID": "C001",
        "District": "Tebuslik",
        "attributes": [
            {
                "attribute": "Name",
                "value": "Philip"
            },
            {
                "attribute": "Age",
                "value": 34
            }
        ]
    },
    {
        "CLID": "C002",
        "District": "Hontenlo",
        "attributes": [
            {
                "attribute": "Name",
                "value": "Jane"
            },
            {
                "attribute": "Age",
                "value": 23
            }
        ]
    }
]
In my CSV, all keys share the same column (attribute), and the values can be strings or numbers depending on the attribute.
Here is my semi-working Python script:
from csv import DictReader
from itertools import groupby
from pprint import pprint
import json

with open('teis.csv') as csvfile:
    r = DictReader(csvfile, skipinitialspace=True)
    data = [dict(d) for d in r]

groups = []
uniquekeys = []

for k, g in groupby(data, lambda r: (r['CLID'], r['District'])):
    groups.append({
        "CLID": k[0],
        "District": k[1],
        "attributes": [{k:v for k, v in d.items() if k not in ['CLID','District']} for d in list(g)]
    })
    uniquekeys.append(k)

print(json.dumps(groups, indent = 4))
However, below is the output I get, with the numeric Age values quoted as strings:
[
    {
        "CLID": "C001",
        "District": "Tebuslik",
        "attributes": [
            {
                "attribute": "Name",
                "value": "Philip"
            },
            {
                "attribute": "Age",
                "value": "34"
            }
        ]
    },
    {
        "CLID": "C002",
        "District": "Hontenlo",
        "attributes": [
            {
                "attribute": "Name",
                "value": "Jane"
            },
            {
                "attribute": "Age",
                "value": "23"
            }
        ]
    }
]
Use str.isdigit to check whether the string consists only of digits, then convert it with int.
For example:
from csv import DictReader
from itertools import groupby
import json

filename = 'teis.csv'  # path to the CSV file from the question

with open(filename, newline='') as csvfile:
    r = DictReader(csvfile, skipinitialspace=True)
    data = [dict(d) for d in r]

groups = []
uniquekeys = []

# groupby only merges adjacent rows, so the CSV must already be sorted by CLID and District
for k, g in groupby(data, lambda r: (r['CLID'], r['District'])):
    groups.append({
        "CLID": k[0],
        "District": k[1],
        # Update: digit-only strings such as "34" become int, everything else stays a string
        "attributes": [{col: int(v) if v.isdigit() else v for col, v in d.items() if col not in ['CLID', 'District']} for d in g]
    })
    uniquekeys.append(k)

print(json.dumps(groups, indent=4))
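Note that str.isdigit only returns True for strings made up entirely of digits, so values such as "-5" or "3.5" would stay strings. If the value column can also hold negative numbers or floats, one option is a small helper that tries int, then float, and falls back to the original string. The following is only a minimal sketch under some assumptions: to_number, group_key and nest_rows are hypothetical names, every row is assumed to contain all four columns, and the sample data is inlined through io.StringIO instead of being read from teis.csv. It also sorts the rows first, since groupby only merges adjacent rows.

```python
import csv
import io
import json
import math
from itertools import groupby


def to_number(text):
    """Return an int or finite float if text parses as one, else the original string."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        number = float(text)
    except ValueError:
        return text
    # float() also accepts "nan" and "inf"; keep those as plain strings.
    return number if math.isfinite(number) else text


def group_key(row):
    """Rows that share CLID and District end up in the same JSON object."""
    return (row["CLID"], row["District"])


def nest_rows(csvfile):
    """Group rows by (CLID, District) and convert only the 'value' column."""
    rows = list(csv.DictReader(csvfile, skipinitialspace=True))
    # The sort is stable, so attributes keep their original order within a group.
    rows.sort(key=group_key)
    return [
        {
            "CLID": clid,
            "District": district,
            "attributes": [
                {"attribute": row["attribute"], "value": to_number(row["value"])}
                for row in group
            ],
        }
        for (clid, district), group in groupby(rows, key=group_key)
    ]


# Sample data copied from the question, including the stray spaces after some commas.
sample = io.StringIO(
    "CLID,District, attribute,value\n"
    "C001,Tebuslik, Name,Philip\n"
    "C001,Tebuslik,Age,34\n"
    "C002,Hontenlo,Name,Jane\n"
    "C002,Hontenlo,Age,23\n"
)
result = nest_rows(sample)
print(json.dumps(result, indent=4))
```

With the sample data, the Age values come out as JSON numbers while the names stay strings. Because the helper converts every value that looks numeric, an attribute such as an ID with leading zeros would lose them, and int also accepts forms like "1_000"; if that matters, apply to_number only to the attributes that should be numeric.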