How to convert a specific CSV format to JSON using Python
I have downloaded a CSV file from Google Trends that presents its data in this format:
Top cities for golden globes
City,golden globes
New York (United States),100
Los Angeles (United States),91
Toronto (Canada),69

Top regions for golden globes
Region,golden globes
United States,100
Canada,91
Ireland,72
Australia,72
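There are 3-4 of these groups, separated by blank lines. The first line of each group contains the text I want to use as a key, followed by the rows of name/value pairs that I need to associate with that key. Does anyone have advice on Python tools I could use to make this happen? I'm not having much luck with Python's CSV library.
My desired output for the CSV above looks like this: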
{
    "Top cities for golden globes" :
    {
        "New York (United States)" : 100,
        "Los Angeles (United States)" : 91,
        "Toronto (Canada)" : 69
    },
    "Top regions for golden globes" :
    {
        "United States" : 100,
        "Canada" : 91,
        "Ireland" : 72,
        "Australia" : 72
    }
}
Your input format is so predictable that I would parse it by hand, without the CSV library.
import json
from collections import defaultdict

result = defaultdict(dict)  # dictionary holding the data
current_key = ""            # current category
ignore_next = False         # flag to skip the column-header row

with open("yourfile.csv") as fh:
    for line in fh:
        line = line.strip()      # throw away the newline
        if line == "":           # a blank line ends the current group
            current_key = ""
            continue
        if current_key == "":    # no category yet, so this line is the group title
            current_key = line
            ignore_next = True
            continue
        if ignore_next:          # this is the column-header row; skip it
            ignore_next = False
            continue
        # split on the last comma so names that contain commas stay intact
        a, b = line.rsplit(",", 1)
        result[current_key][a] = int(b)  # int() matches the numbers in the desired output

# pretty-print the data
print(json.dumps(result, sort_keys=True, indent=4))
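The same line-by-line approach can be wrapped in a function that accepts any iterable of lines, which makes it easy to try on the sample from the question without a file on disk. This is only a sketch, not the answerer's code: the name `parse_trends` and the `sample` string are made up for illustration, and it assumes that groups are separated by blank lines and that every value is an integer.

```python
import json


def parse_trends(lines):
    """Parse blank-line-separated groups into a dict of dicts.

    Assumes each group is: a title line, a column-header line, then
    "name,value" rows whose values are integers.
    """
    result = {}
    current = None        # dict for the group being filled, or None between groups
    skip_header = False   # True right after a title line
    for raw in lines:
        line = raw.strip()
        if not line:              # a blank line ends the current group
            current = None
        elif current is None:     # first non-blank line of a group is its title
            current = result.setdefault(line, {})
            skip_header = True
        elif skip_header:         # the column-header row carries no data
            skip_header = False
        else:
            # Split on the last comma so names containing commas stay intact.
            name, value = line.rsplit(",", 1)
            current[name] = int(value)
    return result


# Hypothetical sample, copied from the data shown in the question.
sample = """Top cities for golden globes
City,golden globes
New York (United States),100
Los Angeles (United States),91
Toronto (Canada),69

Top regions for golden globes
Region,golden globes
United States,100
Canada,91
Ireland,72
Australia,72
"""

parsed = parse_trends(sample.splitlines())
print(json.dumps(parsed, indent=4))
```

To read from a real file, an open file object can be passed in place of `sample.splitlines()`, since iterating over a file also yields lines.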
I would try something like:
import csv

row = []
dd = {}
with open('the.csv', newline='') as f:
    r = csv.reader(f)
    while True:
        if row:  # normal case, non-empty row
            d[row[0]] = int(row[1])  # int() matches the numbers in the desired output
            row = next(r, None)
            if row is None: break
        else:  # row is empty at start and after a blank line
            category = next(f, None)  # read the group title straight from the file
            if category is None: break
            category = category.strip()
            next(r)  # skip the column-header row
            d = dd[category] = {}
            row = next(r, None)
            if row is None: break
Now dd should be the dict-of-dicts you want, and you can json.dump it however you like.
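As a rough illustration of that last step, the sketch below serializes a dict-of-dicts with `json.dump`. The `dd` literal here is a hand-written stand-in using values from the question (it assumes the values were stored as integers), and the `io.StringIO` buffer stands in for a real output file.

```python
import io
import json

# Stand-in for the dict-of-dicts built by the parsing loop (values from the question).
dd = {
    "Top cities for golden globes": {
        "New York (United States)": 100,
        "Los Angeles (United States)": 91,
        "Toronto (Canada)": 69,
    },
    "Top regions for golden globes": {
        "United States": 100,
        "Canada": 91,
        "Ireland": 72,
        "Australia": 72,
    },
}

# json.dump writes to any object with a write() method; a file opened in
# text mode for writing would work the same way as this in-memory buffer.
buffer = io.StringIO()
json.dump(dd, buffer, indent=4)
text = buffer.getvalue()

# Reading the text back confirms that nothing was lost in serialization.
round_trip = json.loads(text)
print(text)
```

If the values were kept as strings instead, they would appear quoted in the JSON output, which differs from the desired output shown in the question.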