Is a List Comprehension the Right Way to Merge These JSON Files in Python?

How can I use a Python list comprehension to replace values in one JSON file with the linked values from another JSON file?

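One file looks like this. It has an "a" value that I need to use to replace values in the other file, with "b" as the link between them (the a, b and c values are all unique IDs):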

{
   "records":[
      {
         "a": "7hk2k989u23lesdfsfd",
         "b":"b8",
      },
      {
         "a": "9ty562349u23lesdfsfd",
         "b":"b6",
      },
      {
         "a": "Ur233Fglesdfsfd",
         "b":"b2",
      }
   ]
}

The other file looks like this. Each entry in "d" needs to be replaced with the corresponding "a" value, using "b" as the key:

{
   "records":[
      {
         "c":00023414,
         "d":["b8","b6"]
      },
      {
         "c":0005814,
         "d":["b8","b2","b6"]
      }
   ]
}

So that I end up with:

{
   "records":[
      {
         "c":00023414,
         "d":["7hk2k989u23lesdfsfd","9ty562349u23lesdfsfd"]
      },
      {
         "c":0005814,
         "d":["7hk2k989u23lesdfsfd","Ur233Fglesdfsfd","9ty562349u23lesdfsfd"]
      }
   ]
}

What is the right way to solve this in Python, especially if I need the code to perform well?

Your files are not valid JSON. You should run them through a JSON validator such as JSON Lint.
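
As a rough illustration of why a parser rejects these files, the sketch below feeds fragments modeled on the question's data to json.loads. The helper name is_valid_json is made up for this example. The trailing comma and the number with leading zeros are the two problems I would expect a validator to flag.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Trailing comma after the last member, as in the first file.
trailing_comma = '{"a": "7hk2k989u23lesdfsfd", "b": "b8",}'
# Number written with leading zeros, as in the second file.
leading_zeros = '{"c": 00023414, "d": ["b8", "b6"]}'
# Cleaned-up versions of the same fragments.
fixed = '{"a": "7hk2k989u23lesdfsfd", "b": "b8"}'
fixed_number = '{"c": 23414, "d": ["b8", "b6"]}'

for sample in (trailing_comma, leading_zeros, fixed, fixed_number):
    print(is_valid_json(sample), sample)
```

If the leading zeros in "c" matter, storing those IDs as strings is one way to keep them.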

In [494]: import json

In [495]: with open('/Users/ado/Desktop/ab.json') as f:
     ...:     ab = json.load(f)
     ...:

In [496]: with open('/Users/ado/Desktop/cd.json') as f:
     ...:     cd = json.load(f)
     ...:

Note that you can think of ab simply as a collection of related a's and b's. This is a good time to use a dictionary that maps each b to its a:

In [497]: d_ab = {r['b']: r['a'] for r in ab['records']}

In [498]: d_ab
Out[498]:
{'b2': 'Ur233Fglesdfsfd',
 'b6': '9ty562349u23lesdfsfd',
 'b8': '7hk2k989u23lesdfsfd'}

Now you can iterate over the records in cd and use a list comprehension to build the new values:

In [499]: for r in cd['records']:
     ...:     r['d'] = [d_ab.get(d) for d in r['d']]
     ...:

In [500]: cd
Out[500]:
{'records': [{'c': 23414,
   'd': ['7hk2k989u23lesdfsfd', '9ty562349u23lesdfsfd']},
  {'c': 5814,
   'd': ['7hk2k989u23lesdfsfd', 'Ur233Fglesdfsfd', '9ty562349u23lesdfsfd']}]}
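
The two steps above can also be tried without any files. This is only a sketch: it assumes the data has already been cleaned into valid JSON, and the literals below just mirror the question's records. Unlike the session above, it builds a new structure (the name merged is my own) rather than overwriting cd, and it uses d_ab[b] so that an unknown key raises a KeyError instead of passing silently.

```python
# Inline copies of the (cleaned) data from the question.
ab = {"records": [
    {"a": "7hk2k989u23lesdfsfd", "b": "b8"},
    {"a": "9ty562349u23lesdfsfd", "b": "b6"},
    {"a": "Ur233Fglesdfsfd", "b": "b2"},
]}
cd = {"records": [
    {"c": 23414, "d": ["b8", "b6"]},
    {"c": 5814, "d": ["b8", "b2", "b6"]},
]}

# Build the b -> a lookup once; every later lookup is a plain dict access.
d_ab = {r["b"]: r["a"] for r in ab["records"]}

# Build a new structure instead of mutating cd, so the original stays available.
merged = {"records": [
    {"c": r["c"], "d": [d_ab[b] for b in r["d"]]}
    for r in cd["records"]
]}
print(merged)
```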

Finally, write the new content to a file:

In [502]: with open('/Users/ado/Desktop/cd-mapped.json', 'w') as f:
     ...:     f.write(json.dumps(cd))
     ...:
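
A small variant of the write step, in case it is useful: json.dump can write straight to the file object, and indent makes the output easier to read. In this sketch a temporary directory stands in for the Desktop path and cd is an inline copy of the mapped data. Both are assumptions made so the example runs anywhere.

```python
import json
import os
import tempfile

# Inline copy of the mapped data from the session above.
cd = {"records": [
    {"c": 23414, "d": ["7hk2k989u23lesdfsfd", "9ty562349u23lesdfsfd"]},
    {"c": 5814, "d": ["7hk2k989u23lesdfsfd", "Ur233Fglesdfsfd", "9ty562349u23lesdfsfd"]},
]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cd-mapped.json")
    # json.dump serializes directly into the open file.
    with open(path, "w") as f:
        json.dump(cd, f, indent=3)
    # Read the file back to confirm it round-trips.
    with open(path) as f:
        round_tripped = json.load(f)

print(round_tripped == cd)
```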

This solution assumes that every record in ab always has both an a and a b.
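
A related case is a "b" in cd that has no entry in the mapping at all. The sketch below shows what d_ab.get does then, plus two possible ways to handle it. The data is cut down from the question, with "b2" deliberately left out of d_ab, and the variable names are only illustrative.

```python
d_ab = {"b8": "7hk2k989u23lesdfsfd", "b6": "9ty562349u23lesdfsfd"}
record = {"c": 5814, "d": ["b8", "b2", "b6"]}  # "b2" has no entry in d_ab

# dict.get returns None for an unknown key, so the gap is silent.
with_none = [d_ab.get(b) for b in record["d"]]

# Passing the key itself as the default keeps the original "b" value instead.
keep_unmapped = [d_ab.get(b, b) for b in record["d"]]

# Collect the unmapped keys so they can be reported or fixed upstream.
missing = [b for b in record["d"] if b not in d_ab]

print(with_none)
print(keep_unmapped)
print(missing)
```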

P.S. Just for fun, you can use map with dict.get instead of the comprehension:

In [505]: for r in cd['records']:
     ...:     r['d'] = list(map(d_ab.get, r['d']))
     ...:

In [506]: cd
Out[506]:
{'records': [{'c': 23414,
   'd': ['7hk2k989u23lesdfsfd', '9ty562349u23lesdfsfd']},
  {'c': 5814,
   'd': ['7hk2k989u23lesdfsfd', 'Ur233Fglesdfsfd', '9ty562349u23lesdfsfd']}]}

As for performance, the comprehension usually beats map, and it comes out slightly ahead in the timings below:

In [509]: %timeit for r in cd['records']: r['d'] = [d_ab.get(d) for d in r['d']]
     ...:
The slowest run took 7.19 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.34 µs per loop

In [511]: %timeit for r in cd['records']: r['d'] = list(map(d_ab.get, r['d']))
The slowest run took 7.19 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.74 µs per loop
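
One caveat about the timings above: the timed statements reassign r['d'] on every pass, so after the first pass the lookups run on values that are no longer "b" keys. A benchmark that leaves its input untouched could look like the sketch below. The synthetic data size and the function names are assumptions of mine, and the numbers will vary by machine and Python version, so treat it as a way to measure your own case rather than as a verdict.

```python
import timeit

# Synthetic data: 1,000 mappings and one list that references all of them.
d_ab = {f"b{i}": f"a{i}" for i in range(1000)}
keys = list(d_ab)


def with_comprehension():
    # Returns a new list; keys is never modified.
    return [d_ab.get(k) for k in keys]


def with_map():
    return list(map(d_ab.get, keys))


# Both approaches must produce the same result before timing them.
same_result = with_comprehension() == with_map()

t_comp = timeit.timeit(with_comprehension, number=200)
t_map = timeit.timeit(with_map, number=200)
print(f"same result: {same_result}")
print(f"comprehension: {t_comp:.4f}s, map: {t_map:.4f}s")
```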