How to loop through a line inside a file using a regex as the loop variable

Question

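I am trying to build something like an explode function for a JSON file. The loop should read the JSON file line by line. Each line holds several values that I want to pull out and pair with the main part of the line (like a lateral view or an explode function).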

The data looks like this:

{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101,"wl_key4":23095,"wl_key5":200527,"wl_key6":60319}

What I want is to explode it, as in SQL, into this:

{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717}
{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key3":589101}
{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key4":23095}
{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key5":200527}


import io
import sys
import re

i = 0
with io.open('lateral_result.json', 'w', encoding="utf-8") as f, io.open('lat.json', encoding="utf-8") as g:
    for line in g:
        x = re.search('(.*wl_timestamp":"[^"]+",)', line)
        y = re.search('("wl_key[^,]+),', line)
        for y in line:
            i = i + 1
            print (x.group(0), y.group(i),'}', file=f)

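I always get an error saying that I cannot call group on a str. When I move the regex into the inner for loop instead, it either gives me only the first result and does nothing else, or it takes the same result every time and writes it once for each character it finds in the line.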

Answer 1

Don't use regex on JSON. Use the json module on JSON and work with the resulting data structure:

import json

data_str = """{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101,"wl_key4":23095,"wl_key5":200527,"wl_key6":60319}"""

data = json.loads(data_str)  # you can use json.load( file_handle )

print(data)

for k in (x for x in data.keys() if x.startswith("wl_key")):
    print(data["wl_timestamp"],k,data[k])

Output (from the loop; the line printed by print(data) is omitted):

2013-01-27 16:07:02 wl_key2 103717
2013-01-27 16:07:02 wl_key3 589101
2013-01-27 16:07:02 wl_key4 23095
2013-01-27 16:07:02 wl_key5 200527
2013-01-27 16:07:02 wl_key6 60319
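
To get rows in the shape the question asks for, the same idea can be extended to write one JSON object per wl_key field. This is only a minimal sketch: the helper name explode_line and its key_prefix parameter are made up for illustration, and it assumes every field that does not start with "wl_key" should be repeated on each output row.

```python
import json


def explode_line(line, key_prefix="wl_key"):
    """Yield one compact JSON string per wl_key* field of a single input line.

    explode_line and key_prefix are hypothetical names, not part of any library.
    """
    record = json.loads(line)
    # Fields repeated on every output row: everything that is not a wl_key* field.
    base = {k: v for k, v in record.items() if not k.startswith(key_prefix)}
    for key, value in record.items():
        if key.startswith(key_prefix):
            # separators=(",", ":") drops the spaces json.dumps adds by default.
            yield json.dumps({**base, key: value}, separators=(",", ":"))


sample = (
    '{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00",'
    '"wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101,'
    '"wl_key4":23095,"wl_key5":200527,"wl_key6":60319}'
)

for row in explode_line(sample):
    print(row)
```

For a whole file, the same function could be called once per line inside a `for line in g:` loop, writing each yielded row to the output file.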

Answer 2

This is the code that solved my case:

import json
import io
import re

with io.open('lateral_result.json', 'w', encoding="utf-8") as f, io.open('lat.json', encoding="utf-8") as g:
    for line in g:
        data = json.loads(line)
        # Everything up to and including the timestamp, plus the opening quote of the next key.
        prefix = re.search('(.*wl_timestamp":"[^"]+",")', line)
        # Write one output row per wl_key* field.
        for k in (key for key in data.keys() if key.startswith("wl_key")):
            print(prefix.group(0)+str(k)+'":'+str(data[k])+'}', file=f)
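
For the literal question in the title, looping over regex matches is possible with re.finditer. In the question's code, `for y in line` rebinds y to each character of the line, which is why group is then called on a str. The sketch below is an assumption-laden alternative, not the code above: the pattern names and the function explode_with_regex are made up for illustration, and it assumes each wl_key value is a bare number with no commas or braces inside it.

```python
import re

# Hypothetical patterns for illustration; they assume the field layout shown in the question.
PREFIX_PATTERN = re.compile(r'.*"wl_timestamp":"[^"]+",')
KEY_PATTERN = re.compile(r'"wl_key\d+":[^,}]+')


def explode_with_regex(line):
    """Return one output row per wl_key* match found in the line."""
    prefix = PREFIX_PATTERN.search(line)
    if prefix is None:
        # No timestamp field on this line, so there is nothing to explode.
        return []
    # finditer yields one match object per wl_key* field, so group(0) is always valid here.
    return [prefix.group(0) + m.group(0) + "}" for m in KEY_PATTERN.finditer(line)]


sample_line = (
    '{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00",'
    '"wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101,'
    '"wl_key4":23095,"wl_key5":200527,"wl_key6":60319}'
)

for row in explode_with_regex(sample_line):
    print(row)
```

Parsing with the json module is still the more robust choice, since this regex approach would break if a value contained a comma or if the fields appeared in a different order.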


regex

for-loop

explode

python-3.x

lateral