Parsing JavaScript arrays into Python dictionaries

I have a public web page that contains something like the following code:

var arrayA = new Array();
arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);arrayA[1] = new customItem("2","Name2","description2",4.000,8.000);

What I would like to do is have Python read this page and convert the data into two dictionaries keyed on name + description:

dict1["Name1Description1"] = 1.000

dict2["Name1Description1"] = 2.000

dict1["Name2Description2"] = 4.000

dict2["Name2Description2"] = 8.000

Is there an easy way to do this, or do I pretty much have to parse it the way I would parse any other string? Obviously the array can be of any length.

Thanks!

Yes, this can be done with a regular expression.

import re

st = '''
var arrayA = new Array();
arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);arrayA[1] = new customItem("2","Name2","description2",4.000,8.000);
'''

dict1, dict2 = {}, {}
# Raw string so the backslashes reach the regex engine unchanged; \. matches only a literal dot
matches = re.findall(r'\"(\d+)\",\"(.*?)\",\"(.*?)\",(\d+\.\d+),(\d+\.\d+)', st, re.DOTALL)
for m in matches:
    key = m[1] + m[2]
    dict1[key] = float(m[3])
    dict2[key] = float(m[4])

print(dict1)
print(dict2)

# {'Name1description1': 1.0, 'Name2description2': 4.0}
# {'Name1description1': 2.0, 'Name2description2': 8.0}

The logic of the regular expression is:

\" - Match a double quote
\"(\d+)\" - Match one or more digits between two double quotes
\"(.*?)\" - Match the shortest possible run of any characters between two double quotes
(\d+.\d+) - Match one or more digits, a decimal point, then one or more digits (strictly, an unescaped . matches any character, so \. is the safer spelling)
, - Match a comma
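
The same breakdown can be written with re.VERBOSE so that each piece carries its own comment. This is only a sketch: the names ITEM_PATTERN and sample are made up for illustration, and the dot is escaped here so that it matches only a literal decimal point.

```python
import re

# Hypothetical name; the pattern mirrors the pieces listed above.
# In VERBOSE mode, whitespace in the pattern is ignored and # starts a comment.
ITEM_PATTERN = re.compile(r'''
    "(\d+)"      ,   # id: one or more digits inside double quotes
    "(.*?)"      ,   # name: shortest run of characters inside quotes
    "(.*?)"      ,   # description: same idea
    (\d+\.\d+)   ,   # first number: digits, literal dot, digits
    (\d+\.\d+)       # second number
''', re.VERBOSE | re.DOTALL)

sample = 'arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);'
rows = ITEM_PATTERN.findall(sample)
print(rows)  # [('1', 'Name1', 'description1', '1.000', '2.000')]
```

Each tuple holds the five captured groups as strings, so the numeric fields still need float() before they go into the dictionaries.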

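So the regex will match JS input that follows this expected pattern. Note that I am assuming there are no spaces around the commas in the JS. If there are, you can strip them out first and then run it, or allow optional whitespace in the pattern with \s*.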
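
If the page might put spaces around the commas, one option is to tolerate them in the pattern rather than stripping them beforehand. The following is a minimal sketch under stated assumptions: parse_items and ITEM_RE are hypothetical names, and names and descriptions are assumed to contain no double quotes (hence [^"]* instead of .*?).

```python
import re

# \s* around each comma tolerates optional whitespace between arguments.
# Assumption: quoted fields never contain a double quote themselves.
ITEM_RE = re.compile(
    r'"(\d+)"\s*,\s*"([^"]*)"\s*,\s*"([^"]*)"\s*,\s*(\d+\.\d+)\s*,\s*(\d+\.\d+)'
)

def parse_items(js_text):
    """Return (dict1, dict2), both keyed on name + description."""
    dict1, dict2 = {}, {}
    for _item_id, name, desc, first, second in ITEM_RE.findall(js_text):
        key = name + desc
        dict1[key] = float(first)
        dict2[key] = float(second)
    return dict1, dict2

spaced = 'arrayA[0] = new customItem("1", "Name1", "description1", 1.000 , 2.000);'
d1, d2 = parse_items(spaced)
print(d1)  # {'Name1description1': 1.0}
print(d2)  # {'Name1description1': 2.0}
```

The page text itself could be obtained with something like urllib.request.urlopen(url).read().decode() before being passed to parse_items, assuming the page is served in an encoding that decode() handles by default.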