Parsing JavaScript arrays into Python dictionaries
So I have a public web page that contains something like the following code:
var arrayA = new Array();
arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);arrayA[1] = new customItem("2","Name2","description2",4.000,8.000);
What I want to do is have Python read this page and convert the data into 2 dictionaries keyed by name + description.
i.e.
dict1["Name1Description1"] = 1.000
dict2["Name1Description1"] = 2.000
dict1["Name2Description2"] = 4.000
dict2["Name2Description2"] = 8.000
Is there an easy way to do this, or do we pretty much have to parse it like any other string? Obviously the array can be of any length.
Thanks!
Yes, this can be done with a regular expression.
import re
st = '''
var arrayA = new Array();
arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);arrayA[1] = new customItem("2","Name2","description2",4.000,8.000);
'''
dict1, dict2 = {}, {}
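# Raw string so the backslashes reach the regex engine untouched; \. matches a literal dot
matches = re.findall(r'\"(\d+)\",\"(.*?)\",\"(.*?)\",(\d+\.\d+),(\d+\.\d+)', st, re.DOTALL)
for m in matches:
    # m[0] is the id, m[1] the name, m[2] the description, m[3] and m[4] the two values
    key = m[1] + m[2]
    dict1[key] = float(m[3])
    dict2[key] = float(m[4])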
print(dict1)
print(dict2)
# {'Name1description1': 1.0, 'Name2description2': 4.0}
# {'Name1description1': 2.0, 'Name2description2': 8.0}
The logic of the regex is:
\" - Match a double quote
\"(\d+)\" - Match one or more digits enclosed in double quotes
\"(.*?)\" - Match the shortest run of any characters enclosed in double quotes
(\d+\.\d+) - Match one or more digits, a literal dot, then one or more digits
, - Match a comma
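As a rough illustration of the breakdown above, the same pattern can be written with re.VERBOSE so that each piece carries its own comment. The name ITEM_RE and the one-line sample string are only for illustration; they are not part of the original page.

```python
import re

# The pattern from the answer, one piece per line.
# re.VERBOSE makes the engine ignore the layout whitespace and the # comments.
ITEM_RE = re.compile(r'''
    "(\d+)",        # id: one or more digits in double quotes, then a comma
    "(.*?)",        # name: shortest run of characters in double quotes
    "(.*?)",        # description: same idea
    (\d+\.\d+),     # first value: digits, literal dot, digits
    (\d+\.\d+)      # second value
''', re.VERBOSE)

sample = 'arrayA[0] = new customItem("1","Name1","description1",1.000,2.000);'
print(ITEM_RE.findall(sample))
# [('1', 'Name1', 'description1', '1.000', '2.000')]
```

Each tuple returned by findall holds the five captured groups in order, which is why the loop in the answer indexes m[1] through m[4].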
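So the regex will match JS input that follows this expected pattern. Note that I assumed there are no spaces around the commas in the JS. If there are, you can strip them out first and then run it.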
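Instead of stripping the spaces beforehand, one option is to let the pattern tolerate them. The sketch below is only one way to do it: SPACED_ITEM_RE and parse_items are made-up names, and it still assumes the quoted strings contain no escaped quotes and that both numbers are written with a decimal point.

```python
import re

# Hypothetical variant of the pattern: \s* allows optional whitespace around each comma.
SPACED_ITEM_RE = re.compile(
    r'"(\d+)"\s*,\s*"(.*?)"\s*,\s*"(.*?)"\s*,\s*(\d+\.\d+)\s*,\s*(\d+\.\d+)'
)

def parse_items(js_text):
    """Return (dict1, dict2), both keyed by name + description."""
    dict1, dict2 = {}, {}
    for _item_id, name, desc, first, second in SPACED_ITEM_RE.findall(js_text):
        key = name + desc
        dict1[key] = float(first)
        dict2[key] = float(second)
    return dict1, dict2

js = 'arrayA[0] = new customItem("1", "Name1" , "description1", 1.000 ,2.000);'
print(parse_items(js))
# ({'Name1description1': 1.0}, {'Name1description1': 2.0})
```

For a live page, the text passed to parse_items could come from something like urllib.request.urlopen(url).read().decode(), with url standing in for the real address.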