How do I access data within data in Python?
I have a Python dictionary in this format:
test_scr = {
"visited_pages" : [ {
"visited_page_id" : {
"$oid" : "57d01dd3f1a475f7307b23d9"
}, "url" : "google.com",
"page_height" : "3986",
"visited_on" : {
"$date" : 1473256915000
}, "visited_page_clicks" : [ {
"x" : "887",
"y" : "35",
"page_height" : "3986",
"created" : {
"$date" : 1473256920000
}
} ],
"total_clicks" : 1,
"total_time_spent_in_minutes" : "0.10",
"total_mouse_moves" : 0
}, {
"visited_page_id" : {
"$oid" : "57d01dddf1a475a6377b23d4"
}, "url" : "google.com",
"page_height" : "3088",
"visited_on" : {
"$date" : 1473256925000
}, "visited_page_clicks" : [ {
"x" : "888",
"y" : "381",
"page_height" : "3088",
"created" : {
"$date" : 1473256934000
}
},{
"x" : "888",
"y" : "381",
"page_height" : "3088",
"created" : {
"$date" : 1473256935000
}
},{
"x" : "875",
"y" : "364",
"page_height" : "3088",
"created" : {
"$date" : 1473256936000
}
},{
"x" : "875",
"y" : "364",
"page_height" : "3088",
"created" : {
"$date" : 1473256936000
}
}, {
"x" : "875",
"y" : "364",
"page_height" : "3088",
"created" : {
"$date" : 1473256937000
}
},{
"x" : "1347",
"y" : "445",
"page_height" : "3088",
"created" : {
"$date" : 1473256942000
}
},{
"x" : "259",
"y" : "798",
"page_height" : "3018",
"created" : {
"$date" : 1473257244000
}
},{
"x" : "400",
"y" : "98",
"page_height" : "3088",
"created" : {
"$date" : 1473257785000
}
}],"total_clicks" : 8,
"total_time_spent_in_minutes" : "14.26",
"total_mouse_moves" : 0
}, {
"visited_page_id" : {
"$oid" : "57d0213ff1a475a6377b23d5"
},"url" : "google.com",
"page_height" : "3088",
"visited_on" : {
"$date" : 1473257791000
},"visited_page_clicks" : [ {
"x" : "805",
"y" : "425",
"page_height" : "3088",
"created" : {
"$date" : 1473257826000
}
}, {
"x" : "523",
"y" : "100",
"page_height" : "3088",
"created" : {
"$date" : 1473257833000
}
} ], "total_clicks" : 2,
"total_time_spent_in_minutes" : "0.47",
"total_mouse_moves" : 0
} ]
}
I need to extract only the x and y values from this dictionary and store them in matrix form in a data frame.
The output should look like this:
X Y
887 35
888 381
888 381
875 364
. .
. .
. .
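How can I do this?
Your dictionary is badly formatted in this post, but I wrote a quick little script that loops through the dictionary and collects the x and y values.
You can access dictionary values with the dictionary["key"] syntax. It returns the value or object stored under that key.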
# Two lists to store the x and y values in
x = []
y = []
# Get the list of visited pages from the dictionary
visited_pages = test_scr["visited_pages"]
# Loop through all the pages
for page in visited_pages:
    page_clicks = page["visited_page_clicks"]
    # Loop through all the clicks for the page
    for click in page_clicks:
        # Add the x and y values to the lists
        x.append(click["x"])
        y.append(click["y"])
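As a small illustration of the key-access syntax and the nested loop above, here is a self-contained sketch. The `sample` dictionary is a trimmed-down stand-in I made up with the same shape as `test_scr`. The `.get(..., [])` fallback and the `int()` conversion are assumptions on my part: the script above keeps the values as strings and assumes every page has a "visited_page_clicks" key.

```python
# Hypothetical, trimmed-down stand-in with the same nesting as test_scr.
sample = {
    "visited_pages": [
        {"url": "google.com",
         "visited_page_clicks": [{"x": "887", "y": "35"}]},
        {"url": "google.com",
         "visited_page_clicks": [{"x": "888", "y": "381"},
                                 {"x": "875", "y": "364"}]},
        # A page without any recorded clicks (assumed case, not in the post).
        {"url": "google.com"},
    ]
}

# Chained indexing walks down one level per dictionary key or list index.
first_x = sample["visited_pages"][0]["visited_page_clicks"][0]["x"]

xs, ys = [], []
for page in sample["visited_pages"]:
    # .get() returns the fallback instead of raising KeyError if the key is missing.
    for click in page.get("visited_page_clicks", []):
        xs.append(int(click["x"]))  # the values are stored as strings
        ys.append(int(click["y"]))

print(first_x)  # 887
print(xs)       # [887, 888, 875]
print(ys)       # [35, 381, 364]
```

If the keys are guaranteed to be present, plain `page["visited_page_clicks"]` as in the script above is simpler and fails loudly on malformed data.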
You can do this with a list comprehension:
coords = [[click['x'],click['y']] for page in test_scr['visited_pages'] for click in page['visited_page_clicks']]
There are several ways to convert this into a data frame or reshape it into the format you want.
Also, please format your code properly.
Output:
[['887', '35'],
['888', '381'],
['888', '381'],
['875', '364'],
['875', '364'],
['875', '364'],
['1347', '445'],
['259', '798'],
['400', '98'],
['805', '425'],
['523', '100']]
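To get from this list to the data frame asked for in the question, one option is pandas. The following is a minimal sketch, assuming pandas is installed. `coords` is re-declared here with the first few rows of the output above so the snippet runs on its own, and the column names `X` and `Y` are taken from the desired output in the question.

```python
import pandas as pd

# First rows of the coords list shown above, repeated so this runs standalone.
coords = [['887', '35'], ['888', '381'], ['888', '381'], ['875', '364']]

# Build the frame, then convert the string values to integers.
df = pd.DataFrame(coords, columns=["X", "Y"]).astype(int)

print(df)
```

If a plain matrix is needed rather than a data frame, `df.to_numpy()` returns the same values as a two-column NumPy array.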