How do I access data within data in Python?

I have a Python dictionary in this format:

test_scr = { 
    "visited_pages" : [ { 
          "visited_page_id" : { 
              "$oid" : "57d01dd3f1a475f7307b23d9" 
          }, "url" : "google.com", 
         "page_height" : "3986", 
         "visited_on" : { 
             "$date" : 1473256915000 
          }, "visited_page_clicks" : [ { 
                "x" : "887", 
                "y" : "35", 
                "page_height" : "3986", 
                "created" : { 
                    "$date" : 1473256920000 
                 } 
            } ], 
         "total_clicks" : 1, 
         "total_time_spent_in_minutes" : "0.10", 
         "total_mouse_moves" : 0 
      }, { 
          "visited_page_id" : { 
              "$oid" : "57d01dddf1a475a6377b23d4" 
          }, "url" : "google.com", 
         "page_height" : "3088", 
         "visited_on" : { 
             "$date" : 1473256925000 
          }, "visited_page_clicks" : [ {
                "x" : "888", 
                "y" : "381", 
                "page_height" : "3088", 
                "created" : { 
                    "$date" : 1473256934000 
                 } 
             },{
                "x" : "888", 
                "y" : "381", 
                "page_height" : "3088",
                "created" : { 
                    "$date" : 1473256935000 
                 } 
             },{ 
                 "x" : "875", 
                 "y" : "364",
                 "page_height" : "3088",
                  "created" : { 
                     "$date" : 1473256936000 
                 } 
             },{ 
                 "x" : "875",
                 "y" : "364",
                 "page_height" : "3088",
                 "created" : { 
                      "$date" : 1473256936000 
                  } 
             }, {
                 "x" : "875", 
                 "y" : "364",
                 "page_height" : "3088",
                 "created" : {
                      "$date" : 1473256937000 
                  } 
             },{ 
                 "x" : "1347",
                 "y" : "445", 
                 "page_height" : "3088", 
                 "created" : { 
                      "$date" : 1473256942000 
                  } 
             },{ 
                  "x" : "259", 
                  "y" : "798", 
                  "page_height" : "3018", 
                  "created" : { 
                       "$date" : 1473257244000 
                  } 
             },{ 
                  "x" : "400", 
                  "y" : "98", 
                  "page_height" : "3088",
                  "created" : { 
                       "$date" : 1473257785000 
                  } 
             }],"total_clicks" : 8, 
                "total_time_spent_in_minutes" : "14.26", 
                "total_mouse_moves" : 0 
         }, { 
            "visited_page_id" : { 
                    "$oid" : "57d0213ff1a475a6377b23d5" 
            },"url" : "google.com",
            "page_height" : "3088",
            "visited_on" : { 
                    "$date" : 1473257791000 
            },"visited_page_clicks" : [ { 
                  "x" : "805", 
                  "y" : "425", 
                  "page_height" : "3088", 
                  "created" : { 
                        "$date" : 1473257826000 
                  } 
              }, {
                  "x" : "523", 
                  "y" : "100", 
                  "page_height" : "3088", 
                  "created" : { 
                        "$date" : 1473257833000 
                  } 
            } ], "total_clicks" : 2, 
            "total_time_spent_in_minutes" : "0.47", 
            "total_mouse_moves" : 0 
        } ]
    }

I need to extract only the X and Y values from this dictionary and store them in a data frame in matrix form. The output should look like this:

X       Y
887     35
888     381
888     381
875     364
.        .
.        .
.        .

How can I do this?

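Your dictionary is badly formatted in this post, but I wrote a quick little script that loops through it and collects the x and y values.
You can access dictionary values with the dictionary["key"] syntax. It returns the value or object stored under that key.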
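
As a minimal sketch of that lookup syntax, the snippet below chains one lookup per nesting level. The name sample and its contents are assumptions for illustration: it has the same shape as test_scr but is cut down to one page and one click.

```python
# Hypothetical, trimmed-down stand-in for test_scr (one page, one click).
sample = {
    "visited_pages": [
        {
            "url": "google.com",
            "visited_page_clicks": [
                {"x": "887", "y": "35", "created": {"$date": 1473256920000}}
            ],
        }
    ]
}

# One lookup per nesting level: dict key, list index, dict key, list index.
first_page = sample["visited_pages"][0]
first_click = first_page["visited_page_clicks"][0]

print(first_click["x"], first_click["y"])  # 887 35
print(first_click["created"]["$date"])     # 1473256920000

# A missing key raises KeyError; dict.get returns a default instead.
print(first_page.get("total_clicks", 0))   # 0
```

The same chain should work on test_scr itself; the loop that follows replaces the fixed [0] indexes with iteration over every page and every click.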

# Two lists to store the x and y values in    
x = []
y = []

# Store the visited_pages object in a list
visited_pages = test_scr["visited_pages"]

# Loop through all the pages
for page in visited_pages:
    page_clicks = page["visited_page_clicks"]
    # Loop through all the clicks for the page
    for click in page_clicks:
        # Add the x and y values to the lists
        x.append(click["x"])
        y.append(click["y"])

You can do this with a list comprehension:

coords = [[click['x'],click['y']] for page in test_scr['visited_pages'] for click in page['visited_page_clicks']]

You can then convert it to a data frame, or reshape it into whatever format you want, using any of several techniques.

Also, please format your code properly.

Output:

[['887', '35'],
['888', '381'],
['888', '381'],
['875', '364'],
['875', '364'],
['875', '364'],
['1347', '445'],
['259', '798'],
['400', '98'],
['805', '425'],
['523', '100']]
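
One possible way to get from this list to the X/Y data frame the question asks for is sketched below. It assumes pandas is installed. The coords list here is only the first four pairs from the output above, and the int cast is an assumption that every value is a numeric string.

```python
import pandas as pd

# First four pairs from the output above; in practice use the full coords list.
coords = [["887", "35"], ["888", "381"], ["888", "381"], ["875", "364"]]

# Each inner list becomes one row. The values are strings, so cast them to int.
df = pd.DataFrame(coords, columns=["X", "Y"]).astype(int)

print(df)

# If a plain matrix is preferred, take the underlying NumPy array.
matrix = df.to_numpy()
```

If some values might not be numeric, skipping the astype(int) call keeps them as strings rather than raising an error.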