Google Sheets: Importing numbers from website using ImportXML

Question

I have no coding experience!

I can't manage to scrape data from a website into my Google spreadsheet. I want to pull the number of observations from this page into my spreadsheet.

I tried the following, but honestly I don't know what I'm doing:

=IMPORTXML(A3,"//*[@id="obsstatcol"]/div/div[1]")

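A3 holds the URL of the page above; the rest is a mash-up of a few tutorials I found, with which I tried to scrape the observation count off the page.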

Can anyone help me understand what I'm actually trying to do and offer some suggestions?

Thanks in advance

Answer 1

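Nice try! Unfortunately, the observation count isn't filled in until after the page has loaded, and ImportXML only sees the raw HTML the server sends, before any scripts run. This means your formula: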

=IMPORTXML(A3,"//*[@id=""obsstatcol""]/div/div[1]")

returns

{{ shared.numberWithCommas( totalObservations ) }}

So in this case ImportXML() alone won't work.

However, all is not lost. I opened the network monitor with F12 and saw that the page makes a network request to this URL:

https://api.inaturalist.org/v1/observations/observers?verifiable=any&quality_grade=needs_id&user_id=ericthuranira&locale=en-US

to fetch the observation data, which appears to be in JSON format. For example (formatted for readability):

{
  "total_results": 1,
  "page": 1,
  "per_page": 500,
  "results": [
    {
      "user_id": 1265521,
      "observation_count": 121,
      "species_count": 42,
      "user": {
        "id": 1265521,
        "login": "ericthuranira",
        "spam": false,
        "suspended": false,
        "created_at": "2018-10-09T11:43:22+00:00",
        "login_autocomplete": "ericthuranira",
        "login_exact": "ericthuranira",
        "name": "Eric Thuranira",
        "name_autocomplete": "Eric Thuranira",
        "orcid": null,
        "icon": "https://static.inaturalist.org/attachments/users/icons/1265521/thumb.jpeg?1580369132",
        "observations_count": 237,
        "identifications_count": 203,
        "journal_posts_count": 0,
        "activity_count": 440,
        "species_count": 150,
        "universal_search_rank": 237,
        "roles": [],
        "site_id": 1,
        "icon_url": "https://static.inaturalist.org/attachments/users/icons/1265521/medium.jpeg?1580369132"
      }
    }
  ]
}
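To make it concrete where the number sits in that response, here is a minimal sketch in JavaScript (the language Google Apps Script uses). It works on a trimmed copy of the JSON above; the helper name `firstObservationCount` is made up for illustration and is not part of any library.

```javascript
// Trimmed copy of the API response shown above (only the fields used here).
const responseText = JSON.stringify({
  total_results: 1,
  page: 1,
  per_page: 500,
  results: [
    { user_id: 1265521, observation_count: 121, species_count: 42 }
  ]
});

// Hypothetical helper: return observation_count of the first entry in
// "results", or null when the response has no results.
function firstObservationCount(jsonText) {
  const data = JSON.parse(jsonText);
  const first = (data.results || [])[0];
  return first ? first.observation_count : null;
}

console.log(firstObservationCount(responseText)); // 121
```

The value you want is `observation_count` inside the first element of `results`, not the `observations_count` field nested under `user`.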

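That response is JSON rather than XML, so ImportXML can't read it and you need a JSON parser instead. Fortunately, someone has written one for Google Sheets! You can easily get it yourself by doing the following: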

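Paste the code from here into your script editor (Tools > Script editor) and save it as ImportJSON. This gives you the JSON parser.
Take the "api" URL for observers that I mentioned above and use it with this formula (assuming the URL is in A3):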
```
=ImportJSON(A3,"/results/observation_count","noHeaders")
```

This will give you the number you need.
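If you're curious what a query like "/results/observation_count" means, the sketch below shows the general idea of following a slash-separated path through parsed JSON. It is only an illustration under my own assumptions, not ImportJSON's actual code, and `selectPath` is a made-up name.

```javascript
// Hypothetical helper: follow a slash-separated path through parsed JSON,
// fanning out over arrays so that "/results/observation_count" reaches
// the field in every element of "results".
function selectPath(data, path) {
  const keys = path.split("/").filter(Boolean); // drop the empty leading piece
  let nodes = [data];
  for (const key of keys) {
    const next = [];
    for (const node of nodes) {
      // Treat a single object like a one-element array.
      const items = Array.isArray(node) ? node : [node];
      for (const item of items) {
        if (item !== null && typeof item === "object" && key in item) {
          next.push(item[key]);
        }
      }
    }
    nodes = next;
  }
  return nodes;
}

// Trimmed version of the response from the answer above.
const response = {
  total_results: 1,
  results: [{ user_id: 1265521, observation_count: 121, species_count: 42 }]
};

console.log(selectPath(response, "/results/observation_count")); // array containing 121
```

Each path segment names one level of the JSON, which is why the formula points at `results` first and `observation_count` second.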


web-scraping

google-sheets-importxml