How to call a paginated REST API in Clojure?

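I'm trying to convert some working Ruby code, which calls a paginated REST API and accumulates the data, into Clojure. The Ruby code makes an initial API call, checks whether there is a pagination.hasNextPage key, and then, in a while loop, uses pagination.endCursor as a query string parameter for the next API call. Here is the simplified Ruby code (logging/error handling code removed):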

def request_paginated_data(url)
  results = []

  response = # ... http get url
  response_data = response['data']

  results << response_data

  while !response_data.nil? && response.has_key?('pagination') && response['pagination'] && response['pagination'].has_key?('hasNextPage') && response['pagination']['hasNextPage'] && response['pagination'].has_key?('endCursor') && response['pagination']['endCursor']
    response = # ... http get url + response['pagination']['endCursor']
    response_data = response['data']

    results << response_data
  end

  results

end

Here is the start of my Clojure code:

(defn get-paginated-data [url options]
  {:pre [(some? url) (some? options)]}
  (let [body (:body @(client/get url options))]
    (log/debug (str "body size =" (count body)))
    (let [json (json/read-str body :key-fn keyword)]
      (log/debug (str "json =" json))
      ;; ???
      )))

I know I can use contains? to look for a key in the json clojure.lang.PersistentArrayMap, but I'm not sure how to write the rest of the code...

You probably want something like this:

(let [data    (json/read-str body :key-fn keyword)
      hnp     (get-in data [:pagination :hasNextPage])
      ec      (get-in data [:pagination :endCursor])
      continue? (and hnp ec)  ]
  (println :hnp hnp)
  (println :ec ec)
  (println :cont continue?)

...)

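This extracts the nested bits and prints some debugging information. Double-check that the JSON-to-Clojure conversion produced the "CamelCase" keywords (e.g. :hasNextPage) as expected, and adjust the code to match if necessary.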
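As a rough illustration of that check, here is a minimal sketch that works on already-parsed maps, so it needs no HTTP or JSON library. The function name continue? and the sample maps are made up for the example, and it assumes the cursor is a string; the key names follow the question.

```clojure
;; Hypothetical helper: decides whether another page should be requested.
(defn continue?
  "Returns true when the parsed response has a truthy :hasNextPage
  and a non-empty :endCursor string under :pagination."
  [data]
  (let [has-next-page (get-in data [:pagination :hasNextPage])
        end-cursor    (get-in data [:pagination :endCursor])]
    ;; get-in returns nil for any missing key along the path,
    ;; so no explicit contains? checks are needed.
    (boolean (and has-next-page (seq end-cursor)))))

;; Made-up sample responses, shaped like the API described in the question.
(def sample-page {:data [1 2 3] :pagination {:hasNextPage true :endCursor "abc"}})
(def last-page   {:data [4]     :pagination {:hasNextPage false :endCursor nil}})
```

Calling (continue? sample-page) yields true, while (continue? last-page) and (continue? {}) both yield false, which replaces the long chain of has_key? checks in the Ruby version.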


You may find my favorite template project helpful, especially the list of documentation at the end. Be sure to read the Clojure CheatSheet!

In the past, I have used loop/recur for this kind of thing.

Here is an example that queries the Jira API:

(defn get-jira-data [from to url hdrs]
  (loop [result     []
         n          0
         api-offset 0]
    (println n " " (count result) " " api-offset)
    (let [body (jql-body from to api-offset)
          resp (client/post url
                            {:headers hdrs
                             :body body})
          issues (-> resp
                     :body
                     (json/read-str :key-fn keyword)
                     :issues)
          returned-count (count issues)

          intermediate-res (into result issues)]
      (if (and (pos? returned-count)
               (< (inc n) MAX-PAGED-PAGES))
        (recur intermediate-res
               (inc n)
               (+ api-offset returned-count))
        intermediate-res))))

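I would suggest limiting the recursion to a maximum number of pages, to avoid unforeseen and unpleasant surprises in production. With the Jira API, you send the offset or page needed for the next iteration in the body of the request. If you use the GitHub API, for example, you will need a local binding for the URL in the loop call.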
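To make the page cap concrete, here is a small self-contained sketch of the same loop/recur pattern. It is not the Jira code: fetch-page is a hypothetical stand-in that slices an in-memory collection instead of doing an HTTP request, and max-pages is an assumed limit.

```clojure
;; Assumed cap on the number of requests; pick a value that suits your API.
(def max-pages 3)

;; Hypothetical stand-in for the real HTTP call: returns up to page-size
;; items starting at offset from an in-memory collection.
(defn fetch-page [all-items offset page-size]
  (->> all-items (drop offset) (take page-size) vec))

(defn fetch-all
  "Accumulates pages until an empty page comes back or max-pages
  requests have been made, whichever happens first."
  [all-items page-size]
  (loop [result [] n 0 offset 0]
    (let [items  (fetch-page all-items offset page-size)
          result (into result items)]
      (if (and (seq items) (< (inc n) max-pages))
        (recur result (inc n) (+ offset (count items)))
        result))))
```

With ten items and a page size of two, the loop stops after three pages even though more data is available; with three items it stops as soon as an empty page is returned.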

Speaking of the GitHub API: it sends the relevant URLs as HTTP headers in the response. You can use them like this:

(loop [result []
       u      url
       n      0]
  (log/debugf "Get JSON paged (%s previous results) from %s"
              (count result) u)
  (let [resp             (http-get-with-retry u {:headers auth-hdr})
        data             (-> resp :body
                             (json/read-str :key-fn keyword))
        intermediate-res (into result data)
        next-url         (-> resp :links :next :href)]
    (if (and next-url
             data
             (pos? (count data))
             (<= n MAX-PAGED-PAGES))
      (recur intermediate-res next-url (inc n))
      intermediate-res)))

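You will need to infer the missing functions and other variables here. http-get-with-retry is essentially just an HTTP GET with retry handling added. The pattern is the same as the one you saw above; it just uses the corresponding link from the response and a local url binding.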
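Since http-get-with-retry is not shown, here is one way such a wrapper might look. This is only a guess at its shape, not the actual code behind the example: with-retry, its arguments, and the fixed delay between attempts are all assumptions. The request is passed in as a zero-argument function so the sketch does not depend on any particular HTTP client.

```clojure
;; Hypothetical retry wrapper; not part of any library.
(defn with-retry
  "Calls the zero-argument function f, retrying when it throws, for at
  most max-attempts calls in total. Sleeps delay-ms between attempts and
  rethrows the last exception if every attempt fails."
  [max-attempts delay-ms f]
  (loop [attempt 1]
    (let [outcome (try
                    {:value (f)}
                    (catch Exception e
                      ;; recur is not allowed inside try, so signal a
                      ;; retry with a marker value and recur outside.
                      (if (< attempt max-attempts)
                        ::retry
                        (throw e))))]
      (if (= outcome ::retry)
        (do (Thread/sleep delay-ms)
            (recur (inc attempt)))
        (:value outcome)))))
```

Under these assumptions, the call in the loop above could be written as (with-retry 3 1000 #(client/get u {:headers auth-hdr})).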

In addition to the standard license on Stack Overflow, I hereby place all of the code above under the Apache Software License 2.0.

Here is the final result after applying the suggestions from Stefan Kamphausen and Alan Thompson:

(defn get-paginated-data [^String url ^clojure.lang.PersistentArrayMap options ^clojure.lang.Keyword data-key]
  {:pre [(some? url) (some? options)]}
  (loop [results [] u url page 1]
    (log/debugf "Requesting data from API url=%s page=%d" u page)
    (let [body                (:body @(client/get u options))
          body-map            (json/read-str body :key-fn keyword)
          data                (get-in body-map [data-key])
          has-next-page       (get-in body-map [:pagination :hasNextPage])
          end-cursor          (get-in body-map [:pagination :endCursor])
          accumulated-results (into results data)
          continue?           (and has-next-page (> (count end-cursor) 0))]
      (log/debugf "count body=%d" (count body))
      (log/debugf "count results=%s" (count results))
      (log/debugf "has-next-page=%s" has-next-page)
      (log/debugf "end-cursor=%s" end-cursor)
      (log/debugf "continue?=%s" continue?)
      (if continue?
        (let [next-url (str url "?after=" end-cursor)]
          (log/info (str "Sleeping for " (/ pagination-delay 1000) " seconds..."))
          (Thread/sleep pagination-delay)
          (recur accumulated-results next-url (inc page)))
        accumulated-results))))