How to call a paginated REST API in Clojure?
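I'm trying to convert some working Ruby code, which calls a paginated REST API and accumulates the data, to Clojure. The Ruby code makes an initial call to the API, checks whether the response has a pagination.hasNextPage key, and then uses pagination.endCursor as a query string parameter for the following API calls, which are made in a while loop. Here is the simplified Ruby code (logging, error handling, etc. removed):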
def request_paginated_data(url)
  results = []
  response = # ... http get url
  response_data = response['data']
  results << response_data
  while !response_data.nil? &&
        response.has_key?('pagination') && response['pagination'] &&
        response['pagination'].has_key?('hasNextPage') && response['pagination']['hasNextPage'] &&
        response['pagination'].has_key?('endCursor') && response['pagination']['endCursor']
    response = # ... http get url + response['pagination']['endCursor']
    response_data = response['data']
    results << response_data
  end
  results
end
Here is the start of my Clojure code:
(defn get-paginated-data [url options]
  {:pre [(some? url) (some? options)]}
  (let [body (:body @(client/get url options))]
    (log/debug (str "body size =" (count body)))
    (let [json (json/read-str body :key-fn keyword)]
      (log/debug (str "json =" json))))
  ;; ???
  )
I know I can use contains? to look up keys in the json clojure.lang.PersistentArrayMap, but I'm not sure how to write the rest of the code...
You probably want something like this:
(let [data      (json/read-str body :key-fn keyword)
      hnp       (get-in data [:pagination :hasNextPage])
      ec        (get-in data [:pagination :endCursor])
      continue? (and hnp ec)]
  (println :hnp hnp)
  (println :ec ec)
  (println :cont continue?)
  ...)
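This extracts the nested bits and prints some debugging info. Double-check that the JSON-to-Clojure conversion produces the camelCase keywords (:hasNextPage, :endCursor) as expected, and adjust the key paths to match if it does not.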
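As a minimal sketch of why get-in is enough here (no contains? checks needed): get-in returns nil when any key along the path is missing, so the Ruby has_key? chain collapses into a single and. The sample maps below are made up for illustration and only mimic the shape described in the question.

```clojure
;; Hypothetical parsed responses; the shapes are assumed from the question.
(def page-with-next {:data [1 2] :pagination {:hasNextPage true :endCursor "abc"}})
(def last-page      {:data [3]   :pagination {:hasNextPage false}})
(def no-pagination  {:data []})

(defn continue?
  "True only when the page says there is a next page and provides a cursor.
  get-in yields nil for any missing key, so absent maps are handled too."
  [page]
  (boolean (and (get-in page [:pagination :hasNextPage])
                (get-in page [:pagination :endCursor]))))

(println (map continue? [page-with-next last-page no-pagination]))
```

Wrapping the result in boolean is optional; it only keeps the return value from being the cursor string or nil.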
You may find my favorite template project helpful, especially the list of documentation at the end. Be sure to read the Clojure CheatSheet!
In the past, I have used loop and recur for these things.
Here is an example that queries the Jira API:
(defn get-jira-data [from to url hdrs]
  (loop [result     []
         n          0
         api-offset 0]
    (println n " " (count result) " " api-offset)
    (let [body             (jql-body from to api-offset)
          resp             (client/post url
                                        {:headers hdrs
                                         :body    body})
          issues           (-> resp
                               :body
                               (json/read-str :key-fn keyword)
                               :issues)
          returned-count   (count issues)
          intermediate-res (into result issues)]
      ;; Stop on an empty page or once the page cap is reached.
      (if (and (pos? returned-count)
               (< (inc n) MAX-PAGED-PAGES))
        (recur intermediate-res
               (inc n)
               (+ api-offset returned-count))
        intermediate-res))))
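I'd suggest limiting the recursion to a maximum number of pages, to avoid unforeseen and unpleasant surprises in production. With the Jira API, you send the offset or page needed for the next iteration in the body of the request. If you are using the GitHub API, for example, you need a local binding for the URL in the loop call instead.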
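To illustrate the page cap in the cursor-based setting from the question, here is a minimal sketch. The names fetch-all, fetch-page and endless-api are hypothetical: fetch-page stands in for whatever function performs the HTTP call and returns the parsed body, and endless-api is a fake that never runs out of pages, which is exactly the situation the cap protects against.

```clojure
(defn fetch-all
  "Calls (fetch-page cursor) starting with a nil cursor and accumulates :data.
  Stops when the API reports no next page, when the cursor is empty,
  or after max-pages requests, whichever comes first."
  [fetch-page max-pages]
  (loop [results []
         cursor  nil
         page    1]
    (let [{:keys [data pagination]} (fetch-page cursor)
          results     (into results data)
          next-cursor (:endCursor pagination)]
      (if (and (:hasNextPage pagination)
               (seq next-cursor)        ; nil or "" ends the loop
               (< page max-pages))      ; hard cap on the number of requests
        (recur results next-cursor (inc page))
        results))))

;; Fake API that always claims there is another page.
(defn endless-api [cursor]
  {:data       [(or cursor "start")]
   :pagination {:hasNextPage true :endCursor "next"}})

(println (count (fetch-all endless-api 5))) ; prints 5
```

Passing the fetch function in as an argument is a design choice for this sketch only; it makes the loop testable without a network.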
Speaking of the GitHub API: it sends the relevant URLs as HTTP headers in the response. You can use them like this:
(loop [result []
       u      url
       n      0]
  (log/debugf "Get JSON paged (%s previous results) from %s"
              (count result) u)
  (let [resp             (http-get-with-retry u {:headers auth-hdr})
        data             (-> resp :body
                             (json/read-str :key-fn keyword))
        intermediate-res (into result data)
        ;; URL of the next page, taken from the response's link headers
        next-url         (-> resp :links :next :href)]
    (if (and next-url
             data
             (pos? (count data))
             (<= n MAX-PAGED-PAGES))
      (recur intermediate-res next-url (inc n))
      intermediate-res)))
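You will need to infer the missing functions and other vars here. http-get-with-retry is essentially just an HTTP GET with an added retry handler. The pattern is the same as the one above; it just uses the respective link from the response and a local url binding.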
In addition to the standard license on Stack Overflow, I hereby put all of the code above under the Apache Software License 2.0.
Here is the final result after applying the suggestions from Stefan Kamphausen and Alan Thompson:
;; Aliases are not shown in the original; presumably client is an HTTP client
;; whose get returns a promise (hence the @), json is clojure.data.json and
;; log is clojure.tools.logging. pagination-delay (in milliseconds) is
;; assumed to be defined elsewhere.
(defn get-paginated-data [^String url ^clojure.lang.PersistentArrayMap options ^clojure.lang.Keyword data-key]
  {:pre [(some? url) (some? options)]}
  (loop [results [] u url page 1]
    (log/debugf "Requesting data from API url=%s page=%d" u page)
    (let [body                (:body @(client/get u options))
          body-map            (json/read-str body :key-fn keyword)
          data                (get-in body-map [data-key])
          has-next-page       (get-in body-map [:pagination :hasNextPage])
          end-cursor          (get-in body-map [:pagination :endCursor])
          accumulated-results (into results data)
          continue?           (and has-next-page (> (count end-cursor) 0))]
      (log/debugf "count body=%d" (count body))
      (log/debugf "count results=%s" (count results))
      (log/debugf "has-next-page=%s" has-next-page)
      (log/debugf "end-cursor=%s" end-cursor)
      (log/debugf "continue?=%s" continue?)
      (if continue?
        ;; The next URL is always built from the base url, not from u.
        (let [next-url (str url "?after=" end-cursor)]
          (log/info (str "Sleeping for " (/ pagination-delay 1000) " seconds..."))
          (Thread/sleep pagination-delay)
          (recur accumulated-results next-url (inc page)))
        accumulated-results))))
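For comparison, a different technique from the loop/recur used above: Clojure 1.11 added clojure.core/iteration, which was designed for this kind of cursor-driven consumption. The sketch below is only an illustration under assumptions. fake-pages and fetch-page are made-up stand-ins for the real HTTP call, and the response shape is assumed from the question.

```clojure
;; Requires Clojure 1.11 or later for clojure.core/iteration.

;; Hypothetical in-memory "API": cursor -> parsed response body.
(def fake-pages
  {nil  {:data [1 2] :pagination {:hasNextPage true  :endCursor "c1"}}
   "c1" {:data [3 4] :pagination {:hasNextPage true  :endCursor "c2"}}
   "c2" {:data [5]   :pagination {:hasNextPage false :endCursor nil}}})

(defn fetch-page
  "Stand-in for the HTTP GET; a real version would request the url with the cursor."
  [cursor]
  (get fake-pages cursor))

(defn next-cursor
  "Returns the cursor for the following page, or nil to stop the iteration."
  [page]
  (let [{:keys [hasNextPage endCursor]} (:pagination page)]
    (when (and hasNextPage (seq endCursor))
      endCursor)))

(defn all-data []
  (into []
        cat                            ; flatten the per-page :data vectors
        (iteration fetch-page          ; step fn, first called with :initk (nil)
                   :vf :data           ; what to emit for each page
                   :kf next-cursor)))  ; next key; nil ends the iteration

(println (all-data)) ; prints [1 2 3 4 5]
```

This version has no page cap and no delay between requests; wrapping the iteration in (take n ...) or sleeping inside the step function would be one way to add them.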