Instagram Basic Display API Pagination

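Is there a way to paginate the media results obtained with the Instagram Basic Display API? I have read the documentation, but it doesn't include any examples of using pagination.

I would like to limit the media returned in the response, e.g. media 1-15 for the first call, and then get the next set, e.g. 16-30, in the next call.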

TIA

Found an answer by using the pagination parameters from this documentation: https://developers.facebook.com/docs/graph-api/using-graph-api#paging

Currently, the Basic Display API returns the 20 most recent media by default. If you want more or fewer than that, use the following url:

https://graph.instagram.com/{user-id}/media?fields={media-fields-you-want-to-return}&access_token={access-token}&limit={number-of-media-you-want-to-return}
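
As a rough sketch, the same url can be assembled in Python with the standard library so that the field list and the token are escaped properly. The helper name `build_media_url` and the sample values are made up for illustration; only the endpoint and the `fields`, `access_token` and `limit` parameters come from the url above.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.instagram.com"

def build_media_url(user_id, access_token, fields, limit=20):
    """Build the first-page media url (hypothetical helper, not part of any SDK)."""
    query = urlencode({
        "fields": ",".join(fields),   # comma-separated list of media fields
        "access_token": access_token,
        "limit": limit,               # how many media items to ask for
    })
    return f"{GRAPH_BASE}/{user_id}/media?{query}"

# Placeholder user id and token, for illustration only
url = build_media_url("12345", "TOKEN", ["id", "media_type", "media_url"], limit=15)
print(url)
```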

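To paginate, you need a "next" endpoint to call. To try this out, limit your first call to fewer media than you actually have. The response should then include three paging values: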

    "paging": {
              "cursors": {
                       "before": "abc",
                       "after": "def"
               },
              "next": "ghi"
    }

Now add your next endpoint to the original url above: https://graph.instagram.com/{user-id}/media?fields={media-fields-you-want-to-return}&access_token={access-token}&limit={number-of-media-you-want-to-return}&next={next-endpoint}
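
For illustration, here is a small sketch of pulling the paging values out of a response that has already been parsed into a dict. The function names are hypothetical, and the sketch assumes that either the "paging" object or its "next" entry may be missing once the last page has been reached.

```python
def next_page_url(response):
    """Return the "next" value, or None when there are no further pages."""
    return response.get("paging", {}).get("next")

def after_cursor(response):
    """Return the "after" cursor, or None if the response has no cursors."""
    return response.get("paging", {}).get("cursors", {}).get("after")

# Same shape as the paging block shown above, with placeholder values
sample = {
    "data": [],
    "paging": {"cursors": {"before": "abc", "after": "def"}, "next": "ghi"},
}
print(next_page_url(sample))
print(after_cursor(sample))
print(next_page_url({"data": []}))  # no paging block at all
```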

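I was not able to get things working with the answer by CDS. Instead, I used an approach that looks for the "next" tag in the returned JSON string and uses that url directly.

In my case, I have built a Storage Access Framework (SAF) implementation for Instagram, so here is the flow:

In the "add rows" call that the SAF makes to my provider, I do the initial query of Instagram: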

 instagramQueryResult = queryInstagramAccount(instagramUserID, null); // Initially no "next" url

That method in turn looks like:

private JSONObject queryInstagramAccount(String instagramUserID, String nextPageUrl) {
    String instagramToken = InTouchUtils.getInstagramAccessToken();
    if ( instagramToken == null || DEFAULT_MEDIA_SERVICE_ACCESS_TOKEN_DEFAULT.equals(instagramToken)) {
        return null;
    }
    // Returned from Instagram
    String instagramRetval = null;
    // What we send back from this method - normalized list of media plus any pagination data.
    JSONObject returnResult = null;
    // Used to build a normalized array of media objects, flattening out "CAROUSEL_ALBUM" return types
    JSONArray dataArray = new JSONArray(), returnedArray = null;
    // Initial response from Instagram as JSON prior to normalization
    JSONObject instagramJSONResult = null;
    // Parameters for the Volley call
    HashMap<String,String> params = new HashMap<>();
    params.put(INSTAGRAM_ACCESSTOKEN_KEY, InTouchUtils.getInstagramAccessToken());

    // Build the query string
    String url = null;
    if ( nextPageUrl == null ) {
        url = INSTAGRAM_GRAPH_URI + instagramUserID + MEDIA_MEDIA_EDGE;
        String fieldsString = MEDIA_ID_KEY + "," +
                MEDIA_TYPE_KEY + "," +
                MEDIA_URL_KEY + "," +
                MEDIA_THUMBNAIL_URL_KEY + "," +
                MEDIA_UPDATED_TIME_KEY;
        params.put(MEDIA_LIMIT_KEY, Long.toString(batchSize));
        params.put(MEDIA_FIELDS_KEY, fieldsString);
    } else {
        // We've been given the fully created url to use
        url = nextPageUrl;
        params = null;
    }

    try {
        instagramRetval = InTouchUtils.callWebsiteFunction(url, params);
        instagramJSONResult = new JSONObject(instagramRetval);
        returnedArray = instagramJSONResult.getJSONArray(MEDIA_DATA_ARRAY);
        if ( returnedArray.length() == 0) {
            return null;
        }
        for ( int i = 0; i < returnedArray.length(); i++) {
            JSONObject o = returnedArray.getJSONObject(i);
            // this result could have types IMAGE, VIDEO or CAROUSEL_ALBUM. The latter type
            // needs a subsequent call to get the children info
            if (o.getString(MEDIA_TYPE_KEY).equals(MEDIA_TYPE_CAROUSEL)) {
                // Here we need to make a separate call to get the carousel detail
                String mediaID = null;
                try {
                    mediaID = o.getString(MEDIA_ID_KEY);
                    String childrenEdgeUrl = INSTAGRAM_GRAPH_URI + mediaID + MEDIA_CHILDREN_EDGE;
                    params = new HashMap<>();
                    params.put(INSTAGRAM_ACCESSTOKEN_KEY, InTouchUtils.getInstagramAccessToken());
                    String mediafieldsString = MEDIA_ID_KEY + "," +
                            MEDIA_TYPE_KEY + "," +
                            MEDIA_URL_KEY + "," +
                            MEDIA_THUMBNAIL_URL_KEY + "," +
                            MEDIA_UPDATED_TIME_KEY;
                    params.put(MEDIA_FIELDS_KEY, mediafieldsString);
                    String carouselRetval = InTouchUtils.callWebsiteFunction(childrenEdgeUrl, params);
                    JSONObject carouselJSON = new JSONObject(carouselRetval);
                    // Cycle through these entries
                    JSONArray carouselData = carouselJSON.getJSONArray(MEDIA_DATA_ARRAY);
                    if ( carouselData != null && carouselData.length() > 0) {
                        for ( int x = 0; x < carouselData.length(); x++) {
                            dataArray.put(carouselData.getJSONObject(x));
                        }
                    }

                } catch (Exception e) {
                    Timber.d("Lifecycle: Exception processing carousel entry with ID %s, message: %s", mediaID, e.getMessage());
                }
            } else {
                // Add to dataArray
                dataArray.put(o);
            }
        }

    } catch (Exception e) {
        Timber.e("Exception getting Instagram info: %s", e.getMessage());
        return null;
    } finally  {
        returnedArray = null;
        instagramRetval = null;
    }

    // See if there is pagination
    JSONObject pagingObject = null;
    try {
        pagingObject = instagramJSONResult.getJSONObject(MEDIA_PAGING_KEY);
    } catch (JSONException e) {
        // No paging returned, no problem
        pagingObject = null;
    }
    returnResult = new JSONObject();
    try {
        returnResult.put(MEDIA_DATA_ARRAY, dataArray);
        if ( pagingObject != null ) {
            returnResult.put(MEDIA_PAGING_KEY, pagingObject);
        }
    } catch (JSONException e) {
        Timber.d("Lifecycle: exception gathering instagram data: %s", e.getMessage());
        returnResult = null;
    } finally {
        instagramJSONResult = null;
    }
    return returnResult;
}

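The initial check has to do with the constant DEFAULT_MEDIA_SERVICE_ACCESS_TOKEN_DEFAULT, which is initialized elsewhere in my DocumentsProvider as a default value. If the token still has that value, the user has not yet entered their Instagram credentials, so in that case I bail out.

Where you see calls to "InTouchUtils", this is my own class that encapsulates a bunch of utility functions, such as using Volley to make web API calls.

This method gets called from a couple of places in the DocumentsProvider, so one of the parameters tells it whether I am processing a nextPageUrl. If not (nextPageUrl is null), we construct the default URL, in which I call the Media "Edge" API for the given user. The method puts the limit into the params hashtable along with the Instagram access token (both are defined in the preferences of my app) and the fields string.

Note that if nextPageUrl IS passed in, then I bypass creating this url entirely and simply use nextPageUrl.

Here is the callWebsiteFunction code from InTouchUtils, which uses Volley in synchronous mode to make the website API call (this entire code sample already runs on a separate thread, and I have granted the INTERNET permission in my app):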

public static String callWebsiteFunction(String url, HashMap params) throws Exception {
    return callWebsiteFunction(url, params, VOLLEY_REQUEST_DEFAULT_TIMEOUT);
}

public static String callWebsiteFunction(String url, HashMap params, int timeoutInSeconds) throws Exception {
    RequestFuture<String> future = RequestFuture.newFuture();
    String newUrl = null;
    if ( params != null ) {
        newUrl = InTouchUtils.createGetRequestUrl(url, params);
    } else {
        newUrl = url;
    }
    String result = null;
    StringRequest request =
            new StringRequest(Request.Method.GET,
                    newUrl,
                    future,
                    new Response.ErrorListener() {
                        @Override
                        public void onErrorResponse(VolleyError error) {
                            Timber.e("Got VolleyError: %s", error.getMessage());
                        }
                    }) {

            };

    InTouchUtils.addToRequestQueue(request);
    try {
        // Using a blocking volley request
        // See SO: 
        try {
            result = future.get(timeoutInSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Timber.e("Got Interrupted Exception attempting Volley request: %s", e.getMessage());
        } catch (ExecutionException e) {
            Timber.e("Got Execution Exception attempting Volley request: %s", e.getMessage());
        } catch (TimeoutException e) {
            Timber.e("Got Timeout Exception attempting Volley request: %s", e.getMessage());
        }
    } catch (Exception e) {
        Timber.e("Got General Exception");
        throw e;
    }
    return result;
}

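Now that I have a result, I can process it. The first thing to do is convert the string to a JSONObject so I can start parsing it. Then I check whether I got back a JSONArray of media items by parsing the "data" key (the constant MEDIA_DATA_ARRAY in my code).

For my purposes, I want to normalize the returned data into a complete list of images and/or videos, so I have to check whether what was returned is of type CAROUSEL_ALBUM, and if so I make another call to get the media children of that CAROUSEL.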
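
The normalization step can be sketched in Python roughly as follows. This is not the Java code above, just a compact illustration of the same idea; `flatten_media` and `fake_children` are hypothetical names, and `fake_children` returns canned data in place of a real request for an album's children.

```python
def flatten_media(items, fetch_children):
    """Expand CAROUSEL_ALBUM entries into their child media items."""
    flat = []
    for item in items:
        if item.get("media_type") == "CAROUSEL_ALBUM":
            try:
                # One extra call per album to get its children
                flat.extend(fetch_children(item["id"]))
            except Exception as exc:
                # Like the Java code: log and skip an album whose children cannot be fetched
                print(f"Skipping carousel {item.get('id')}: {exc}")
        else:
            flat.append(item)
    return flat

# Stand-in for the children request (canned data, no network access)
def fake_children(media_id):
    return [
        {"id": f"{media_id}-1", "media_type": "IMAGE"},
        {"id": f"{media_id}-2", "media_type": "VIDEO"},
    ]

page = [
    {"id": "1", "media_type": "IMAGE"},
    {"id": "2", "media_type": "CAROUSEL_ALBUM"},
]
flat = flatten_media(page, fake_children)
print([m["id"] for m in flat])
```

Passing the fetch function in as an argument keeps the flattening logic independent of the HTTP library, which also makes it easy to try out without credentials.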

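Ultimately, I re-package all the media entries, plus any pagination that was returned from Instagram, and return that to the caller.

Now, back in the caller, I can inspect what I got and see whether there is any pagination going on, in particular a "next" url.

If I don't have one, then I reset the SAF "loading" flag (this is an SAF thing that causes an indeterminate progress bar to show, or not show, in the file chooser while your provider is working on fetching more entries) and I'm done. Note that the definition of "I don't have one" is that EITHER the "paging" element OR the "next" element is not present. This is because you might not get a paging element at all, or you might get a paging element that has no "next" element in it.

If I do have one, I indicate to the SAF that I am "loading", and then I start a thread ("BatchFetcher") that essentially loops making the same call to query Instagram, passing in the "next" url for as long as it finds one: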

            if (instagramQueryResult == null || instagramQueryResult.length() == 0) {
                // Nothing in instagram for this user
                Timber.d( "addRowstoQueryChildDocumentsCursor: I called queryInstagramAccount() but nothing was there!");
                return;
            }
            JSONArray data = null;
            try {
                data = instagramQueryResult.getJSONArray(MEDIA_DATA_ARRAY);
                if ( data.length() == 0) {
                    return;
                }
            } catch (JSONException e) {
                // No data, nothing to do
                Timber.d("Lifecycle: Found no media data for user, exception was: %s", e.getMessage());
                return;
            }
            JSONObject paging = null;
            String nextUrl = null;
            try {
                paging = instagramQueryResult.getJSONObject(MEDIA_PAGING_KEY);
                // If we get here, test to see if we have a "next" node. If so, that's what
                // we need to query, otherwise we are done.
                nextUrl = paging.getString(MEDIA_NEXT_KEY);
            } catch (JSONException e) {
                // No paging
                paging = null;
                nextUrl = null;
            }

            Timber.d( "addRowstoQueryChildDocumentsCursor: New query fetch got %d entries.", data.length());
            if ( paging == null || nextUrl == null) {
                // We are done - add these to cache and cursor and clear loading flag
                populateResultsToCacheAndCursor(data, cursor);
                clearCursorLoadingNotification(cursor);
                Timber.d( "addRowstoQueryChildDocumentsCursor: Directory retrieval is complete for parentDocumentId: " +
                        parentDocumentId +
                        " took " +
                        (System.currentTimeMillis()- startTimeForDirectoryQuery)+"ms.");

            } else {
                // Store our results to both the cache and cursor - cursor for the initial return,
                // cache for when we come back after the Thread finishes
                populateResultsToCacheAndCursor(data, cursor);
                // Set the getExtras()
                setCursorForLoadingNotification(cursor);
                // Register this cursor with the Resolver to get notified by Thread so Cursor will then notify loader to re-load
                Timber.d( "addRowstoQueryChildDocumentsCursor: registering cursor for notificationUri on: %s and starting BatchFetcher.", getChildDocumentsUri(parentDocumentId).toString());
                cursor.setNotificationUri(getContext().getContentResolver(),getChildDocumentsUri(parentDocumentId));
                // Start new thread
                batchFetcher = new BatchFetcher(parentDocumentId, nextUrl);
                batchFetcher.start();
            }

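The "batchFetcher" thread handles checking the return value for media items and keeps looping until no more entries are found, no more "next" url is returned from Instagram, or it is interrupted. It populates an internal cache, which is read on subsequent requests from the SAF to my provider, until there is nothing more to fetch. At that point the "loading" aspect of the cursor is reset and the SAF stops requesting data from my provider.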
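
The BatchFetcher class itself is not shown here, so the following is only a guess at its general shape, written in Python rather than Java. It assumes a shared list as the cache, a `fetch_page` callable that maps a url to a parsed response, and an `on_done` callback standing in for resetting the "loading" flag; all of these names are hypothetical.

```python
import threading

class BatchFetcher(threading.Thread):
    """Follow "next" urls in the background, appending each page to a shared cache."""

    def __init__(self, next_url, fetch_page, cache, on_done):
        super().__init__(daemon=True)
        self._next_url = next_url
        self._fetch_page = fetch_page   # callable: url -> parsed response dict
        self._cache = cache             # list shared with the caller
        self._on_done = on_done         # called once when fetching stops
        self._stop_requested = threading.Event()

    def stop(self):
        """Ask the loop to finish after the current page."""
        self._stop_requested.set()

    def run(self):
        url = self._next_url
        while url and not self._stop_requested.is_set():
            page = self._fetch_page(url)
            data = page.get("data", []) if page else []
            if not data:
                break                   # no more entries
            self._cache.extend(data)
            url = page.get("paging", {}).get("next")  # None ends the loop
        self._on_done()

# Canned pages keyed by url, in place of real Instagram responses
pages = {
    "page2": {"data": [{"id": "16"}], "paging": {"next": "page3"}},
    "page3": {"data": [{"id": "17"}]},
}
cache = []
done = threading.Event()
fetcher = BatchFetcher("page2", pages.get, cache, done.set)
fetcher.start()
done.wait(timeout=5)
print([m["id"] for m in cache])
```

A real provider would probably guard the cache with a lock and notify the content resolver after each page rather than only at the end.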

Here is a simple Python function I created based on @CDS's answer.

import requests

def get_user_data2(user_id, access_token, max_limit=100):
    # Comma-separated field list, without spaces, so the url stays clean
    fields = 'caption,id,username,media_type'
    all_posts = []
    paging_url = f'https://graph.instagram.com/{user_id}/media?fields={fields}&access_token={access_token}&limit={max_limit}'

    # Keep following the "next" url until Instagram stops returning one
    while paging_url is not None:
        response = requests.get(paging_url).json()
        all_posts.extend(response['data'])
        # "paging" or "next" may be absent on the last page
        paging_url = response.get('paging', {}).get('next')

    return all_posts