How to use Chrome DevTools protocol in Selenium (using Python) for capturing HTTP requests and responses?

Question

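I know that the Fetch domain is used for this purpose, but I don't know exactly how to implement it. In Selenium Python, I use the following code to enable issuing of requestPaused events.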

    from selenium import webdriver
    driver = webdriver.Chrome()
    driver.execute_cdp_cmd("Fetch.enable", {})
    driver.get('https://www.example.com')

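But I don't know how to handle the requestPaused event (I need to call one of fulfillRequest or continueRequest/continueWithAuth). As a result, my program stops working. I would really appreciate it if someone could give me an example to help me understand how it works.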

Answer 1

Yes, you saw it right.

As per the release notes of Selenium v4.0.0-alpha-3:

* Expose devtools APIs from chromium derived drivers.
* Expose presence of devtools support on a role-based interface

As per the release notes of Selenium v4.0.0-alpha-1:

* Basic support for CDP landed via the "DevTools" interface.

So the chrome-devtools-protocol is all set to be available with Selenium 4, which will allow tools to instrument, inspect, debug and profile Chromium, Chrome and other Blink-based browsers. In the discussion "Controlling Chrome Devtools with Selenium Webdriver", @AdiOhana shows an example usage of a few commands from the Profiler domain as follows:

    driver.getDevTools().createSession();
    driver.getDevTools().send(new Command("Profiler.enable", ImmutableMap.of()));
    driver.getDevTools().send(new Command("Profiler.start", ImmutableMap.of()));
    //register to profiler events
    driver.getDevTools().addListener(new Event("Profiler.consoleProfileStarted", ConsoleProfileStarted.class), new Consumer<Object>() {
        @Override
        public void accept(Object o) {
            //do something
        }
    });

Note: Until the Profiler domain is added to the Selenium Java client, you will have to supply your own Mapper.
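Since the question is about the Python bindings, here is a rough Python counterpart of the two Profiler commands above. It is only a sketch: it assumes `driver` is a Chromium-based WebDriver (for example `webdriver.Chrome()`), which exposes `execute_cdp_cmd`, and the helper names `start_profiler` and `stop_profiler` are made up for illustration. Note that `execute_cdp_cmd` only sends a command and returns its result; it does not deliver events such as Profiler.consoleProfileStarted.

```python
def start_profiler(driver):
    """Send Profiler.enable and Profiler.start through Selenium's CDP bridge."""
    # Assumption: `driver` offers execute_cdp_cmd(cmd, cmd_args), as the
    # Chromium-based Selenium drivers do. Neither command needs parameters.
    driver.execute_cdp_cmd("Profiler.enable", {})
    driver.execute_cdp_cmd("Profiler.start", {})


def stop_profiler(driver):
    """Send Profiler.stop and return the recorded profile, if any."""
    result = driver.execute_cdp_cmd("Profiler.stop", {})
    # Profiler.stop answers with an object that carries a "profile" field.
    return result.get("profile")
```

With a real browser this would be used as `start_profiler(driver)`, then `driver.get(...)`, then `profile = stop_profiler(driver)`.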

Fetch Domain

The Fetch domain lets clients substitute the browser's network layer with client code.

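The Fetch domain methods are as follows:
- Fetch.disable: Disables the Fetch domain.
- Fetch.enable: Enables issuing of requestPaused events. A request will be paused until the client calls one of failRequest, fulfillRequest or continueRequest/continueWithAuth.
- Fetch.failRequest: Causes the request to fail with the specified reason.
- Fetch.fulfillRequest: Provides a response to the request.
- Fetch.continueRequest: Continues the request, optionally modifying some of its parameters.
- Fetch.continueWithAuth: Continues a request, supplying an authChallengeResponse, following an authRequired event.
- Fetch.getResponseBody: Causes the body of the response to be received from the server and returned as a single string. It may only be issued for a request that is paused in the Response stage and is mutually exclusive with takeResponseBodyForInterceptionAsStream. Calling other methods that affect the request, or disabling the Fetch domain, before the body is received results in undefined behavior.
- Fetch.takeResponseBodyAsStream: Returns a handle to the stream representing the response body. The request must be paused in the HeadersReceived stage. Note that after this command the request can't be continued as is; the client either needs to cancel it or to provide the response body. The stream only supports sequential reads, so IO.read will fail if a position is specified. This method is mutually exclusive with getResponseBody. Calling other methods that affect the request, or disabling the Fetch domain, before the body is received results in undefined behavior.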
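
To make the methods above more concrete, the following sketch builds the parameter dictionaries that would be passed as the second argument of `driver.execute_cdp_cmd`. The helper names (`enable_params`, `continue_params`, `fail_params`, `fulfill_params`) are hypothetical, and the field names follow my reading of the Chrome DevTools Protocol Fetch domain reference, so check them against the protocol version of your browser.

```python
import base64


def _header_entries(headers):
    """Convert a plain dict into CDP's list of {"name", "value"} entries."""
    return [{"name": k, "value": v} for k, v in (headers or {}).items()]


def enable_params(url_pattern="*", stage="Request", handle_auth=False):
    """Parameters for Fetch.enable: pause matching requests at `stage`.

    `stage` is assumed to be "Request" or "Response".
    """
    return {
        "patterns": [{"urlPattern": url_pattern, "requestStage": stage}],
        "handleAuthRequests": handle_auth,
    }


def continue_params(request_id, url=None, method=None, headers=None):
    """Parameters for Fetch.continueRequest; overrides are optional."""
    params = {"requestId": request_id}
    if url is not None:
        params["url"] = url
    if method is not None:
        params["method"] = method
    if headers is not None:
        params["headers"] = _header_entries(headers)
    return params


def fail_params(request_id, reason="Failed"):
    """Parameters for Fetch.failRequest; `reason` is a network error reason."""
    return {"requestId": request_id, "errorReason": reason}


def fulfill_params(request_id, body, status=200, headers=None):
    """Parameters for Fetch.fulfillRequest with a text body."""
    return {
        "requestId": request_id,
        "responseCode": status,
        "responseHeaders": _header_entries(headers),
        # The body travels over JSON, so it is sent base64-encoded.
        "body": base64.b64encode(body.encode("utf-8")).decode("ascii"),
    }
```

For example, `driver.execute_cdp_cmd("Fetch.enable", enable_params("*example.com*"))` would pause only requests whose URL matches that pattern, instead of every request as `Fetch.enable` with `{}` does.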
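
The Fetch domain events are as follows:
- Fetch.requestPaused: Issued when the domain is enabled and the request URL matches the specified filter. The request is paused until the client responds with one of continueRequest, failRequest or fulfillRequest. The stage of the request can be determined by the presence of responseErrorReason and responseStatusCode: the request is at the response stage if either of these fields is present, and at the request stage otherwise.
- Fetch.authRequired: Issued when the domain is enabled with handleAuthRequests set to true. The request is paused until the client responds with continueWithAuth.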
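
The two events can be answered along the following lines. This is a sketch under stated assumptions, not a complete solution: `paused_stage` and `handle_event` are hypothetical names, and `send(method, params)` stands for whatever issues CDP commands in your setup (for instance `driver.execute_cdp_cmd`). How the events reach your Python code is left out here, because `execute_cdp_cmd` on its own only sends commands and does not deliver events. That is also why the program in the question hangs after `Fetch.enable`: every request is paused and nothing ever resumes it.

```python
def paused_stage(params):
    """Return "Response" or "Request" for a Fetch.requestPaused payload."""
    # The stage is inferred from the presence of either response field.
    if "responseErrorReason" in params or "responseStatusCode" in params:
        return "Response"
    return "Request"


def handle_event(method, params, send, credentials=None):
    """Answer one Fetch event so that the paused request can proceed.

    `send` is an assumed callable taking (cdp_method, cdp_params).
    `credentials` is an optional (username, password) tuple.
    Returns "Request", "Response" or "Auth" to say what was handled.
    """
    request_id = params["requestId"]

    if method == "Fetch.authRequired":
        if credentials is None:
            challenge = {"response": "CancelAuth"}
        else:
            username, password = credentials
            challenge = {
                "response": "ProvideCredentials",
                "username": username,
                "password": password,
            }
        send(
            "Fetch.continueWithAuth",
            {"requestId": request_id, "authChallengeResponse": challenge},
        )
        return "Auth"

    if method == "Fetch.requestPaused":
        stage = paused_stage(params)
        # Let the request through unchanged; failRequest or fulfillRequest
        # could be sent here instead to block or mock it.
        send("Fetch.continueRequest", {"requestId": request_id})
        return stage

    raise ValueError("unexpected event: " + method)
```

Passing `send` in as a parameter keeps the decision logic separate from the transport, so it can be tried out with a plain list-appending function before it is wired to a browser.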
