Test using StepVerifier blocks when using Spring WebClient with retry

Question

EDIT: Here https://github.com/wujek-srujek/reactor-retry-test is a repository with all the code.

I have the following Spring WebClient code that POSTs to a remote server (Kotlin code, imports omitted for brevity):

private val logger = KotlinLogging.logger {}

@Component
class Client(private val webClient: WebClient) {

    companion object {
        const val maxRetries = 2L
        val firstBackOff = Duration.ofSeconds(5L)
        val maxBackOff = Duration.ofSeconds(20L)
    }

    fun send(uri: URI, data: Data): Mono<Void> {
        return webClient
            .post()
            .uri(uri)
            .contentType(MediaType.APPLICATION_JSON)
            .bodyValue(data)
            .retrieve()
            .toBodilessEntity()
            .doOnSubscribe {
                logger.info { "Calling backend, uri: $uri" }
            }
            .retryExponentialBackoff(maxRetries, firstBackOff, maxBackOff, jitter = false) {
                logger.debug { "Call to $uri failed, will retry (#${it.iteration()} of max $maxRetries)" }
            }
            .doOnError {
                logger.error { "Call to $uri with $maxRetries retries failed with $it" }
            }
            .doOnSuccess {
                logger.info { "Call to $uri succeeded" }
            }
            .then()
    }
}

(It returns an empty Mono because we don't expect a response body and don't care about one.)

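I would like to test 2 cases, and one of them is giving me headaches: the one where all retries are fired and the call eventually fails. We are using MockWebServer (https://github.com/square/okhttp/tree/master/mockwebserver) and StepVerifier from reactor-test. (The test for the success case is easy, doesn't need any virtual time scheduler magic, and works just fine.) Here is the code for the failing one: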

@JsonTest
@ContextConfiguration(classes = [Client::class, ClientConfiguration::class])
class ClientITest @Autowired constructor(
    private val client: Client
) {
    lateinit var server: MockWebServer

    @BeforeEach
    fun `init mock server`() {
        server = MockWebServer()
        server.start()
    }

    @AfterEach
    fun `shutdown server`() {
        server.shutdown()
    }

    @Test
    fun `server call is retried and eventually fails`() {
        val data = Data()
        val uri = server.url("/server").uri()
        val responseStatus = HttpStatus.INTERNAL_SERVER_ERROR

        repeat((0..Client.maxRetries).count()) {
            server.enqueue(MockResponse().setResponseCode(responseStatus.value()))
        }

        StepVerifier.withVirtualTime { client.send(uri, data) }
            .expectSubscription()
            .thenAwait(Duration.ofSeconds(10)) // wait for the first retry
            .expectNextCount(0)
            .thenAwait(Duration.ofSeconds(20)) // wait for the second retry
            .expectNextCount(0)
            .expectErrorMatches {
                val cause = it.cause
                it is RetryExhaustedException &&
                        cause is WebClientResponseException &&
                        cause.statusCode == responseStatus
            }
            .verify()

        // assertions
    }
}

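I am using withVirtualTime because I don't want the test to spend many seconds waiting in real time. The problem is that the test blocks indefinitely. Here is the (simplified) log output: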

okhttp3.mockwebserver.MockWebServer      : MockWebServer[51058] starting to accept connections
Calling backend, uri: http://localhost:51058/server
MockWebServer[51058] received request: POST /server HTTP/1.1 and responded: HTTP/1.1 500 Server Error
Call to http://localhost:51058/server failed, will retry (#1 of max 2)
Calling backend, uri: http://localhost:51058/server
MockWebServer[51058] received request: POST /server HTTP/1.1 and responded: HTTP/1.1 500 Server Error
Call to http://localhost:51058/server failed, will retry (#2 of max 2)

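As you can see, the first retry works, but the second one blocks. I don't know how to write the test so that this doesn't happen. To make matters worse, the client will actually use jitter, which will make the timing hard to predict.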
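For reference, the waits that the test has to cover can be worked out from the constants in Client. The helper below is a rough sketch, not part of Reactor: the name backoffDelays is made up, and it assumes that, without jitter, each delay doubles the previous one and is capped at maxBackOff.

```kotlin
import java.time.Duration

// Hypothetical helper, not part of Reactor: lists the delay before each retry,
// assuming every delay doubles the previous one and is capped at maxBackOff.
fun backoffDelays(maxRetries: Long, firstBackOff: Duration, maxBackOff: Duration): List<Duration> =
    generateSequence(firstBackOff) { previous -> minOf(previous.multipliedBy(2), maxBackOff) }
        .take(maxRetries.toInt())
        .toList()

fun main() {
    // With the constants from Client: 5 s before the first retry, 10 s before the second.
    println(backoffDelays(2L, Duration.ofSeconds(5L), Duration.ofSeconds(20L)))
}
```

Under that assumption the two thenAwait calls (10 s, then 20 s) cover more virtual time than both retries need, so the amount of time awaited is not what makes the test hang.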

The following test, which uses StepVerifier but no WebClient, works fine, even with more retries:

@Test
fun test() {
    StepVerifier.withVirtualTime {
        Mono
            .error<RuntimeException>(RuntimeException())
            .retryExponentialBackoff(5,
                                     Duration.ofSeconds(5),
                                     Duration.ofMinutes(2),
                                     jitter = true) {
                println("Retrying")
            }
            .then()
    }
        .expectSubscription()
        .thenAwait(Duration.ofDays(1)) // doesn't matter
        .expectNextCount(0)
        .expectError()
        .verify()
}

Can anybody help me fix this, and ideally explain what is going wrong?

Answer 1

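This is a limitation of virtual time and of the way the clock is manipulated in StepVerifier. The thenAwait methods are not synchronized with the underlying scheduling (which happens, for example, as part of the retryBackoff operation). This means that the operator can submit retry tasks at a point where the clock has already been advanced by one day. The second retry is then scheduled for + 1 day and 10 seconds, because the clock is already at +1 day. After that, the clock is never advanced again, so the additional request is never made to MockWebServer.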
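The scheduling behaviour described above can be observed with a bare VirtualTimeScheduler, without WebClient or StepVerifier. This is a minimal sketch, assuming reactor-test is on the classpath; the function name firedBeforeAndAfterSecondAdvance is made up for illustration.

```kotlin
import reactor.test.scheduler.VirtualTimeScheduler
import java.time.Duration
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical demo function: reports whether a delayed task had run
// (before, after) a second advance of the virtual clock.
fun firedBeforeAndAfterSecondAdvance(): Pair<Boolean, Boolean> {
    val scheduler = VirtualTimeScheduler.create()
    val fired = AtomicBoolean(false)

    // Advance the clock first, as an early thenAwait(Duration.ofDays(1)) would.
    scheduler.advanceTimeBy(Duration.ofDays(1))

    // The delay is relative to the current virtual time, so this task is due
    // at +1 day and 10 seconds, not at +10 seconds.
    scheduler.schedule({ fired.set(true) }, 10L, TimeUnit.SECONDS)
    val before = fired.get()

    // Only a further advance of the clock makes the task due.
    scheduler.advanceTimeBy(Duration.ofSeconds(10))
    val after = fired.get()

    scheduler.dispose()
    return before to after
}

fun main() {
    println(firedBeforeAndAfterSecondAdvance())
}
```

The task stays pending after it is scheduled and runs only once the clock is advanced again. In the blocking test, nothing advances the clock after the last retry has been scheduled.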

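Your case is made even more complicated by the fact that an additional component is involved, the MockWebServer, which still works "in real time". Advancing the virtual clock is a very quick operation, but the response from MockWebServer still goes through a socket, so it reaches the retry scheduling with some latency. That makes things more complicated from the test-writing perspective.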

One possible solution to explore would be to externalize the creation of the VirtualTimeScheduler and tie advanceTimeBy calls to mockServer.takeRequest(), in a parallel thread.
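A rough sketch of that idea follows. It is an exploration aid, not a verified fix. The helper name advanceClockPerRequest, its parameters, and the 5-second real-time timeout are all assumptions made for illustration.

```kotlin
import okhttp3.mockwebserver.MockWebServer
import reactor.test.scheduler.VirtualTimeScheduler
import java.time.Duration
import java.util.concurrent.TimeUnit
import kotlin.concurrent.thread

// Hypothetical helper: starts a daemon thread that advances the virtual clock
// by `step` each time the mock server records a request.
fun advanceClockPerRequest(
    server: MockWebServer,
    scheduler: VirtualTimeScheduler,
    expectedRequests: Int,
    step: Duration
): Thread = thread(isDaemon = true) {
    repeat(expectedRequests) {
        // Block in real time until the next request arrives; stop if none
        // shows up within the (assumed) 5-second timeout.
        server.takeRequest(5L, TimeUnit.SECONDS) ?: return@thread
        // Caveat: the client may not have scheduled its retry yet when this
        // runs, because the response still has to travel back over the socket.
        scheduler.advanceTimeBy(step)
    }
}
```

In the test, the same scheduler instance would have to be the one StepVerifier uses, for example through the withVirtualTime overload that accepts a VirtualTimeScheduler supplier. Because of the race noted in the comment, calling verify with a timeout Duration instead of the plain verify() lets a lost race fail the test rather than hang it.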

project-reactor

spring-webclient