Spring boot on Kubernetes does not get restarted on java.lang.OutOfMemoryError: Java heap space

Spring boot on Kubernetes does not get restarted on java.lang.OutOfMemoryError: Java heap space

我已经 运行 使用 JDK 11 图像在 Kubernetes 上 spring 引导应用程序。我的期望是,当 JVM 遇到内存不足异常时,应该终止 pod,以便 Kubernetes 可以启动更大的 pod。我可以确认这不是正在发生的事情。我不确定是否有一些我必须设置的 JVM 参数我丢失了或者一些 Kubernetes 配置来了解这种情况。

我正在使用以下 JVM 参数:

-XX:InitialRAMPercentage=20.0 -XX:MinRAMPercentage=50.0 -XX:MaxRAMPercentage=80.0 -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutofMemoryError

抛出的异常:

{
  "message": "Stopping container due to an Error",
  "logger_name": "org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer",
  "thread_name": "KafkaConsumerDestination{consumerDestinationName='message-submitted', partitions=21, dlqName='dlq'}.container-0-C-1",
  "level": "ERROR",
  "stack_trace": "java.lang.OutOfMemoryError: Java heap space\n\tat java.base/java.util.Arrays.copyOf(Arrays.java:3745)\n\tat java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)\n\tat java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)\n\tat java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)\n\tat software.amazon.awssdk.utils.IoUtils.toByteArray(IoUtils.java:49)\n\tat software.amazon.awssdk.core.sync.ResponseTransformer.lambda$toBytes(ResponseTransformer.java:175)\n\tat software.amazon.awssdk.core.sync.ResponseTransformer$$Lambda17/0x0000000101087040.transform(Unknown Source)\n\tat software.amazon.awssdk.core.client.handler.BaseSyncClientHandler$HttpResponseHandlerAdapter.transformResponse(BaseSyncClientHandler.java:154)\n\tat software.amazon.awssdk.core.client.handler.BaseSyncClientHandler$HttpResponseHandlerAdapter.handle(BaseSyncClientHandler.java:142)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.handleSuccessResponse(HandleResponseStage.java:89)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.handleResponse(HandleResponseStage.java:70)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:58)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:41)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:64)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:36)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:77)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:39)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage$RetryExecutor.doExecute(RetryableStage.java:113)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage$RetryExecutor.execute(RetryableStage.java:86)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:62)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:42)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:57)\n\tat software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:37)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)\n"
}


{
  "message": "Error while stopping the container: ",
  "logger_name": "org.springframework.kafka.listener.KafkaMessageListenerContainer",
  "thread_name": "KafkaConsumerDestination{consumerDestinationName='message-submitted', partitions=21, dlqName='dlq'}.container-0-C-1",
  "level": "ERROR",
  "stack_trace": "java.lang.OutOfMemoryError: Java heap space\n\tat java.base/java.util.Arrays.copyOf(Arrays.java:3745)\n\tat java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)\n\tat java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)\n\tat java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)\n\tat software.amazon.awssdk.utils.IoUtils.toByteArray(IoUtils.java:49)\n\tat software.amazon.awssdk.core.sync.ResponseTransformer.lambda$toBytes(ResponseTransformer.java:175)\n\tat software.amazon.awssdk.core.sync.ResponseTransformer$$Lambda17/0x0000000101087040.transform(Unknown Source)\n\tat software.amazon.awssdk.core.client.handler.BaseSyncClientHandler$HttpResponseHandlerAdapter.transformResponse(BaseSyncClientHandler.java:154)\n\tat software.amazon.awssdk.core.client.handler.BaseSyncClientHandler$HttpResponseHandlerAdapter.handle(BaseSyncClientHandler.java:142)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.handleSuccessResponse(HandleResponseStage.java:89)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.handleResponse(HandleResponseStage.java:70)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:58)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:41)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:64)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:36)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:77)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:39)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage$RetryExecutor.doExecute(RetryableStage.java:113)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage$RetryExecutor.execute(RetryableStage.java:86)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:62)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:42)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:57)\n\tat software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:37)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)\n\tat software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)\n"
}

我认为发生的事情是 OOM 异常导致 pod 关闭,然后在尝试关闭 pod 时抛出相同的异常。所以我试图通过添加 -XX:OnOutOfMemoryError="kill -9 %p 来杀死 pod,但它没有帮助。

稍微不同的是,pod 内存限制为 2Gi。但是,pod 在大约 700Mi 时出现 OOM 异常,所以我认为不是内存不足,只是 pod 在尝试扩展内存之前抛出异常:

    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: 10m
        memory: 128Mi

我也测试了 -XX:+CrashOnOutOfMemoryError 但它无助于解决我的情况,并且 pod 在尝试关闭容器时不断抛出 OOM。

Minikube 可能没有足够的内存。

您可以通过以下方式检查内存设置 minikube 启动-h

然后 minikube 停止 && minikube 启动 --cpus 4 --memory 2048

更新: 尝试在启动 java application

时设置堆大小 -Xms1g -Xmx2g

我意识到我的配置中有错别字。它必须是 ExitOnOutOfMemoryError 而不是 ExitOnOutofMemoryError。这是我正在使用的参数,它们运行顺利:

-XX:InitialRAMPercentage=20.0 -XX:MinRAMPercentage=50.0 -XX:MaxRAMPercentage=80.0 -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutOfMemoryError