LoadBalancerFeignClient 中的 NullPointerException (spring-cloud-netflix)

NullPointerException in LoadBalancerFeignClient (spring-cloud-netflix)

我们在服务中为客户使用 Feign。最近其中一项服务开始随机抛出一些异常,这是由以下原因引起的:

Caused by: java.lang.NullPointerException: null
    at org.springframework.cloud.netflix.feign.ribbon.LoadBalancerFeignClient.execute(LoadBalancerFeignClient.java:63)
    at org.springframework.cloud.sleuth.instrument.web.client.feign.TraceLoadBalancerFeignClient.execute(TraceLoadBalancerFeignClient.java:41)
    at feign.SynchronousMethodHandler.executeAndDecode(SynchronousMethodHandler.java:97)
    at feign.SynchronousMethodHandler.invoke(SynchronousMethodHandler.java:76)
    at feign.hystrix.HystrixInvocationHandler.run(HystrixInvocationHandler.java:108)
    at com.netflix.hystrix.HystrixCommand.call(HystrixCommand.java:301)
    at com.netflix.hystrix.HystrixCommand.call(HystrixCommand.java:297)
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:46)
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
    at rx.Observable.unsafeSubscribe(Observable.java:10211)
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:51)
    at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35)
    at rx.Observable.unsafeSubscribe(Observable.java:10211)
    at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41)
    at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48)
    at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30)
    at rx.Observable.unsafeSubscribe(Observable.java:10211)
    at rx.internal.operators.OperatorSubscribeOn.call(OperatorSubscribeOn.java:94)
    at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:56)
    at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:47)
    at our.code.hystrix.AuthContextAwareHystrixConcurrencyStrategy$AuthorizationContextAwareCallable.call(AuthContextAwareHystrixConcurrencyStrategy.java:57)
    at org.springframework.cloud.sleuth.instrument.hystrix.SleuthHystrixConcurrencyStrategy$HystrixTraceCallable.call(SleuthHystrixConcurrencyStrategy.java:154)
    at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:69)
    at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)

我查看了 spring-cloud-netflix-core (v1.2.2.RELEASE) 代码及其依赖项,但无法弄清楚 NPE 发生的原因。 在堆栈跟踪中,它指向 LoadBalancerFeignClient 中的第 63 行,即:

private CachingSpringLoadBalancerFactory lbClientFactory;

@Override
public Response execute(Request request, Request.Options options) throws IOException {
  try {
    URI asUri = URI.create(request.url());
    String clientName = asUri.getHost();
    URI uriWithoutHost = cleanUrl(request.url(), clientName);
    FeignLoadBalancer.RibbonRequest ribbonRequest = new FeignLoadBalancer.RibbonRequest(
        this.delegate, request, uriWithoutHost);

    IClientConfig requestConfig = getClientConfig(options, clientName);
    return lbClient(clientName).executeWithLoadBalancer(ribbonRequest, // Line 63
        requestConfig).toResponse();
  }
  catch (ClientException e) {
    IOException io = findIOException(e);
    if (io != null) {
      throw io;
    }
    throw new RuntimeException(e);
  }
}

private FeignLoadBalancer lbClient(String clientName) {
  return this.lbClientFactory.create(clientName);
}

这意味着只有 lbClient(clientName) 是返回 null 的一个可能位置。查看 CachingSpringLoadBalancerFactory class 及其实现,我在 ConcurrentReferenceHashMap:

的文档中找到了这个

NOTE: The use of references means that there is no guarantee that items placed into the map will be subsequently available. The garbage collector may discard references at any time, so it may appear that an unknown thread is silently removing entries.

现在我的问题是为什么会这样以及如何解决。谢谢。

郑重声明,这是 GitHub 上的问题: https://github.com/spring-cloud/spring-cloud-netflix/issues/2443 虽然在新版本中修复了。