@LocatorApplication starts and then immediately stops

Everything appears to be created just fine, but as soon as it finishes initializing, it stops.

@SpringBootApplication
@LocatorApplication
public class ServerApplication {

  public static void main(String[] args) {
    SpringApplication.run(ServerApplication.class, args);
  }
}

Logs:

2020-08-03 10:59:18.250  INFO 7712 --- [           main] o.a.g.d.i.InternalLocator                : Locator started on 10.25.209.139[8081]
2020-08-03 10:59:18.250  INFO 7712 --- [           main] o.a.g.d.i.InternalLocator                : Starting server location for Distribution Locator on LB183054.dmn1.fmr.com[8081]
2020-08-03 10:59:18.383  INFO 7712 --- [           main] c.f.g.l.LocatorSpringApplication         : Started LocatorSpringApplication in 8.496 seconds (JVM running for 9.318)
2020-08-03 10:59:18.385  INFO 7712 --- [m shutdown hook] o.a.g.d.i.InternalDistributedSystem      : VM is exiting - shutting down distributed system
2020-08-03 10:59:18.395  INFO 7712 --- [m shutdown hook] o.a.g.i.c.GemFireCacheImpl               : GemFireCache[id = 1329087972; isClosing = true; isShutDownAll = false; created = Mon Aug 03 10:59:15 EDT 2020; server = false; copyOnRead = false; lockLease = 120; lockTimeout = 60]: Now closing.
2020-08-03 10:59:18.416  INFO 7712 --- [m shutdown hook] o.a.g.d.i.ClusterDistributionManager     : Shutting down DistributionManager 10.25.209.139(locator1:7712:locator)<ec><v0>:41000. 
2020-08-03 10:59:18.517  INFO 7712 --- [m shutdown hook] o.a.g.d.i.ClusterDistributionManager     : Now closing distribution for 10.25.209.139(locator1:7712:locator)<ec><v0>:41000
2020-08-03 10:59:18.518  INFO 7712 --- [m shutdown hook] o.a.g.d.i.m.g.Services                   : Stopping membership services
2020-08-03 10:59:18.518  INFO 7712 --- [ip View Creator] o.a.g.d.i.m.g.Services                   : View Creator thread is exiting
2020-08-03 10:59:18.520  INFO 7712 --- [Server thread 1] o.a.g.d.i.m.g.Services                   : GMSHealthMonitor server thread exiting
2020-08-03 10:59:18.536  INFO 7712 --- [m shutdown hook] o.a.g.d.i.ClusterDistributionManager     : DistributionManager stopped in 120ms.
2020-08-03 10:59:18.537  INFO 7712 --- [m shutdown hook] o.a.g.d.i.ClusterDistributionManager     : Marking DistributionManager 10.25.209.139(locator1:7712:locator)<ec><v0>:41000 as closed.

Yes, this is the expected behavior out of the box (OOTB).

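Most Apache Geode processes (clients (i.e. ClientCache), Locators, Managers and "peer" Cache nodes/members of a cluster/distributed system) only create daemon Threads (i.e. non-blocking Threads). Once the main Thread returns, nothing is left to keep the JVM alive, so the Apache Geode JVM process starts up, initializes itself and then shuts down immediately.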
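The daemon-thread behavior described above is plain JVM behavior and can be seen without Geode at all. The following is a minimal sketch using only the JDK; the class name DaemonThreadDemo and the method startWorker are made up for illustration and are not part of Apache Geode.

```java
import java.util.concurrent.TimeUnit;

// Illustrative only: a daemon thread does not keep the JVM alive.
final class DaemonThreadDemo {

    // Starts a background worker that loops until it is interrupted.
    static Thread startWorker(boolean daemon) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    TimeUnit.SECONDS.sleep(1);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit quietly when interrupted
            }
        }, "demo-worker");
        worker.setDaemon(daemon); // must be set before start()
        worker.start();
        return worker;
    }

    public static void main(String[] args) {
        // With daemon = true, main() returns and the JVM exits right away.
        // With daemon = false, the JVM keeps running until the worker ends.
        startWorker(true);
        System.out.println("main() is returning");
    }
}
```

Changing the argument to false makes the same program hang around after main() returns, which is the difference between a Locator and a CacheServer in this context.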

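Only an Apache Geode CacheServer process (a "peer" Cache that has a CacheServer component to listen for client connections) starts up and continues to run. That is because the ServerSocket used to listen for client Socket connections is created on a non-daemon Thread (i.e. a blocking Thread), which prevents the JVM process from shutting down. Otherwise, a CacheServer would fall straight through as well.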
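To illustrate why a listening socket on a non-daemon Thread keeps a process up, here is a small JDK-only sketch. It is not how Geode's CacheServer is implemented; the class BlockingAcceptDemo and its members are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

// Illustrative only: a non-daemon thread blocked in accept() keeps the JVM alive.
final class BlockingAcceptDemo {

    final ServerSocket serverSocket;
    final Thread acceptor;

    private BlockingAcceptDemo(ServerSocket serverSocket) {
        this.serverSocket = serverSocket;
        this.acceptor = new Thread(this::acceptLoop, "demo-acceptor");
        this.acceptor.setDaemon(false); // explicit: this thread must block JVM exit
    }

    // Binds to an ephemeral port (0) and starts accepting connections.
    static BlockingAcceptDemo start() {
        try {
            BlockingAcceptDemo demo = new BlockingAcceptDemo(new ServerSocket(0));
            demo.acceptor.start();
            return demo;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void acceptLoop() {
        while (!serverSocket.isClosed()) {
            try (Socket client = serverSocket.accept()) {
                // A real server would hand the connection off here.
            } catch (IOException e) {
                // accept() throws once the socket is closed; the loop then ends.
            }
        }
    }

    // Closes the listening socket, which lets the acceptor thread finish.
    void stop() {
        try {
            serverSocket.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BlockingAcceptDemo demo = start();
        System.out.println("Listening on port " + demo.serverSocket.getLocalPort());
        // main() returns here, but the JVM stays up until stop() is called.
    }
}
```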

You might be thinking, well, how does Gfsh prevent Locators (i.e. using the start locator command) and "servers" (i.e. using the start server command) from shutting down?

NOTE: By default, Gfsh creates a CacheServer instance when starting a GemFire/Geode server using the start server command. The CacheServer component of the "server" can be disabled by specifying the --disable-default-server option to the start server command. In this case, the "server" will not be able to serve clients. The peer node/member will still continue to run, but not without extra help. See here for more details on the start server Gfsh command.

So, how does Gfsh prevent the processes from falling through?

Under the hood, Gfsh uses the LocatorLauncher and ServerLauncher classes to configure and fork the JVM processes that start Locators and servers, respectively.

For example, here is Gfsh's start locator command using the LocatorLauncher class. Technically, it uses the configuration from the LocatorLauncher class instance to construct (and specifically, here) the java command-line used to fork and launch (and specifically, here) a separate JVM process.

However, the key here is the specific "command" passed to the LocatorLauncher class when starting the Locator, which is the START command (here).

In the LocatorLauncher class, we see that the START command does the following: from the main method, to the run method, it starts the Locator, then waits on the Locator by calling waitOnLocator (with implementation).

Without the wait, the Locator would fall straight through, as you are experiencing.

You can simulate the same effect (i.e. "falling straight through") with the following code, which uses the Apache Geode API to configure and start a Locator (in-process).

import org.apache.geode.distributed.LocatorLauncher;

public class ApacheGeodeLocatorApplication {

    public static void main(String[] args) {

        // Configure and build an in-process Locator.
        LocatorLauncher locatorLauncher = new LocatorLauncher.Builder()
            .set("jmx-manager", "true")
            .set("jmx-manager-port", "0")
            .set("jmx-manager-start", "true")
            .setMemberName("ApacheGeodeBasedLocator")
            .setPort(0)
            .build();

        locatorLauncher.start();

        // Uncomment to block the main thread and keep the JVM running.
        //locatorLauncher.waitOnLocator();
    }
}

This simple little program will fall straight through. However, if you uncomment locatorLauncher.waitOnLocator(), then the JVM process will block.

This is not unlike what SDG's LocatorFactoryBean class (see source) actually does. It too uses the LocatorLauncher class to configure and bootstrap a Locator in-process. The LocatorFactoryBean is the class used to configure and bootstrap a Locator when you declare the SDG @LocatorApplication annotation on your @SpringBootApplication class.

However, I do think there is room for improvement here. Therefore, I have filed DATAGEODE-361.

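In the meantime, as a workaround, you can achieve the same effect of a blocking Locator by having a look at the Smoke Test for this in the Spring Boot for Apache Geode (SBDG) project. See here.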
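The linked Smoke Test is not reproduced here. As a rough illustration of the general idea only (parking a non-daemon Thread so the JVM cannot exit), the sketch below uses a CountDownLatch from the JDK. JvmBlocker is a hypothetical helper, not a class from Spring, SDG, SBDG or Apache Geode, and the approach is an assumption about one way to do this rather than what the Smoke Test actually does.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical helper (not part of Spring or Apache Geode): parks the calling
// thread until release() is called, which keeps the JVM alive in the meantime.
final class JvmBlocker {

    private final CountDownLatch latch = new CountDownLatch(1);

    // Blocks the calling thread until release() is invoked or the thread is interrupted.
    void block() {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }

    // Lets any thread parked in block() continue.
    void release() {
        latch.countDown();
    }

    boolean isReleased() {
        return latch.getCount() == 0;
    }

    public static void main(String[] args) {
        JvmBlocker blocker = new JvmBlocker();
        // In the Spring Boot case, SpringApplication.run(...) would be called first.
        System.out.println("Blocking the main thread; press Ctrl+C to exit");
        blocker.block();
    }
}
```

In the application from the question, the equivalent would be to call something like block() on the line after SpringApplication.run(ServerApplication.class, args), so that the main Thread never returns while the Locator is running.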

However, after DATAGEODE-361 is complete, the extra logic to prevent the Locator JVM process from shutting down will no longer be necessary.