How to resume RMI communication in Java after one program shuts down and restarts while the other keeps running?
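I currently have a lot of code, and it would be hard to boil it all down into an SSCCE, but I may try to do that later if necessary.
Anyway, here is the gist: I have two processes communicating via RMI. It works. However, I want communication to be able to continue if the host process (JobViewer) exits and then returns, all within the lifetime of the client process (Job).
Currently, every time a Job starts up I save its bound name to a file, and the JobViewer opens that file on startup. That part works well, and the correct bound name is read. However, I get a NotBoundException every time I try to resume communication with a Job that I know for a fact is still running when the JobViewer restarts.
My JobViewer implements an interface that extends Remote with the following methods: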
public void registerClient(String bindedName, JobStateSummary jobSummary) throws RemoteException, NotBoundException;
public void giveJobStateSummary(JobStateSummary jobSummary) throws RemoteException;
public void signalEndOfClient(JobStateSummary jobSummary) throws RemoteException;
My Job also implements a different interface that extends Remote with the following methods:
public JobStateSummary getJobStateSummary() throws RemoteException;
public void killRemoteJob() throws RemoteException;
public void stopRemoteJob() throws RemoteException;
public void resumeRemoteJob() throws RemoteException;
How do I achieve this? Here is some of my current code that initializes RMI, if it helps...
JobViewer side:
private Registry _registry;
// Set up RMI
_registry = LocateRegistry.createRegistry(2002);
_registry.rebind("JOBVIEWER_SERVER", this);
Job side:
private NiceRemoteJobMonitor _server;
Registry registry = LocateRegistry.getRegistry(hostName, port);
registry.rebind(_bindedClientName, this);
Remote remoteServer = registry.lookup(masterName);
_server = (NiceRemoteJobMonitor)remoteServer;
_server.registerClient(_bindedClientName, _jobStateSummary);
"I get a NotBoundException every time I try to resume communication with a Job that I know for a fact is still running when the JobViewer restarts."
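That can only happen if the JobViewer did not rebind itself when it started up. What you would more usually get is a NoSuchObjectException, which is thrown when you use a stale stub, i.e. a stub whose remote object has exited. In that case you should reacquire the stub, i.e. redo the lookup().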
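The stale-stub case can be reproduced in a single JVM. The sketch below is only an illustration under assumptions: the class StaleStubSketch, the reduced JobMonitor interface with a String-returning ping(), and the in-process "restart" (unexport one object, export another under the same name) are all hypothetical stand-ins for the real JobViewer. It catches RemoteException broadly, because a restarted server usually surfaces as NoSuchObjectException while a server that is still down usually surfaces as ConnectException.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class StaleStubSketch {
    // Hypothetical, reduced version of the JobViewer's remote interface.
    public interface JobMonitor extends Remote {
        String ping() throws RemoteException;
    }

    static class JobViewer implements JobMonitor {
        private final String id;
        JobViewer(String id) { this.id = id; }
        public String ping() { return id; }
    }

    // Unexports an object, ignoring the case where it is not exported.
    private static void quietUnexport(Remote obj) {
        try {
            UnicastRemoteObject.unexportObject(obj, true);
        } catch (NoSuchObjectException ignored) {
            // Already unexported or never exported.
        }
    }

    public static String demo(int port) throws Exception {
        // Demo-only assumption: keep all stubs on the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        Registry registry = LocateRegistry.createRegistry(port);
        JobViewer first = new JobViewer("first");
        JobViewer second = new JobViewer("second");
        try {
            registry.rebind("JOBVIEWER_SERVER", UnicastRemoteObject.exportObject(first, 0));
            JobMonitor server = (JobMonitor) registry.lookup("JOBVIEWER_SERVER");
            String log = server.ping();

            // Simulate a JobViewer restart: the old remote object goes away
            // and a new one is bound under the same name.
            UnicastRemoteObject.unexportObject(first, true);
            registry.rebind("JOBVIEWER_SERVER", UnicastRemoteObject.exportObject(second, 0));

            try {
                log += "," + server.ping(); // the old stub is now stale
            } catch (RemoteException stale) {
                // Reacquire the stub by redoing the lookup, then retry.
                server = (JobMonitor) registry.lookup("JOBVIEWER_SERVER");
                log += ",stale," + server.ping();
            }
            return log;
        } finally {
            quietUnexport(first);
            quietUnexport(second);
            quietUnexport(registry);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo(2002));
    }
}
```

In the real two-process setup the lookup would go through LocateRegistry.getRegistry(hostName, port) rather than a registry created in the same JVM.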
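Why is the client binding itself to a Registry? If you want to register a callback, just pass this to the registerClient() method instead of the bind name, and adjust its signature accordingly (using the client's remote interface as the parameter type). There is no need for the server to look the client up in a Registry. There is no need for a client Registry at all.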
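A minimal sketch of that callback style follows, under stated assumptions: RemoteJob and JobMonitor are hypothetical, cut-down versions of the two interfaces in the question, the summary is a plain String instead of JobStateSummary, and both sides run in one JVM purely so the example is runnable on its own. Only the JobViewer is bound in the Registry; the Job exports itself and hands its own stub to registerClient().

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CallbackSketch {
    // Hypothetical stand-in for the Job's remote interface.
    public interface RemoteJob extends Remote {
        String getJobStateSummary() throws RemoteException;
    }

    // Hypothetical stand-in for the JobViewer's remote interface:
    // the parameter is the client's remote interface, not a bind name.
    public interface JobMonitor extends Remote {
        void registerClient(RemoteJob job) throws RemoteException;
    }

    static class JobViewer implements JobMonitor {
        final List<RemoteJob> jobs = new CopyOnWriteArrayList<>();
        public void registerClient(RemoteJob job) { jobs.add(job); }
    }

    static class Job implements RemoteJob {
        public String getJobStateSummary() { return "RUNNING"; }
    }

    // Unexports an object, ignoring the case where it is not exported.
    private static void quietUnexport(Remote obj) {
        try {
            UnicastRemoteObject.unexportObject(obj, true);
        } catch (NoSuchObjectException ignored) {
            // Already unexported or never exported.
        }
    }

    public static String demo(int port) throws Exception {
        // Demo-only assumption: keep all stubs on the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        JobViewer viewer = new JobViewer();
        Job job = new Job();
        Registry registry = LocateRegistry.createRegistry(port);
        try {
            // JobViewer side: export and bind the server only.
            registry.rebind("JOBVIEWER_SERVER", UnicastRemoteObject.exportObject(viewer, 0));

            // Job side: export itself, look up the server, pass its own stub.
            RemoteJob jobStub = (RemoteJob) UnicastRemoteObject.exportObject(job, 0);
            JobMonitor server = (JobMonitor) LocateRegistry
                    .getRegistry("127.0.0.1", port).lookup("JOBVIEWER_SERVER");
            server.registerClient(jobStub);

            // JobViewer side: call back through the stub it was given.
            return viewer.jobs.get(0).getJobStateSummary();
        } finally {
            quietUnexport(job);
            quietUnexport(viewer);
            quietUnexport(registry);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo(2002));
    }
}
```

With this shape, a restarted JobViewer has no stubs for running Jobs until each Job calls registerClient() again, so the Job still needs some way to notice the restart and re-register.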
My solution was to have the Job ping the JobViewer every so often:
while (true) {
    try {
        _server.ping();
        // If control reaches here, we successfully pinged the job monitor.
    } catch (Exception e) {
        System.out.println("Job lost contact with the job monitor at " + new Date() + " ...");
        // If control reaches here, we were unable to ping the job monitor. Loop until it presumably comes back to life.
        boolean foundServer = false;
        while (!foundServer) {
            try {
                // Attempt to register again.
                Registry registry = LocateRegistry.getRegistry(_hostName, _port);
                registry.rebind(_bindedClientName, NiceSupervisor.this);
                Remote remoteServer = registry.lookup(_masterName);
                _server = (NiceRemoteJobMonitor) remoteServer;
                _server.registerClient(_bindedClientName, _jobStateSummary);
                // Ping the server for good measure.
                _server.ping();
                System.out.println("Job reconnected with the job monitor at " + new Date() + " ...");
                // If control reaches here, we reconnected to the job monitor and pinged it again.
                foundServer = true;
            } catch (Exception x) {
                System.out.println("Job still cannot contact the job monitor at " + new Date() + " ...");
            }
            // Sleep for 1 minute before we try to locate the registry again.
            try {
                Thread.sleep(PING_WAIT_TIME);
            } catch (InterruptedException x) {
                // Ignored: keep retrying.
            }
        } // End of loop that runs until we find the server again.
    }
    // Sleep for 1 minute after we ping the server before we try again.
    try {
        Thread.sleep(PING_WAIT_TIME);
    } catch (InterruptedException e) {
        // Ignored: keep pinging.
    }
} // End of endless loop that we never exit.