Restarting RPC service

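I have a client process that forks a child process to listen for incoming RPCs via the svc_run() method. What I need to do is kill that child process from the parent and then fork a new child, giving it a new CLIENT* to a new RPC server.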

Here are the relevant parts of my code:

// Client Main
CLIENT* connectionToServer;
int pipe[2];
int childPID;
int parentPID;

static void usr2Signal()
{
  ServerData sd;
  clnt_destroy(connectionToServer);
  (void) read(pipe[0], &sd, sizeof(sd));


  // Kill child process.
  kill(childPID, SIGTERM);
  close(pipe[0]);


  // RPC connection to the new server
  CLIENT *newServerConn =
      clnt_create(
        sd.ip,
        sd.programNum,
        1,
        "tcp");

  if (!newServerConn)
  {
    // Connection error.
    exit(1);
  }

  connectionToServer = newServerConn;


  // Respawn child process.
  if (pipe(pipe) == -1)
  {
      // Pipe error.
      exit(2);
  }

  childPID = fork();
  if (childPID == -1)
  {
    // Fork error.
    exit(3);
  }
  if (childPID == 0)
  {
    // child closes read pipe and listens for RPCs.
      close(pipe[0]);
      parentPID = getppid();
      svc_run();
  }
  else
  {
    // parent closes write pipe and returns to event loop.
    close(pipe[1]);
  }
}

int main(int argc, char *argv[])
{
    /* Some initialization code */

    transp = svctcp_create(RPC_ANYSOCK, 0, 0);
    if (transp == NULL) {
        // TCP connection error.
        exit(1);
    }

    if (!svc_register(transp, /*other RPC program args*/, IPPROTO_TCP))
    {
        // RPC register error
        exit(1);
    }



  connectionToServer = clnt_create(
        "192.168.x.xxx", // Server IP (clnt_create expects the host as a string).
        0x20000123,     // Server RPC Program Number
        1,              // RPC Version
        "tcp");

  if (!connectionToServer)
  {
    // Connection error
    exit(1);
  }

  // Spawn child process first time.
  if (pipe(pipe) == -1) 
  {
    // Pipe error
    exit(1);
  }

  childPID = fork();
  if (childPID == -1)
  {
    // Fork error.
    exit(1);
  }

  if (childPID == 0)
  {
    // Close child's read pipe.
    close(pipe[0]);
    parentPID = getppid();

    // Listen for incoming RPCs.
    svc_run ();
    exit (1);
  }


  /* Signal/Communication Code */

  // Close parent write pipe.
  close(pipe[1]);

  // Parent runs in event loop infinitely until a signal is sent.
  eventLoop();
  cleanup();
}

In my server code, I have a service call that initiates the new connection. This call is invoked by some other operation on the server.

// Server Services
void newserverconnection_1_svc(int *unused, struct svc_req *s)
{
    // This service is defined in the server code

    ServerData sd;
    /* Fill sd with data:
         Target IP: 192.168.a.aaa
         RPC Program Number: 0x20000321
         ... other data
    */

    connecttonewserver_1(&sd, connectionToServer); // A client service.
}

Back on my client, I have the following service:

// Client Service
void connecttonewserver_1_svc(ServerData *sd, struct svc_req *s)
{
    // Send the new server connection data to the parent client process
    // via the pipe and signal the parent.
    // sizeof(*sd) is the size of the struct; sizeof(sd) would only be the pointer size.
    write(pipe[1], sd, sizeof(*sd));
    kill(parentPID, SIGUSR2);
}
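
The pipe hand-off in the service above depends on the whole struct arriving on the other side. Applying sizeof to a pointer gives the pointer size rather than the struct size, and read()/write() may transfer fewer bytes than requested. The following is a minimal sketch of a full-record transfer, not the original code: the ServerData layout is a hypothetical stand-in (the real definition is not shown), and write_all, read_all, send_server_data and roundtrip_example are made-up helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for ServerData; the real layout is not shown. */
typedef struct {
    char ip[64];
    unsigned long programNum;
} ServerData;

/* Write exactly len bytes, retrying after short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; fails on error or if the pipe closes early. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1; /* writer closed before a full record arrived */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* sd is a pointer here, so the record size is sizeof(*sd), not sizeof(sd). */
static int send_server_data(int fd, const ServerData *sd)
{
    return write_all(fd, sd, sizeof(*sd));
}

/* Usage: round-trip one record through a pipe inside a single process. */
static void roundtrip_example(void)
{
    int fds[2];
    ServerData out, in;
    int rc;

    memset(&out, 0, sizeof out);
    strcpy(out.ip, "192.0.2.1"); /* documentation-range address */
    out.programNum = 0x20000321UL;

    rc = pipe(fds);
    assert(rc == 0);
    rc = send_server_data(fds[1], &out);
    assert(rc == 0);
    rc = read_all(fds[0], &in, sizeof in);
    assert(rc == 0);
    assert(memcmp(&out, &in, sizeof out) == 0);

    close(fds[0]);
    close(fds[1]);
}
```

A record this small fits in the pipe buffer, so the single-process round trip does not block; between parent and child the same helpers apply unchanged.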

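My problem is that everything runs fine until I initiate the new connection. I don't hit any of the error branches, but about 5 seconds after the new connection is established, my client becomes unresponsive. It doesn't crash, and the child process appears to still be alive, but my client no longer receives RPCs or shows any print statements when the events I defined for the parent in the event loop are triggered by mouse clicks. I am probably doing something wrong when spawning this new RPC loop for the child process, but I can't see what. Any ideas?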

So this solution achieves the result I was looking for, but it is definitely far from perfect.

static void usr2Signal()
{
  ServerData sd;
  // clnt_destroy(connectionToServer); // Removed this as it closes the RPC connection.
  (void) read(pipe[0], &sd, sizeof(sd));


  // Removed these. Killing the child process also seems to close the
  // connection. Just let the child run.
  // kill(childPID, SIGTERM);
  // close(pipe[0]);


  // RPC connection to the new server
  CLIENT *newServerConn =
      clnt_create(
        sd.ip,
        sd.programNum,
        1,
        "tcp");

  if (!newServerConn)
  {
    // Connection error.
    exit(1);
  }

  // This is the only necessary line. Note that the old
  // connectionToServer pointer was not deregistered/deallocated,
  // so this causes a memory leak, but is a quick fix to my issue.
  connectionToServer = newServerConn;


    // Removed the rest of the code that spawns a new child process
    // as it is not needed anymore.

}
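
One thing that may be worth considering on top of this: the handler still calls clnt_create() and exit(), which are not among the functions POSIX lists as safe to call from a signal handler. Whether that relates to the hang described is not established here. A common pattern is to have the handler only set a flag and let the event loop do the real work. The sketch below illustrates that pattern under assumptions: on_usr2, install_usr2_handler, take_reconnect_request and handler_example are hypothetical names, and the call site in eventLoop() is only indicated in a comment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler, consumed by the event loop. */
static volatile sig_atomic_t reconnect_requested = 0;

/* The handler does nothing except record that SIGUSR2 arrived. */
static void on_usr2(int signo)
{
    (void)signo;
    reconnect_requested = 1;
}

/* Install the handler with sigaction(); returns 0 on success, -1 on error. */
static int install_usr2_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART; /* restart interrupted system calls */
    return sigaction(SIGUSR2, &sa, NULL);
}

/*
 * Called from the event loop. Returns 1 if a reconnect was requested since
 * the last call and clears the flag. SIGUSR2 is blocked while the flag is
 * read and cleared so a signal cannot slip in between the two steps.
 */
static int take_reconnect_request(void)
{
    sigset_t block, old;
    int pending;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR2);
    sigprocmask(SIG_BLOCK, &block, &old);
    pending = reconnect_requested;
    reconnect_requested = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return pending;
}

/*
 * Usage. In the real event loop the body would be roughly:
 *   if (take_reconnect_request()) { read the pipe, clnt_create(), swap }
 */
static void handler_example(void)
{
    int rc = install_usr2_handler();
    assert(rc == 0);
    assert(take_reconnect_request() == 0); /* nothing pending yet */
    raise(SIGUSR2);                        /* handler runs before raise() returns */
    assert(take_reconnect_request() == 1); /* request observed once */
    assert(take_reconnect_request() == 0); /* and then cleared */
}
```

With this shape, the pipe read and the RPC client setup run in normal program context, where any library function may be called.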