Capturing stdout of a command in SML

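I'm trying to capture the output of a command run with Posix.Process.execp. I ported some C code I found on Stack Overflow, and I can capture the output of one execution, but I can't get the output of a second execution.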

Here is my function:

(* Runs a command c (command and argument list) using Posix.Process.execp. *)
(* If we successfully run the program, we return the lines output to stdout *)
(* in a list, along with SOME of the exit code. *)
(* If we fail to run the program, we return the error message in the list *)
(* and NONE. *)
fun execpOutput (c : string * string list) : (string list * Posix.Process.exit_status option) =
  let fun readAll () = case TextIO.inputLine TextIO.stdIn
                    of SOME s => s :: (readAll ())
                     | NONE => []
      (* Create a new pipe *)
      val { infd = infd, outfd = outfd } = Posix.IO.pipe ()
  in case Posix.Process.fork ()
      of NONE => (
      (* We are the child. First copy outfd to stdout; they will *)
      (* point to the same file descriptor and can be used interchangeably. *)
      (* See dup(2) for details. Then close infd: we don't need it and don't *)
      (* want to block because we have open file descriptors lying around *)
      (* when we want to exit. *)
      ( Posix.IO.dup2 { old = outfd, new = Posix.FileSys.stdout }
      ; Posix.IO.close infd
      ; Posix.Process.execp c )
      handle OS.SysErr (err, _) => ([err], NONE) )
       | SOME pid =>
     (* We are the parent. This time, copy infd to stdin, and get rid of the *)
     (* outfd we don't need. *)
     let val _ = ( Posix.IO.dup2 { old = infd, new = Posix.FileSys.stdin }
                 ; Posix.IO.close outfd )
         val (_, status) = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])
     in (readAll (), SOME status) end
  end

val lsls = (#1 (execpOutput ("ls", ["ls"]))) @ (#1 (execpOutput ("ls", ["ls"])))
val _ = app print lsls

Here is the corresponding output:

rak@zeta:/tmp/test$ ls
a  b  c
rak@zeta:/tmp/test$ echo 'use "/tmp/mwe.sml";' | sml
Standard ML of New Jersey v110.79 [built: Tue Aug  8 16:57:33 2017]
- [opening /tmp/mwe.sml]
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
a
b
c
val execpOutput = fn
  : string * string list -> string list * ?.POSIX_Process.exit_status option
val lsls = ["a\n","b\n","c\n"] : string list
val it = () : unit
-

Any suggestions on what I'm doing wrong?

This is not a complete answer.

If you run sml interactively, you will notice that the interpreter exits after the first call:

$ sml
Standard ML of New Jersey v110.81 [built: Wed May 10 21:25:32 2017]
- use "mwe.sml";
[opening mwe.sml]
...
- execpOutput ("ls", ["ls"]);
val it = (["mwe.sml\n"],SOME W_EXITED)
  : string list * ?.POSIX_Process.exit_status option
- 
$ # I didn't type exit ();

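Your program appears to be an adaptation of this pipe/fork/exec example in C, which does work. The only apparent difference is the C line fdopen(pipefd[0], "r"), where you instead write Posix.IO.dup2 { old = infd, new = Posix.FileSys.stdin }.

I would investigate whether these two are really meant to give the same result. You could run strace on each program and see where their system calls diverge. I haven't gotten any further with it, though.

I tried running strace sml nwe.sml 2>&1 | grep -v getrusage on: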

fun readAll () = case TextIO.inputLine TextIO.stdIn
                   of SOME s => s :: readAll ()
                    | NONE => []

fun execpOutput (c : string * string list) : string list =
  let val { infd = infd, outfd = outfd } = Posix.IO.pipe ()
  in case Posix.Process.fork ()
          (* Child *)
       of NONE => (( Posix.IO.close infd
                   ; Posix.IO.dup2 { old = outfd, new = Posix.FileSys.stdout }
                   ; Posix.IO.dup2 { old = outfd, new = Posix.FileSys.stderr }
                   ; Posix.Process.execp c )
                  handle OS.SysErr (err, _) => [err] )
         (* Parent *)
        | SOME pid =>
          let val _ = Posix.IO.close outfd
              val _ = Posix.IO.dup2 { old = infd, new = Posix.FileSys.stdin }
              val _ = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])
          in readAll () end
  end

val _ = app print (execpOutput ("ls", ["ls"]));
val _ = app print (execpOutput ("ls", ["ls"]));

I also tried running strace ./mwe on the compiled output of:

#include <stdio.h>
#include <stdlib.h>
#include <spawn.h>
#include <sys/wait.h>
#include <unistd.h>

void execpOutput(char *cmd[])
{
    pid_t pid;
    int pipefd[2];
    FILE* output;
    char line[256];
    int status;

    pipe(pipefd);
    pid = fork();
    if (pid == 0) {
        close(pipefd[0]);
        dup2(pipefd[1], STDOUT_FILENO);
        dup2(pipefd[1], STDERR_FILENO);
        execvp(cmd[0], cmd);
        /* Only reached if execvp fails: report and leave the child. */
        perror("execvp");
        _exit(127);
    } else {
        close(pipefd[1]);
        output = fdopen(pipefd[0], "r");
        while (fgets(line, sizeof(line), output)) {
            printf("%s", line);
        }
        waitpid(pid, &status, 0);
    }
}

int main(int argc, char *argv[])
{
    char *cmd[] = {"ls", NULL};

    execpOutput(cmd);
    execpOutput(cmd);
    return 0;
}
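
The C program differs from the SML port in two visible ways: it reads from pipefd[0] directly instead of duplicating it onto stdin, and it drains the pipe before calling waitpid. Whether either difference explains the failure is not established here, but an SML version that mirrors both choices might look like the sketch below. The names readAllFd and captureOutput are hypothetical, and the child is assumed to leave through Posix.Process.exit if the exec fails.

```sml
(* Sketch only: readAllFd and captureOutput are illustrative names. *)

(* Read from a descriptor until end-of-file, which readVec signals by
   returning an empty vector. *)
fun readAllFd (fd : Posix.IO.file_desc) : string =
  let fun loop acc =
        let val chunk = Posix.IO.readVec (fd, 4096)
        in if Word8Vector.length chunk = 0
           then String.concat (rev acc)
           else loop (Byte.bytesToString chunk :: acc)
        end
  in loop [] end

(* Run a command and return its stdout together with its exit status.
   The parent's own stdin and stdout are left untouched. *)
fun captureOutput (c : string * string list) : string * Posix.Process.exit_status =
  let val { infd, outfd } = Posix.IO.pipe ()
  in case Posix.Process.fork ()
      of NONE => (* Child: stdout becomes the write end of the pipe. *)
         (( Posix.IO.close infd
          ; Posix.IO.dup2 { old = outfd, new = Posix.FileSys.stdout }
          ; Posix.IO.close outfd
          ; Posix.Process.execp c )
          (* If the exec fails, leave the child instead of returning. *)
          handle OS.SysErr _ => Posix.Process.exit 0w127)
       | SOME pid => (* Parent: drain the pipe first, then reap the child. *)
         let val _ = Posix.IO.close outfd
             val out = readAllFd infd
             val _ = Posix.IO.close infd
             val (_, status) = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])
         in (out, status) end
  end

(* Two consecutive captures, each on its own pipe. *)
val (first, _) = captureOutput ("echo", ["echo", "one"])
val (second, _) = captureOutput ("echo", ["echo", "two"])
val _ = print (first ^ second)
```

Reading before waiting also avoids a possible stall when the command writes more than the pipe can buffer, since the child would otherwise block on write while the parent blocks in waitpid.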

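My initial attempt consisted of

  1. Creating a pipe
  2. Setting the child's stdout to the write end of the pipe
  3. Setting the parent's stdin to the read end of the pipe

This did not work the second time, perhaps because of some race condition (running it under strace -f showed the second child writing to the write end of the second pipe, but the parent never managed to read from the read end of the second pipe). I realized that this approach is also suboptimal because it clobbers stdin.

My colleague pointed out that I was effectively trying to implement a variant of popen(3). A better approach is to implement popen and return a file descriptor for the desired end of the pipe, rather than clobbering the parent's stdin/stdout. It is also symmetric, in that the user can specify whether they want the read end or the write end of the pipe. Here is what I came up with (feedback welcome).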

structure Popen :>
      sig
          (* Parent wants to write to stdin, read stdout, or read stdout + stderr *)
          datatype pipe_type = PIPE_W | PIPE_R | PIPE_RE
          val popen : string * pipe_type -> Posix.IO.file_desc
          val pclose : Posix.IO.file_desc -> Posix.Process.exit_status option
      end =
struct

datatype pipe_type = PIPE_W | PIPE_R | PIPE_RE

type pinfo = { fd : Posix.ProcEnv.file_desc, pid : Posix.Process.pid }

val pids : pinfo list ref = ref []

(* Implements popen(3) *)
fun popen (cmd, t) =
  let val { infd = readfd, outfd = writefd } = Posix.IO.pipe ()
  in case (Posix.Process.fork (), t)
      of (NONE, t) => (* Child *)
     (( case t
         of PIPE_W => Posix.IO.dup2 { old = readfd, new = Posix.FileSys.stdin }
          | PIPE_R => Posix.IO.dup2 { old = writefd, new = Posix.FileSys.stdout }
          | PIPE_RE => ( Posix.IO.dup2 { old = writefd, new = Posix.FileSys.stdout }
                       ; Posix.IO.dup2 { old = writefd, new = Posix.FileSys.stderr })
      ; Posix.IO.close writefd
      ; Posix.IO.close readfd
      ; Posix.Process.execp ("/bin/sh", ["sh", "-c", cmd]))
      handle OS.SysErr (err, _) =>
             ( print ("Fatal error in child: " ^ err ^ "\n")
             ; OS.Process.exit OS.Process.failure ))
       | (SOME pid, t) => (* Parent *)
     let val fd = case t of PIPE_W => (Posix.IO.close readfd; writefd)
                          | PIPE_R => (Posix.IO.close writefd; readfd)
                          | PIPE_RE => (Posix.IO.close writefd; readfd)
         val _ = pids := ({ fd = fd, pid = pid } :: !pids)
     in fd end
  end

(* Implements pclose(3) *)
fun pclose fd =
  case List.partition (fn { fd = f, pid = _ } => f = fd) (!pids)
   of ([], _) => NONE
    | ([{ fd = _, pid = pid }], pids') =>
      let val _ = pids := pids'
      val (_, status) = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])
      val _ = Posix.IO.close fd
      in SOME status end
    | _ => raise Bind (* This should be impossible. *)
end

val f = Popen.popen("ls", Popen.PIPE_R);
val g = Popen.popen("read line; echo $line>/tmp/foo", Popen.PIPE_W);
val _ = Posix.IO.writeVec (g, Word8VectorSlice.full (Byte.stringToBytes "Hello World! I was written by g\n"));
val h = Popen.popen("cat /tmp/foo", Popen.PIPE_R);
val i = Popen.popen("echo 'to stderr i' 1>&2", Popen.PIPE_R);
val j = Popen.popen("echo 'to stderr j' 1>&2", Popen.PIPE_RE);
val _ = app (fn fd => print (Byte.bytesToString (Posix.IO.readVec (fd, 1000)))) [f, h, i, j];
val _ = map Popen.pclose [f, g, h, i, j];
val _ = OS.Process.exit OS.Process.success;

The output is then:

rak@zeta:~/popen$ rm /tmp/foo && ls && sml popen.sml
popen.sml
Standard ML of New Jersey v110.79 [built: Tue Aug  8 16:57:33 2017]
[opening popen.sml]
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
popen.sml:42.52 Warning: calling polyEqual
structure Popen :
  sig
    datatype pipe_type = PIPE_R | PIPE_RE | PIPE_W
    val popen : string * pipe_type -> ?.POSIX_IO.file_desc
    val pclose : ?.POSIX_IO.file_desc -> ?.POSIX_Process.exit_status option
  end
val f = FD {fd=4} : ?.POSIX_IO.file_desc
val g = FD {fd=6} : ?.POSIX_IO.file_desc
[autoloading]
[autoloading done]
val h = FD {fd=5} : ?.POSIX_IO.file_desc
to stderr i
val i = FD {fd=7} : ?.POSIX_IO.file_desc
val j = FD {fd=8} : ?.POSIX_IO.file_desc
popen.sml
Hello World! I was written by g
to stderr j

Thanks to Simon Shine for the tip about running strace. I'm still not sure why my approach didn't work, but at least we know what is happening.