Why does the piped data between two processes seem truncated when too big?

We recently hit a problem in our project while trying to make a child process transfer a whole base64-encoded image (about 355 KB) to its parent process: the image seemed to be randomly truncated, and we still don't understand this behavior nor have we found a solution.

We found a workaround that transfers these images through temporary files instead, but we would still like to understand the limitation we are hitting with this inter-process communication.
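For illustration, here is a minimal sketch of what the Node side of that workaround can look like, assuming the parent creates a temporary file and passes its path as a command-line argument (the argument handling and payload below are hypothetical):

#!/usr/bin/env node

// Hypothetical sketch: write the payload to a file supplied by the parent
// instead of to stdout, so no pipe write can be cut short on exit.
const fs = require("fs");

const outPath = process.argv[2];  // temp file path created by the parent
const payload = Buffer.from("0".repeat(355000)).toString("base64");
fs.writeFileSync(outPath, payload);  // synchronous: completes before exit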

Here is the closest minimal reproducible example we managed to produce that highlights this behavior: a Python script tries to retrieve data from a Node child process that generates the data to be retrieved. But the length of the data the parent process is able to get seems to be capped in a non-deterministic way.

The example tests whether the requested data length and the actually retrieved length are equal. Here is the parent script, test.py:

#!/usr/bin/env python3

import base64
import sys
import json
import subprocess

def test(l, executable):
    # Spawn the child with its stdin/stdout/stderr connected to pipes.
    process = subprocess.Popen(
        executable,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # Send the requested length as JSON and read the child's whole output.
    stdout, stderr = process.communicate(input=json.dumps(l).encode())
    exit_code = process.returncode

    if exit_code != 0:
        raise RuntimeError("fail  : " + str(stderr))

    # Decode the base64 payload and compare its length with the request.
    result = base64.b64decode(stdout.decode("utf-8"))
    assert len(result) == l, f"{len(result)} != {l}"
    print(f"Success: {len(result)} == {l}")

if __name__ == "__main__":
    l = int(sys.argv[1]) if len(sys.argv) > 1 else 355000
    try:
        test(l, ["./test.js"])
    except AssertionError as e:
        print("fail :", e)
#!/usr/bin/env node

const http = require("http");
const serveHandler = require("serve-handler");
const btoa = require("btoa");

const EXIT_CODE_SUCCESS = 0;
const EXIT_CODE_ERROR = 4;


async function getDataFromStdin() {
    return new Promise((resolve, reject) => {
        let receivedData = '';

        process.stdin.on("data", chunk => {
            receivedData += chunk.toString();
        });

        process.stdin.on("end", () => {
            result = resolve(JSON.parse(receivedData)); 
            return result;
        });
    })
}

async function main(){
    const len  = await getDataFromStdin();
    const base64 = btoa("0".repeat(Number(len)));
    process.stdout.write(base64);    
}

let errorCode = EXIT_CODE_SUCCESS;
main()
    .catch(err => {
        console.error(err);
        errorCode = EXIT_CODE_ERROR;
    }).finally(() => {
        process.exit(errorCode);
    });
vagrant@sc-dev-machine:/home/vagrant $ ./test.py 1
Success: 1 == 1
vagrant@sc-dev-machine:/home/vagrant $ ./test.py 1000
Success: 1000 == 1000
vagrant@sc-dev-machine:/home/vagrant $ ./test.py 30000
Success: 30000 == 30000
vagrant@sc-dev-machine:/home/vagrant $ ./test.py 60000
fail : 49152 != 60000
vagrant@sc-dev-machine:/home/vagrant $ ./test.py 60000
Success: 60000 == 60000
vagrant@sc-dev-machine:/home/vagrant $ ./test.py 120000
fail : 49152 != 120000
vagrant@sc-dev-machine:/home/vagrant $ ./test.py 120000
fail : 98304 != 120000
vagrant@sc-dev-machine:/home/vagrant $ 

We also tried a solution based on subprocess.check_output(), with no better results.

What is the explanation for this? Is an EOF terminating the data chunks between the processes and through the pipe? Shouldn't the buffering (which we suspected to be the cause) be able to transfer the whole data?

Is there a proven way to transfer data (like a file or an image) between processes through a pipe without being limited in length?


Edit: here is some more information about the environment:

vagrant@sc-dev-machine:/home/vagrant $ uname -a
Linux sc-dev-machine 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
vagrant@sc-dev-machine:/home/vagrant $ python3 --version
Python 3.6.8

The problem is in your JavaScript code; you can find the explanation in the Node.js documentation for process.exit():

Calling process.exit() will force the process to exit as quickly as possible even if there are still asynchronous operations pending that have not yet completed fully, including I/O operations to process.stdout and process.stderr.

And:

In most situations, it is not actually necessary to call process.exit() explicitly. The Node.js process will exit on its own if there is no additional work pending in the event loop. The process.exitCode property can be set to tell the process which exit code to use when the process exits gracefully.

You are calling process.exit() before the call to process.stdout.write() has completed (writing to pipes is asynchronous on POSIX). This makes the JS process exit prematurely, interrupting the write before all the data has been flushed.
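As an illustration of the mechanism (not the recommended fix), process.stdout.write() accepts a callback that fires once the chunk has been handed off, so deferring the exit into that callback avoids cutting the write short:

// Sketch: exit only once the pending stdout write has completed.
process.stdout.write(base64, () => {
    process.exit(EXIT_CODE_SUCCESS);
});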

If you want to set an error code, you should set process.exitCode = errorCode instead and allow the event loop to end gracefully, without calling process.exit().
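Applied to the script from the question, a sketch of the tail end then becomes (same names as above):

let errorCode = EXIT_CODE_SUCCESS;
main()
    .catch(err => {
        console.error(err);
        errorCode = EXIT_CODE_ERROR;
    }).finally(() => {
        // Set the exit code and let the event loop drain on its own,
        // so the pending write to stdout can finish before the process exits.
        process.exitCode = errorCode;
    });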