如何在 AWS Lambda 函数中将整个程序打包或安装到 运行

How can I package or install an entire program to run in an AWS Lambda function

如果这是完全错误地使用 Lambda 的情况,请告诉我。

我要安装Scrapy into a Lambda function and invoke the function to begin a crawl. My first problem is how to install it, so that all of the paths are correct. I installed the program using the directory to be zipped as its root, so the zip contains all of the source files and the executable. I am basing my efforts on this文章。在它说要包含在我的函数开头的行中,"process" 变量来自哪里?我试过了,

var process = require('child_process');
var exec = process.exec;
process.env['PATH'] = process.env['PATH'] + ':' + 
process.env['LAMBDA_TASK_ROOT']

但我收到错误,

"errorMessage": "Cannot read property 'PATH' of undefined",
"errorType": "TypeError",

我需要包含所有库文件,还是只包含 /usr/lib 中的可执行文件?如何包含文章中说我需要的那一行代码?

编辑: 我尝试将代码移至 child_process.exec,并收到错误

"errorMessage": "Command failed: /bin/sh: process.env[PATH]: command not found\n/bin/sh: scrapy: command not found\n"

这是我当前的全部功能

console.log("STARTING");
var process = require('child_process');
var exec = process.exec;

exports.handler = function(event, context) {    
    //Run a fixed Python command.
    exec("process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT']; scrapy crawl backpage2", function(error, stdout) {
        console.log('Scrapy returned: ' + stdout + '.');
        context.done(error, stdout);
    });

};

这是一个运行 python 脚本的 Lambda 函数,将当前工作目录设置为与 Lambda 函数相同的目录。您可以通过对 python 脚本的相对位置进行一些修改来使用它。

var child_process = require("child_process");

exports.handler = function(event, context) {
    var execOptions = {
        cwd: __dirname
    };
    child_process.exec("python hello.py", execOptions, function (error, stdout, stderr) {
        if (error) {
            context.fail(error);
        } else {
            console.log("stdout:\n", stdout);
            console.log("stderr:\n", stderr);
            context.succeed();
        }
    });
};

您绝对可以在 Lambda 中从您的 Node 代码执行任意进程。例如,我们正在对 运行 处理玩家回合的游戏服务器执行此操作。我不确定你到底想用 Scrapy 完成什么,但请记住,你的整个 Lambda 调用现在在 AWS 上最多只能存在 60 秒!但如果这对您来说没问题,这里有一个完整的示例,说明我们如何从 Lambda 执行我们自己的任意 linux 进程。 (在我们的例子中,它是一个编译的二进制文件——真的没关系,只要你有一些东西可以 运行 在他们使用的 Linux 图像上)。

var child_process = require('child_process');
var path = require('path');

exports.handler = function (event, context) {
    // If timeout is provided in context, get it. Otherwise, assume 60 seconds
    var timeout = (context.getRemainingTimeInMillis && (context.getRemainingTimeInMillis() - 1000)) || 60000;
    // The task root is the directory with the code package.
    var taskRoot = process.env['LAMBDA_TASK_ROOT'] || __dirname;
    // The command to execute.
    var command;

    // Set up environment variables
    process.env.HOME = '/tmp'; // <-- for naive processes that assume $HOME always works! You might not need this.

    // On linux the executable is in task root / __dirname, whichever was defined
    process.env.PATH += ':' + taskRoot;
    command = 'bash -c "cp -R /var/task/YOUR_THING /tmp/; cd /tmp; ./YOUR_THING ARG1 ARG2 ETC"'

    child_process.exec(command, {
        timeout: timeout,
        env: process.env
    }, function (error, stdout, stderr) {
        console.log(stdout);
        console.log(stderr);
        context.done(null, {exit: true, stdout: stdout, stderr: stderr});
    });
};

您的示例的问题是将 node global variable process 修改为

var process = require('child_process');

这样你就不能改变 PATH 环境变量,因此你得到 Cannot read property 'PATH' of undefined.

的原因

只需为加载的 child_process 库使用不同的名称,例如

//this gets updated to child_process
var child_process = require('child_process');
var exec = child_process.exec;

//global process variable is still accessible
process.env['PATH'] = process.env['PATH'] + ':' + 
process.env['LAMBDA_TASK_ROOT']

//start new process with your binary
exec('path/to/your/binary',......