Why does this PhantomJS process cause "Directory <x> does not exist." errors?

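This is my first post, so apologies if the question needs revising. I've simplified the problem as much as I could, but there are a lot of components involved, so this post is fairly long...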

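Our ASP.NET MVC site is deployed on Azure as an App Service. I'm using an API controller method to generate a PDF of a page that exists on the same site. To do this, the controller creates a PhantomJS process, waits for it to succeed, then returns the contents of the file it created. This all works fine, but afterwards several views on the site produce errors like the following: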

Server Error in '/' Application.

Directory 'D:\home\site\wwwroot\Views\Location' does not exist. Failed to start monitoring file changes.

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.Web.HttpException: Directory 'D:\home\site\wwwroot\Views\Location' does not exist. Failed to start monitoring file changes.

After a while, the error changes to:

Server Error in '/' Application.

The view 'LocationList' or its master was not found or no view engine supports the searched locations. The following locations were searched:
~/Views/Location/LocationList.aspx
~/Views/Location/LocationList.ascx
~/Views/Shared/LocationList.aspx
~/Views/Shared/LocationList.ascx
~/Views/Location/LocationList.cshtml
~/Views/Location/LocationList.vbhtml
~/Views/Shared/LocationList.cshtml
~/Views/Shared/LocationList.vbhtml

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.InvalidOperationException: The view 'LocationList' or its master was not found or no view engine supports the searched locations. The following locations were searched:
~/Views/Location/LocationList.aspx
~/Views/Location/LocationList.ascx
~/Views/Shared/LocationList.aspx
~/Views/Shared/LocationList.ascx
~/Views/Location/LocationList.cshtml
~/Views/Location/LocationList.vbhtml
~/Views/Shared/LocationList.cshtml
~/Views/Shared/LocationList.vbhtml

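This only affects views that have not been compiled yet, or any other files that have not been accessed before. The only way to fix it is to manually stop and start the web app. I can confirm this doesn't happen with every process (running "echo.exe" instead of "phantomjs.exe" doesn't cause the broken behavior).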

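I've gone through every log I can think of and found nothing unusual. My best guess is that a process is being terminated forcibly or unexpectedly, but as to which one and why, I have no idea. Maybe there's some key log I don't know about?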

Here is the relevant C# code:

private static async Task<int> ExecuteSimpleAsync(string workingDir, double? timeout,
    string command, params string[] parameters)
{
    var paramStr = string.Join(" ", parameters.Select(x => x == null ? "" : $"\"{x}\"").ToList());
    var processInfo = new ProcessStartInfo(command, paramStr) {
        WorkingDirectory = workingDir,
        UseShellExecute  = false,                    
        CreateNoWindow   = true,
    };

    Process process = null;
    int exitCode = -1;
    using (process = new Process() { StartInfo = processInfo }) {
        process.Start();
        await process.WaitForExitAsync(timeout); // simple extension function to check for 'Process.HasExited' periodically
        exitCode = process.ExitCode;
    }
    return exitCode;
}


private static async Task<byte[]> GetFileContents(string filePath) {
    byte[] bytes = null;
    using (FileStream file = new FileStream(filePath, FileMode.Open, FileAccess.Read)) {
        bytes = new byte[file.Length];
        await file.ReadAsync(bytes, 0, (int) file.Length);
    }
    return bytes;
}


public static async Task<byte[]> RenderPdfAsync(
    string cookiesB64, string localUrl, string baseFilename, double? timeout = 60)
{
    ....

    // filesPath:  (directory for temporary output)
    // timeout:    60.000 (60 seconds)
    // PhantomJSExePath: (absolute path containing 'phantomjs.exe')
    // scriptFile: "rasterize_simple.js"
    // requestUrl: "TestReport/ForUserAndTestPdf/1002/10"
    // outputFile: "phantomjs-output-<timestamp>.pdf"
    // cookiesB64: (base64-encoded authentication cookies passed to request in PhantomJS)

    var exitCode = await ExecuteSimpleAsync(filesPath, timeout, PhantomJSExePath + @"\phantomjs.exe",
    scriptFile, requestUrl, outputFile, cookiesB64);
    if (exitCode != 0)
        return null;
    return await GetFileContents(outputFile);
}


[Authorize]
[HttpGet]
[Route("TestReport/ForUserAndTestPdf/{userId}/{testId}")]
public async Task<HttpResponseMessage> ForUserAndTestPdfAsync(int userId, int testId) {
    // produce a slightly-modified version of the current URL:
    //    /TestReport/ForUserAndTest/<userid>/<testid>
    // => /TestReport/ForUserAndTestPdf/<userid>/<testid>?print=true
    var url = Request.RequestUri.GetLocalPathWithParams("print=true").Replace("ForUserAndTest", "ForUserAndTestPdf");

    // get the cookies used in the current request and convert to a base64-encoded JSON object
    var cookiesB64 = Request.GetCookiesJsonB64();
    var bytes = await PhantomJSHelpers.RenderPdfAsync(cookiesB64, url, "phantomjs-output", 60);

    var message = new HttpResponseMessage(HttpStatusCode.OK);
    message.Content = new StreamContent(new MemoryStream(bytes));
    message.Content.Headers.ContentLength = bytes.Length;
    message.Content.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");
    return message;
}

Here is the relevant part of the "rasterize_simple.js" script used by PhantomJS, with the page size setup, cookie handling, etc. left out:

page.open(address, function(status) {
    page.render(outputFilename);
    phantom.exit(0);
});

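The expected result of all this is that it generates the PDF file, and all subsequent calls to this API method (with different parameters) work flawlessly. The side effect, however, is that the site is completely broken :(

Any help would be greatly appreciated!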

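I'm afraid some of what your ASP.NET application does, such as forking a process to run PhantomJS and generate a PDF file, cannot work properly in an Azure Web App, because there are many restrictions there that do not allow it. See the Kudu wiki page Azure Web App sandbox for more information.

Here are some of the restrictions that I think apply to you.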

  1. PDF generation from HTML: There are multiple libraries used to convert HTML to PDF. Many Windows/.NET specific versions leverage IE APIs and therefore leverage User32/GDI32 extensively. These APIs are largely blocked in the sandbox (regardless of plan) and therefore these frameworks do not work in the sandbox.

  2. Unsupported frameworks: Here is a list of frameworks and scenarios that have been found not to be usable due to one or more of the restrictions above. It's conceivable that some will be supported in the future as the sandbox evolves.

    PDF generators failing due to restriction mentioned above:

    Syncfusion
    Siberix
    Spire.PDF

    The following PDF generators are supported:

    SQL Reporting framework: requires the site to run in Basic or higher (note that this currently does not work in Functions apps in Consumption mode)
    EVOPDF: See http://www.evopdf.com/azure-html-to-pdf-converter.aspx for vendor solution
    Telerik reporting: requires the site to run in Basic or higher.
    Rotativa / wkhtmltopdf: requires the site to run in Basic or higher.
    NReco PdfGenerator (wkhtmltopdf): requires subscription plan Basic or higher

    Known issue for all PDF generators based on wkhtmltopdf or phantomjs: custom fonts are not rendered (system-installed font is used instead) because of sandbox GDI API limitations that are present even in VM-based Azure Apps plans (Basic or higher).

    Other scenarios that are not supported:

    PhantomJS/Selenium: tries to connect to local address, and also uses GDI+.

    There are some frameworks that do not leverage User32/GDI32 extensively (wkhtmltopdf, for example) and we are working on enabling these in Basic+ the same way we enabled SQL Reporting.

  3. Local Address Requests: Connection attempts to local addresses (e.g. localhost, 127.0.0.1) and the machine's own IP will fail, except if another process in the same sandbox has created a listening socket on the destination port.

The solution is to deploy your application on an Azure VM instead of a Web App.

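Posting my own answer because the other answer pointed me in the right direction, but I found a different solution. The problem appears to be caused by writing to a protected area within the sandbox (anything in D:\home). Running PhantomJS from Path.GetTempPath() and writing the files there seems to have resolved the problem completely.

This doesn't explain exactly what was going on, but at least the problem is solved.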
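
For anyone wanting to try the same workaround, here is a minimal sketch of what "running from the temp path" could look like. It is not my exact code: the class and method names (PhantomTempWorkspace, CreateWorkDir, CopyInto, BuildStartInfo) are made up for illustration, and copying phantomjs.exe and the script into the temp folder is only one way to read "running from Path.GetTempPath()"; using the temp folder just as the working and output directory may be enough.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper: keeps everything PhantomJS touches under the temp path
// instead of under D:\home.
public static class PhantomTempWorkspace
{
    // Creates a unique scratch directory below Path.GetTempPath().
    public static string CreateWorkDir()
    {
        var dir = Path.Combine(Path.GetTempPath(),
                               "phantomjs-" + Guid.NewGuid().ToString("N"));
        Directory.CreateDirectory(dir);
        return dir;
    }

    // Copies the given files (e.g. phantomjs.exe and the render script)
    // into the scratch directory, keeping their file names.
    // Assumption: copying the binaries is optional and may not be needed.
    public static void CopyInto(string workDir, params string[] sourceFiles)
    {
        foreach (var source in sourceFiles)
        {
            var target = Path.Combine(workDir, Path.GetFileName(source));
            File.Copy(source, target, overwrite: true);
        }
    }

    // Builds a ProcessStartInfo whose executable and working directory
    // both live in the scratch directory, so relative output paths land there.
    public static ProcessStartInfo BuildStartInfo(string workDir, string exeName,
                                                  string arguments)
    {
        return new ProcessStartInfo(Path.Combine(workDir, exeName), arguments)
        {
            WorkingDirectory = workDir,
            UseShellExecute = false,
            CreateNoWindow = true,
        };
    }
}
```

With something like this, the directory returned by CreateWorkDir() would take the place of filesPath in RenderPdfAsync, and the generated PDF would be read back from Path.Combine(workDir, outputFile). Deleting the scratch directory once the bytes have been read (Directory.Delete(workDir, true)) keeps the temp folder from filling up.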