Serve index file instead of download prompt

Question

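I host my website on S3 with CloudFront as a CDN, and I need these two URLs to behave the same way and serve the index.html file within the directory:

example.com/directory
example.com/directory/

The one with the / at the end incorrectly prompts the browser to download a zero-byte file with a random hash as its name. The one without the slash returns my 404 page.

How can I get both paths to deliver the index.html file within the directory?

If there is a way I'm "supposed" to do this, great! That's what I'm hoping for, but if not, I'll probably try using Lambda@Edge to do a redirect. I need that for some other scenarios anyway, so some instructions on how to do a 301 or 302 redirect from Lambda@Edge would be helpful too :)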

Update (per John Hanley's comment)

curl -i https://www.example.com/directory/

HTTP/2 200 
content-type: application/x-directory
content-length: 0
date: Sat, 12 Jan 2019 22:07:47 GMT
last-modified: Wed, 31 Jan 2018 00:44:16 GMT
etag: "[id]"
accept-ranges: bytes
server: AmazonS3
x-cache: Miss from cloudfront
via: 1.1 [id].cloudfront.net (CloudFront)
x-amz-cf-id: [id]

Update

CloudFront has one behavior set up, which redirects http to https and sends the request to S3. It also has a 404 error route under the Errors tab.

Answer 1

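This type of behavior is usually controlled or caused by your HTTP(S) header data, specifically the Content-Type that your client receives.

Inspect the headers and try adjusting what your server returns. This should lead you to your solution.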

In Chrome, visit a URL, right click, select Inspect to open the developer tools.

Select Network tab.

Reload the page, select any HTTP request on the left panel, and the HTTP headers will be displayed on the right panel.
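
As a rough illustration of that check outside the browser, here is a minimal Node.js sketch that parses a raw header block (such as the curl -i output in the question) and reports whether the Content-Type is one a browser would normally render as a page. The function names and the short list of renderable types are assumptions made for this example; they are not part of any browser or AWS API.

```javascript
'use strict';

// Hypothetical helper: turn raw "name: value" lines into an object with
// lowercase header names. Lines without a colon (e.g. the status line) are skipped.
function parseHeaders(raw) {
    const headers = {};
    raw.split(/\r?\n/).forEach((line) => {
        const idx = line.indexOf(':');
        if (idx > 0) {
            headers[line.slice(0, idx).trim().toLowerCase()] = line.slice(idx + 1).trim();
        }
    });
    return headers;
}

// Assumption: a short, non-exhaustive list of types a browser displays inline as a page.
const RENDERABLE_TYPES = ['text/html', 'text/plain', 'application/xhtml+xml'];

function likelyRendersAsPage(headers) {
    // Drop any "; charset=..." parameter before comparing.
    const type = (headers['content-type'] || '').split(';')[0].trim().toLowerCase();
    return RENDERABLE_TYPES.includes(type);
}

// A few headers taken from the curl output in the question.
const sample = [
    'HTTP/2 200',
    'content-type: application/x-directory',
    'content-length: 0',
    'server: AmazonS3',
].join('\n');

const parsed = parseHeaders(sample);
console.log(parsed['content-type'], likelyRendersAsPage(parsed));
```

For the response in the question this reports application/x-directory as not renderable, which matches the download prompt the browser shows.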

Answer 2

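S3 only provides automatic index documents when you enable and use the bucket's website hosting features, by pointing to the bucket's website hosting endpoint, ${bucket}.s3-website.${region}.amazonaws.com, rather than the generic REST endpoint for the bucket, ${bucket}.s3.amazonaws.com.

The website endpoints and the REST endpoints have numerous differences, including this one.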
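
To check which kind of endpoint a CloudFront origin is pointing at, a small classifier along these lines may help. It is only a sketch: the function name is made up for this example, and the patterns are assumptions based on the hostname shapes mentioned above (the website endpoint uses either a dot or a dash before the region, depending on the region).

```javascript
'use strict';

// Hypothetical helper: classify an S3 origin hostname by its shape.
// Returns 'website', 'rest', or 'unknown'.
function classifyS3Endpoint(hostname) {
    // Website endpoints: <bucket>.s3-website.<region>... or <bucket>.s3-website-<region>...
    if (/\.s3-website[.-][a-z0-9-]+\.amazonaws\.com$/.test(hostname)) {
        return 'website';
    }
    // REST endpoints: <bucket>.s3.amazonaws.com, optionally with a region part.
    if (/\.s3([.-][a-z0-9-]+)?\.amazonaws\.com$/.test(hostname)) {
        return 'rest';
    }
    return 'unknown';
}

// Bucket name and regions below are placeholders.
[
    'my-bucket.s3-website.us-east-2.amazonaws.com',
    'my-bucket.s3-website-us-east-1.amazonaws.com',
    'my-bucket.s3.amazonaws.com',
    'my-bucket.s3.us-west-2.amazonaws.com',
].forEach((host) => console.log(host, '->', classifyS3Endpoint(host)));
```

If the origin classifies as 'rest', S3 will not add index.html for you and something in front of it has to.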

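The reason you are seeing these 0-byte files for object keys ending in / is that you are creating folder objects in the bucket, using the S3 console or another utility that actually creates 0-byte objects. They aren't needed once the folders have objects "in" them, but they are the only way to display an empty folder in the S3 console, which displays an object named foo/ as a folder named foo even if no other objects have a key prefix of foo/. It's part of the console's visual emulation of a folder hierarchy, even though objects in S3 are never really "in" folders.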
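
To make that concrete, here is a small sketch that picks the folder placeholder objects out of a bucket listing and tells you which ones are redundant. The listing is hard-coded sample data shaped like the { Key, Size } entries an S3 list call returns; the keys, sizes, and function names are invented for this example.

```javascript
'use strict';

// Hypothetical bucket listing; only Key and Size matter here.
const objects = [
    { Key: 'directory/', Size: 0 },            // console-created "folder" object
    { Key: 'directory/index.html', Size: 5120 },
    { Key: 'empty-folder/', Size: 0 },         // placeholder for a genuinely empty folder
    { Key: 'index.html', Size: 2048 },
];

// A placeholder is a zero-byte object whose key ends in '/'.
function findFolderPlaceholders(list) {
    return list
        .filter((o) => o.Key.endsWith('/') && o.Size === 0)
        .map((o) => o.Key);
}

// The placeholder is redundant once some other key shares its prefix.
function isRedundantPlaceholder(key, list) {
    return list.some((o) => o.Key !== key && o.Key.startsWith(key));
}

findFolderPlaceholders(objects).forEach((key) => {
    console.log(key, isRedundantPlaceholder(key, objects) ? 'redundant' : 'needed to show empty folder');
});
```

In this sample, directory/ is the object that gets served for example.com/directory/ through the REST endpoint, and it is redundant because directory/index.html already exists.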

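If for some reason you need to use the REST endpoint (for example, because you don't want to make the bucket public), then you need two Lambda@Edge triggers in CloudFront to emulate this functionality fairly closely.

An Origin Request trigger can inspect and modify the request after the CloudFront cache has been checked and before the request is sent to the origin. We use this to check for a path ending in / and, if we find one, append index.html.

An Origin Response trigger can inspect and potentially modify the response before it is written to the CloudFront cache. An Origin Response trigger can also inspect the original request that preceded the response. We use this to check whether the response is an error. If it is, we look at the original request: does it appear to be for an index document, or for a file? (Specifically, a "file" has at least one character after the final slash in the path, followed by a dot, followed by at least one more character. If the path looks like that, it's probably a "file".) If it is neither of those, we redirect to the original path plus a final / that we append.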
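
The decision described in the previous paragraph can be isolated into a single predicate and tried against a few paths. This is just a sketch of the heuristic for experimenting locally; the function name shouldRedirectWithSlash and the sample paths are made up for this example.

```javascript
'use strict';

const INDEX_DOCUMENT = 'index.html';

// Returns true when an error response should become a redirect to uri + '/'.
function shouldRedirectWithSlash(status, uri) {
    return /^40[34]$/.test(status) &&              // S3 answered 403 or 404
        !uri.endsWith('/' + INDEX_DOCUMENT) &&     // not a failed index document request
        !uri.endsWith('/') &&                      // not already a "directory" path
        !/[^\/]+\.[^\/]+$/.test(uri);              // last segment is not name.ext
}

// Sample (status, path) pairs; the paths are illustrative only.
[
    ['404', '/directory'],
    ['403', '/directory'],
    ['404', '/directory/'],
    ['404', '/directory/index.html'],
    ['404', '/images/logo.png'],
    ['200', '/directory'],
].forEach(([status, uri]) => {
    console.log(status, uri, '->', shouldRedirectWithSlash(status, uri));
});
```

Only the first two pairs produce a redirect; the rest are passed through with whatever S3 returned.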

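Origin Request and Origin Response triggers fire only on cache misses. When there is a cache hit, neither trigger fires, because they sit on the origin side of CloudFront, behind the cache. Requests that can be served from the cache are served from the cache, so the triggers are not invoked.

The following is a Lambda@Edge function written in Node.js 8.10. This one Lambda function changes its behavior depending on context, so that it acts as either an Origin Request or an Origin Response trigger. After publishing a version in Lambda, associate that version's ARN with the CloudFront cache behavior settings as both an Origin Request and an Origin Response trigger.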

'use strict';

// combination origin-request, origin-response trigger to emulate the S3
// website hosting index document functionality, while using the REST
// endpoint for the bucket

// 

const INDEX_DOCUMENT = 'index.html'; // do not prepend a slash to this value

const HTTP_REDIRECT_CODE = '302'; // or use 301 or another code if desired
const HTTP_REDIRECT_MESSAGE = 'Found'; 

exports.handler = (event, context, callback) => {
    const cf = event.Records[0].cf;

    if(cf.config.eventType === 'origin-request')
    {
        // if path ends with '/' then append INDEX_DOCUMENT before sending to S3
        if(cf.request.uri.endsWith('/'))
        {
            cf.request.uri = cf.request.uri + INDEX_DOCUMENT;
        }
        // return control to CloudFront, to send request to S3, whether or not
        // we modified it; if we did, the modified URI will be requested.
        return callback(null, cf.request);
    }
    else if(cf.config.eventType === 'origin-response')
    {
        // is the response 403 or 404?  If not, we will return it unchanged.
        if(cf.response.status.match(/^40[34]$/))
        {
            // it's an error.

            // we're handling a response, but Lambda@Edge can still see the attributes of the request that generated this response; so, we
            // check whether this is a page that should be redirected with a trailing slash appended.  If it doesn't look like an index
            // document request, already, and it doesn't end in a slash, and doesn't look like a filename with an extension... we'll try that.

            // This is essentially what the S3 web site endpoint does if you hit a nonexistent key, so that the browser requests
            // the index with the correct relative path, except that S3 checks whether it will actually work.  We are using heuristics,
            // rather than checking the bucket, but checking is an alternative.

            if(!cf.request.uri.endsWith('/' + INDEX_DOCUMENT) && // not a failed request for an index document
               !cf.request.uri.endsWith('/') && // unlikely, unless this code is modified to pass other things through on the request side
               !cf.request.uri.match(/[^\/]+\.[^\/]+$/)) // doesn't look like a filename  with an extension
            {
                // add the original error to the response headers, for reference/troubleshooting
                cf.response.headers['x-redirect-reason'] = [{ key: 'X-Redirect-Reason', value: cf.response.status + ' ' + cf.response.statusDescription }];
                // set the redirect code
                cf.response.status = HTTP_REDIRECT_CODE;
                cf.response.statusDescription = HTTP_REDIRECT_MESSAGE;
                // set the Location header with the modified URI
                // just append the '/', not the "index.html" -- the next request will trigger
                // this function again, and it will be added without appearing in the
                // browser's address bar.
                cf.response.headers['location'] = [{ key: 'Location', value: cf.request.uri + '/' }];
                // not strictly necessary, since browsers don't display it, but remove the response body with the S3 error XML in it
                cf.response.body = '';
            }
        }

        // return control to CloudFront, with either the original response, or
        // the modified response, if we modified it.

        return callback(null, cf.response);

    }
    else // this is not intended as a viewer-side trigger.  Throw an exception, visible only in the Lambda CloudWatch logs and a 502 to the browser.
    {
        return callback(`Lambda function is incorrectly configured; triggered on '${cf.config.eventType}' but expected 'origin-request' or 'origin-response'`);
    }

};

Answer 3

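The answers given are wrong. CloudFront has its own setting for making www.yourdomain.com/ serve a document. It's called the "default root object", and it can be found under the "General" tab of your CloudFront distribution. Here are the complete steps for getting an SSL/https-enabled custom domain + CloudFront + S3 bucket working.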

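Create a brand new S3 bucket with default (closed) permissions, or remove all public access from the target bucket.
Disable static website hosting. You don't need it.
If you haven't already, get your SSL certificate into Amazon so that you can attach it to the CloudFront distribution that will point to your S3 bucket.
Create a CloudFront distribution, using the certificate, that points to the target S3 bucket.
For the origin configuration, use the form www.yourdomain.com.s3.amazonaws.com as the origin, not the static website hosting URL (which should be disabled anyway).
Let the CloudFront configuration automatically change the S3 bucket access permissions ("restrict bucket access"). You want access to the bucket restricted to this CloudFront distribution (via a specific identity). Nobody should access your S3 bucket directly, especially since it can be served over http (no "s").
Under the CloudFront "General" tab (or during setup), set your default root object to "index.html" or whatever you use. Otherwise, requests to https://www.yourdomain.com/ will show permission denied.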
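
For anyone scripting this rather than clicking through the console, the settings from the steps above correspond roughly to the following fields of a CloudFront distribution config. This is a partial, illustrative fragment written as a plain JavaScript object, not a complete config that the API would accept; the origin Id and the access identity value are placeholders.

```javascript
'use strict';

// Partial sketch only: a real DistributionConfig requires more fields than shown.
const distributionConfigFragment = {
    // served when the request is for the root URL of the distribution
    DefaultRootObject: 'index.html',
    Origins: {
        Quantity: 1,
        Items: [{
            Id: 'S3-www.yourdomain.com', // placeholder origin id
            // REST endpoint form, not the website hosting endpoint
            DomainName: 'www.yourdomain.com.s3.amazonaws.com',
            S3OriginConfig: {
                // placeholder identity; this is what "restrict bucket access" sets up
                OriginAccessIdentity: 'origin-access-identity/cloudfront/EXAMPLEID',
            },
        }],
    },
    DefaultCacheBehavior: {
        TargetOriginId: 'S3-www.yourdomain.com',
        ViewerProtocolPolicy: 'redirect-to-https',
    },
};

console.log(JSON.stringify(distributionConfigFragment, null, 2));
```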

Answer 4

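AWS recently introduced CloudFront Functions, which can be used for this use case. Compared with Lambda@Edge, CloudFront Functions are cheaper, faster, and easier to implement and test.

Below is an example function that appends index.html to the request URI when the path being accessed does not include it.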

function handler(event) {
    var request = event.request;
    var uri = request.uri;
    
    // Check whether the URI is missing a file name.
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    } 
    // Check whether the URI is missing a file extension.
    else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }

    return request;
}

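This does not append index.html in the web browser's address bar, which gives cleaner URLs while browsing. In your case, https://www.example.com/directory/ stays as it is in the address bar, but the content of https://www.example.com/directory/index.html is rendered.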
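
One thing to be aware of: the function above looks for a dot anywhere in the URI, so a path such as /v1.2/docs would be passed through unchanged. If that matters for your site, a possible variant checks only the last path segment. This is a sketch of that idea, not the AWS sample; the loop at the bottom is just a local check in Node.js with made-up event objects and paths.

```javascript
// Variant sketch: decide "file or directory" from the last path segment only.
function handler(event) {
    var request = event.request;
    var uri = request.uri;
    // Everything after the final slash; empty when the URI ends in '/'.
    var lastSegment = uri.substring(uri.lastIndexOf('/') + 1);

    // URI ends in a slash: append the index document.
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    }
    // Last segment has no dot, so treat it as a directory name.
    else if (!lastSegment.includes('.')) {
        request.uri += '/index.html';
    }

    return request;
}

// Local check with hypothetical events; not needed when deployed to CloudFront.
['/directory', '/directory/', '/v1.2/docs', '/css/site.css'].forEach(function (path) {
    console.log(path, '->', handler({ request: { uri: path } }).uri);
});
```

The first three paths are rewritten to end in /index.html and /css/site.css is left alone.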

More examples can be found at https://github.com/aws-samples/amazon-cloudfront-functions/blob/main/url-rewrite-single-page-apps/index.js
