Upload object to AWS S3 without creating a file using aws-sdk-go
I am trying to upload an object to AWS S3 using the Go SDK without creating a file on my system (trying to upload only a string), but I am having a hard time doing it. Can anybody give me an example of how to upload to AWS S3 without needing to create a file?

AWS's example of how to upload a file:
// Uploads a file to an S3 bucket in the region configured in the shared config
// or AWS_REGION environment variable.
//
// Usage:
//     go run s3_upload_object.go BUCKET_NAME FILENAME
func main() {
    if len(os.Args) != 3 {
        exitErrorf("bucket and file name required\nUsage: %s bucket_name filename",
            os.Args[0])
    }

    bucket := os.Args[1]
    filename := os.Args[2]

    file, err := os.Open(filename)
    if err != nil {
        exitErrorf("Unable to open file %q, %v", filename, err)
    }
    defer file.Close()

    // Initialize a session in us-west-2 that the SDK will use to load
    // credentials from the shared credentials file ~/.aws/credentials.
    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-west-2")},
    )
    if err != nil {
        exitErrorf("Unable to create session, %v", err)
    }

    // Set up the S3 Upload Manager. Also see the SDK doc for the Upload Manager
    // for more information on configuring part size and concurrency.
    //
    // http://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/#NewUploader
    uploader := s3manager.NewUploader(sess)

    // Upload the file's body to the S3 bucket as an object with the key being
    // the same as the filename.
    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucket),

        // Can also use the `filepath` standard library package to modify the
        // filename as needed for an S3 object key, such as turning an absolute
        // path into a relative path.
        Key: aws.String(filename),

        // The file to be uploaded. io.ReadSeeker is preferred as the Uploader
        // will be able to optimize memory when uploading large content. io.Reader
        // is supported, but will require buffering of the reader's bytes for
        // each part.
        Body: file,
    })
    if err != nil {
        // Print the error and exit.
        exitErrorf("Unable to upload %q to %q, %v", filename, bucket, err)
    }

    fmt.Printf("Successfully uploaded %q to %q\n", filename, bucket)
}
I have already tried creating the file programmatically, but that creates the file on my system and then uploads it to S3.
The Body field of the UploadInput struct is just an io.Reader. So pass any io.Reader you want; it doesn't need to be a file.
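For example, a minimal, self-contained sketch that uploads a plain string straight from memory (the region, bucket, and key below are placeholders):

package main

import (
    "fmt"
    "log"
    "strings"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    sess, err := session.NewSession(&aws.Config{Region: aws.String("us-east-1")})
    if err != nil {
        log.Fatalf("unable to create session, %v", err)
    }
    uploader := s3manager.NewUploader(sess)

    // strings.NewReader satisfies io.Reader, so no file is ever created.
    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String("my-bucket"), // placeholder bucket
        Key:    aws.String("hello.txt"), // placeholder key
        Body:   strings.NewReader("Hello from memory!"),
    })
    if err != nil {
        log.Fatalf("upload failed: %v", err)
    }
    fmt.Println("uploaded without touching disk")
}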
In this answer I will post everything that worked for me related to this question. Many thanks to @ThunderCat and @Flimzy for alerting me that the body parameter of the upload request is already an io.Reader. I will post some sample code commenting on what I learned from this question and how it helped me solve the problem. Maybe it will help others like me and @AlokKumarSingh.

Case 1: You already have the data in memory (for example, because you received it from a streaming/messaging service such as Kafka, Kinesis, or SQS):
func main() {
    if len(os.Args) != 3 {
        fmt.Printf(
            "bucket and file name required\nUsage: %s bucket_name filename",
            os.Args[0],
        )
        os.Exit(1)
    }

    bucket := os.Args[1]
    filename := os.Args[2]

    // This is the data you have in memory. In this example it is hard coded,
    // but it may come from very distinct sources, like streaming services
    // for example.
    data := "Hello, world!"

    // Create a reader from the data in memory.
    reader := strings.NewReader(data)

    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-east-1")},
    )
    if err != nil {
        fmt.Printf("Unable to create session, %v", err)
        os.Exit(1)
    }

    uploader := s3manager.NewUploader(sess)

    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(filename),
        // Here you pass your reader.
        // The AWS SDK will manage all the memory and reading for you.
        Body: reader,
    })
    if err != nil {
        fmt.Printf("Unable to upload %q to %q, %v", filename, bucket, err)
        os.Exit(1)
    }

    fmt.Printf("Successfully uploaded %q to %q\n", filename, bucket)
}
Case 2: You already have a persisted file and you want to upload it without keeping the whole file in memory:
func main() {
    if len(os.Args) != 3 {
        fmt.Printf(
            "bucket and file name required\nUsage: %s bucket_name filename",
            os.Args[0],
        )
        os.Exit(1)
    }

    bucket := os.Args[1]
    filename := os.Args[2]

    // Open your file. The trick here is that os.Open just returns a reader
    // for the desired file, so you will not keep the whole file in memory.
    // I know this might sound obvious, but for a starter (as I was at the
    // time of the question) it is not.
    fileReader, err := os.Open(filename)
    if err != nil {
        fmt.Printf("Unable to open file %q, %v", filename, err)
        os.Exit(1)
    }
    defer fileReader.Close()

    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-east-1")},
    )
    if err != nil {
        fmt.Printf("Unable to create session, %v", err)
        os.Exit(1)
    }

    uploader := s3manager.NewUploader(sess)

    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(filename),
        // Here you pass your reader.
        // The AWS SDK will manage all the memory and file reading for you.
        Body: fileReader,
    })
    if err != nil {
        fmt.Printf("Unable to upload %q to %q, %v", filename, bucket, err)
        os.Exit(1)
    }

    fmt.Printf("Successfully uploaded %q to %q\n", filename, bucket)
}
Case 3: This is how I implemented it in the final version of my system, but to understand why I did it this way I have to give you some background.

My use case evolved a bit. The upload code was going to be a function in Lambda, and the files turned out to be huge. What this change means: if I uploaded the file through an entry point in API Gateway attached to a Lambda function, I would have to wait for the whole file to finish uploading inside Lambda. Since Lambda is priced by invocation duration and memory usage, this could be a really big problem.

So, to solve this problem, I used a pre-signed POST URL for the upload. How does this affect the architecture/workflow?

Instead of uploading to S3 from my backend code, I just create and authenticate a URL for POSTing the object to S3 in the backend, and then send this URL to the frontend. With that, I implemented a multipart upload against that URL. I know this is much more specific than the question, but this solution was not easy to discover, so I think it is better to document it here for others.
Here is an example of how to create a pre-signed POST URL in nodejs:
const AWS = require('aws-sdk');

module.exports.upload = async (event, context, callback) => {
    const s3 = new AWS.S3({ signatureVersion: 'v4' });
    const body = JSON.parse(event.body);
    const params = {
        Bucket: process.env.FILES_BUCKET_NAME,
        Fields: {
            key: body.filename,
        },
        Expires: 60 * 60
    };
    let promise = new Promise((resolve, reject) => {
        s3.createPresignedPost(params, (err, data) => {
            if (err) {
                reject(err);
            } else {
                resolve(data);
            }
        });
    });
    return await promise
        .then((data) => {
            return {
                statusCode: 200,
                body: JSON.stringify({
                    message: 'Successfully created a pre-signed post url.',
                    data: data,
                })
            };
        })
        .catch((err) => {
            return {
                statusCode: 400,
                body: JSON.stringify({
                    message: 'An error occurred while trying to create a pre-signed post url',
                    error: err,
                })
            };
        });
};
If you want to do the same thing in Go, the idea is the same; you just have to change the SDK.
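The Go SDK (v1) does not ship a direct equivalent of createPresignedPost, but it does support pre-signed requests. A minimal sketch using a pre-signed PUT instead (the region, bucket, key, and expiry below are assumptions):

package main

import (
    "fmt"
    "log"
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func main() {
    sess, err := session.NewSession(&aws.Config{Region: aws.String("us-east-1")})
    if err != nil {
        log.Fatalf("Unable to create session, %v", err)
    }
    svc := s3.New(sess)

    // Build the PutObject request but do not send it; presign it instead.
    req, _ := svc.PutObjectRequest(&s3.PutObjectInput{
        Bucket: aws.String("my-bucket"), // placeholder bucket
        Key:    aws.String("my-key"),    // placeholder key
    })
    urlStr, err := req.Presign(15 * time.Minute) // assumed expiry
    if err != nil {
        log.Fatalf("Failed to presign request, %v", err)
    }

    // Hand this URL to the frontend; it can PUT the object body directly to S3.
    fmt.Println(urlStr)
}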
Here is what I ended up writing:
func (s *S3Sink) upload() {
    now := time.Now()
    key := s.getNewKey(now)

    _, err := s.uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String(s.bucket),
        Key:    aws.String(key),
        Body:   s.bodyBuf,
    })
    if err != nil {
        glog.Errorf("Error uploading %s to s3, %v", key, err)
        return
    }

    glog.Infof("Uploaded at %s", key)
    s.lastUploadTimestamp = now.UnixNano()
    // Drop the buffered data that was just uploaded.
    s.bodyBuf.Truncate(0)
}
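For context, the surrounding type is not shown above; here is a minimal sketch of what it might look like, inferred from the method (the field set and the getNewKey key scheme are hypothetical):

// Inferred sketch; field names beyond those used in upload() are assumptions.
type S3Sink struct {
    uploader            *s3manager.Uploader
    bucket              string
    bodyBuf             *bytes.Buffer // data accumulated in memory between uploads
    lastUploadTimestamp int64
}

// Hypothetical key scheme: one object per flush, named by timestamp.
func (s *S3Sink) getNewKey(t time.Time) string {
    return fmt.Sprintf("sink/%d", t.UnixNano())
}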
Here is a small implementation I wrote that uses a pipe and includes a timeout.
package example

import (
    "context"
    "fmt"
    "io"
    "sync"
    "time"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func FileWriter(ctx context.Context, uploader *s3manager.Uploader, wg *sync.WaitGroup, bucket string, key string, timeout time.Duration) (writer *io.PipeWriter) {
    // create a per-file flush timeout
    fileCtx, cancel := context.WithTimeout(ctx, timeout)

    // pipes are open until one end is closed
    pr, pw := io.Pipe()

    wg.Add(1)
    go func() {
        params := &s3manager.UploadInput{
            Bucket: aws.String(bucket),
            Key:    aws.String(key),
            Body:   pr,
        }

        // blocking
        _, err := uploader.Upload(params)
        if err != nil {
            fmt.Printf("Unable to upload, %v. Bucket: %s", err, bucket)
        }

        // always call context cancel functions!
        cancel()
        wg.Done()
    }()

    // when the context is cancelled, close the pipe so the upload sees EOF
    go func() {
        <-fileCtx.Done()
        // should check fileCtx.Err() here
        if err := pw.Close(); err != nil {
            fmt.Printf("Unable to close pipe: %v", err)
        }
    }()

    return pw
}
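To use it, write to the returned pipe writer and close it to signal EOF; closing is what lets the upload complete before the timeout fires. A usage sketch with placeholder bucket and key names:

// Assumes the imports above plus "github.com/aws/aws-sdk-go/aws/session".
func ExampleFileWriter() {
    sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))
    uploader := s3manager.NewUploader(sess)

    var wg sync.WaitGroup
    w := FileWriter(context.Background(), uploader, &wg, "my-bucket", "my-key", 5*time.Minute)

    // Everything written here is streamed to S3; nothing touches disk.
    if _, err := io.WriteString(w, "streamed without a file\n"); err != nil {
        fmt.Printf("write failed: %v", err)
    }

    // Closing the writer signals EOF to the uploader; then wait for it to finish.
    _ = w.Close()
    wg.Wait()
}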