Pass Step Function variable to AWS Glue Job Not Working
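I am trying to pass an AWS Step Functions variable to a Glue job parameter, similar to:
However, this is not working for me. The Glue job error message shows that the job receives the name of the variable that was passed, not the variable's actual value. Here is my Step Function code: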
{
  "Comment": "Converts CSV files to parquet for a date range.",
  "StartAt": "ConfigureCount",
  "States": {
    "ConfigureCount": {
      "Type": "Pass",
      "Result": {
        "start": 201601,
        "end": 201602,
        "index": 201601
      },
      "ResultPath": "$.iterator",
      "Next": "Iterator"
    },
    "Iterator": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:eu-west-1:123456789:function:date-iterator",
      "ResultPath": "$.iterator",
      "Next": "IsCountReached"
    },
    "IsCountReached": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.iterator.continue",
          "BooleanEquals": true,
          "Next": "ConvertToParquet"
        }
      ],
      "OutputPath": "$.iterator",
      "Default": "Done"
    },
    "ConvertToParquet": {
      "Comment": "Your application logic, to run a specific number of times",
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName": "convert-to-parquet",
        "Arguments": {
          "--DATE_RANGE": "$.iterator.index"
        }
      },
      "ResultPath": "$.iterator.index",
      "Next": "Iterator"
    },
    "Done": {
      "Type": "Pass",
      "End": true
    }
  }
}
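The "Iterator" step calls a Lambda named "date-iterator", which returns JSON similar to the following:
{
  "start": "201601",
  "end": "201602",
  "index": "201601"
}
This is based on this article, so that I can loop over the values: Iterating a Loop Using Lambda
My Step Function fails with a message saying that "$.iterator.index" is not a valid date.
How do I pass the value, rather than the variable name?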
From the Amazon States Language specification (https://states-language.net/spec.html):
If any field within the Payload Template (however deeply nested) has a name ending with the characters ".$", its value is transformed according to rules below and the field is renamed to strip the ".$" suffix.
Based on that, adding the .$ suffix to the argument name should solve your problem:
"Parameters": {
  "JobName": "convert-to-parquet",
  "Arguments": {
    "--DATE_RANGE.$": "$.iterator.index"
  }
},
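To make the difference concrete, here is a rough Python sketch of the renaming rule quoted above. It is only an illustration, not how Step Functions is actually implemented: `lookup` and `resolve_parameters` are hypothetical helper names, and the path handling covers only simple dotted paths such as `$.iterator.index` (no full JSONPath, context object, or intrinsic functions). The sample input is the Lambda output from the question.

```python
from typing import Any


def lookup(path: str, data: Any) -> Any:
    """Resolve a simple dotted path such as "$.iterator.index" against data."""
    if path == "$":
        return data
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path!r}")
    value = data
    for key in path[2:].split("."):
        value = value[key]
    return value


def resolve_parameters(template: Any, state_input: Any) -> Any:
    """Copy template, replacing each "name.$" field with the value its path selects."""
    if isinstance(template, dict):
        resolved = {}
        for name, value in template.items():
            if name.endswith(".$"):
                # Strip the ".$" suffix and substitute the referenced value.
                resolved[name[:-2]] = lookup(value, state_input)
            else:
                # Plain fields are passed through; nested objects are still scanned.
                resolved[name] = resolve_parameters(value, state_input)
        return resolved
    if isinstance(template, list):
        return [resolve_parameters(item, state_input) for item in template]
    return template


# State input as it would look after the "Iterator" step (values from the question).
state_input = {"iterator": {"start": "201601", "end": "201602", "index": "201601"}}

without_suffix = {
    "JobName": "convert-to-parquet",
    "Arguments": {"--DATE_RANGE": "$.iterator.index"},
}
with_suffix = {
    "JobName": "convert-to-parquet",
    "Arguments": {"--DATE_RANGE.$": "$.iterator.index"},
}

print(resolve_parameters(without_suffix, state_input)["Arguments"])
# {'--DATE_RANGE': '$.iterator.index'}  (the literal string reaches the job)
print(resolve_parameters(with_suffix, state_input)["Arguments"])
# {'--DATE_RANGE': '201601'}  (the value from the state input reaches the job)
```

Without the suffix, the string "$.iterator.index" is treated as a literal, which matches the "not a valid date" error in the question.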