AWS Glue SelectFields and Filter not taking dynamic values
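I wrote an AWS Glue script that uses the SelectFields() and Filter() methods for field selection and filtering. I have tested these with static values and they work fine; however, they do not work when I pass dynamic values in the same format. Any idea why the dynamic values are not accepted?
I also tested by passing just one of the dynamic values, and in that case both methods work.
Also note that the key passed (filterkey) works whether it is static or dynamic.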
wordstoFilter = ['USA', 'France']
columnstoSelect = ['cust_id', 'custname', 'state']
# Join all list values into one string, each value wrapped in single quotes and separated by commas
fltr_string = ', '.join(["'{}'".format(value) for value in wordstoFilter])
select_string = ', '.join(["'{}'".format(value) for value in columnstoSelect])
filterkey = "country"
#below statement works with static value
#country_filter_dyf = Filter.apply(frame=custData, f=(lambda x: x["country"] in ["USA"]))
country_filter_dyf = Filter.apply(frame=custData, f=(lambda x: x[filterkey] in [fltr_string]))
##Select case
#below statement works with static value
#selected_fields_dyf = SelectFields.apply(frame = custData, paths = ['cust_id', 'cust_name', 'state', 'country'])
#Below one doesn't work
selected_dyf = SelectFields.apply(frame = custData, paths = [select_string ])
As far as I can see, the paths parameter expects a list, but you are passing a str object:
>>> type(['cust_id', 'cust_name', 'state', 'country'])
<class 'list'>
>>> type(select_string)
<class 'str'>
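To make the difference concrete, the join from the question can be reproduced in plain Python, without any Glue dependency. This is only a small sketch of what ends up in paths when the joined string is wrapped in brackets:

```python
columnstoSelect = ['cust_id', 'custname', 'state']
select_string = ', '.join(["'{}'".format(value) for value in columnstoSelect])

# The join yields ONE string that contains literal quotes and commas.
print(select_string)             # 'cust_id', 'custname', 'state'

# Wrapping that string in brackets gives a one-element list,
# not a list of three column names.
paths = [select_string]
print(len(paths))                # 1
print(paths == columnstoSelect)  # False
```

So the call is effectively asking for a single column literally named `'cust_id', 'custname', 'state'`, which presumably does not exist in the frame.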
Have you tried passing the list directly?
>>> type(columnstoSelect)
<class 'list'>
columnstoSelect = ['cust_id', 'custname', 'state']
selected_dyf = SelectFields.apply(frame = custData, paths = columnstoSelect )
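The same reasoning seems to apply to the Filter case in the question: `[fltr_string]` is a one-element list holding the whole joined string, so the `in` test never matches a single country. The sketch below checks this in plain Python; the `records` list is hypothetical sample data standing in for the rows of custData, and the two helper functions are named only for illustration:

```python
wordstoFilter = ['USA', 'France']
filterkey = "country"
fltr_string = ', '.join(["'{}'".format(value) for value in wordstoFilter])

# Hypothetical sample rows standing in for DynamicFrame records.
records = [
    {"cust_id": 1, "country": "USA"},
    {"cust_id": 2, "country": "Germany"},
    {"cust_id": 3, "country": "France"},
]

def broken_predicate(x):
    # Compares against ["'USA', 'France'"], a single long string.
    return x[filterkey] in [fltr_string]

def fixed_predicate(x):
    # Compares against the original list of values.
    return x[filterkey] in wordstoFilter

broken_ids = [r["cust_id"] for r in records if broken_predicate(r)]
fixed_ids = [r["cust_id"] for r in records if fixed_predicate(r)]

print(broken_ids)  # []
print(fixed_ids)   # [1, 3]
```

In the Glue script, the corresponding change would presumably be to pass the list itself to the lambda, i.e. Filter.apply(frame=custData, f=lambda x: x[filterkey] in wordstoFilter), instead of building fltr_string at all.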