Keep feature definitions in a dictionary and return the feature to the client
I have the following dictionary, which stores the feature definitions as strings.
features = {
"journey_email_been_sent_flag": "F.when(F.col('email_14days') > 0,F.lit(1)).otherwise(F.lit(0))",
"journey_opened_flag": "F.when(F.col('opened_14days') > 0, F.lit(1)).otherwise(F.lit(0))"
}
retrieved_features = {}
non_retrieved_features = {}
Alternatively, I keep the definitions themselves as Column expressions (F is pyspark.sql.functions).
features = {
"journey_email_been_sent_flag": F.when(F.col('email_14days') > 0,F.lit(1)).otherwise(F.lit(0)),
"journey_opened_flag": F.when(F.col('opened_14days') > 0, F.lit(1)).otherwise(F.lit(0))
}
Below is the code that retrieves the feature definitions:
def feature_extract(*featurenames):
    for featurename in featurenames:
        if featurename in features:
            print(f"{featurename} : {features[featurename]}")
            retrieved_features[featurename] = features[featurename]
        else:
            print('failure')
            non_retrieved_features[featurename] = "Not Found in the feature definition"
    # Returns the whole dict of retrieved features, not a single definition
    return retrieved_features
This is how I call the function to retrieve features:
feature_extract('journey_email_been_sent_flag','journey_opened_flag')
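However, it does not work when I try to use the retrieved feature. With the definitions themselves kept in the dictionary, I get the following result: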
Out[19]: {'journey_email_been_sent_flag': Column<b'CASE WHEN (email_14days > 0) THEN 1 ELSE 0 END'>}
When I call the feature retrieval on a DataFrame like this:
.withColumn('journey_email_been_sent_flag', feature_extract('journey_email_been_sent_flag'))
I get the following error:
AssertionError: col should be Column
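The error is consistent with what the function returns: withColumn expects a single Column as its second argument, while feature_extract returns the whole retrieved_features dictionary. The following minimal sketch reproduces that shape without Spark; the string value is only a stand-in for the real Column expression (an assumption made so the snippet runs on its own).

```python
# Stand-in value: in the real code this would be a pyspark Column expression.
features = {
    "journey_email_been_sent_flag": "CASE WHEN (email_14days > 0) THEN 1 ELSE 0 END",
}
retrieved_features = {}


def feature_extract(*featurenames):
    # Same logic as in the question, without the print calls.
    for featurename in featurenames:
        if featurename in features:
            retrieved_features[featurename] = features[featurename]
    return retrieved_features


result = feature_extract("journey_email_been_sent_flag")

# The return value is a dict keyed by feature name, not one definition.
print(type(result).__name__)

# The single definition has to be looked up by name before it can be
# passed to something that expects one expression.
single = result["journey_email_been_sent_flag"]
print(single)
```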
I was able to fix it in the following way.
I keep the feature definitions as expressions:
features = {
"journey_email_been_sent_flag": F.when(F.col('email_14days') > 0,F.lit(1)).otherwise(F.lit(0)),
"journey_opened_flag": F.when(F.col('opened_14days') > 0, F.lit(1)).otherwise(F.lit(0))
}
and wrap the call to the feature_extract function in F.lit, taking the single feature out of the returned dictionary with .get:
F.lit(feature_extract('journey_email_been_sent_flag').get('journey_email_been_sent_flag'))
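As far as I can tell, F.lit returns its argument unchanged when it is already a Column, so the part of this fix that matters is probably the .get(...) lookup rather than F.lit. A possible alternative, sketched below, is a helper that returns one definition directly and avoids the module-level result dictionaries. The names get_feature and extract_features are hypothetical, and the string values are stand-ins for the real Column expressions so the snippet runs without Spark.

```python
# Stand-in values: in the real code these would be pyspark Column expressions.
features = {
    "journey_email_been_sent_flag": "CASE WHEN (email_14days > 0) THEN 1 ELSE 0 END",
    "journey_opened_flag": "CASE WHEN (opened_14days > 0) THEN 1 ELSE 0 END",
}


def get_feature(features, featurename):
    """Return a single feature definition, or raise KeyError if it is unknown."""
    if featurename not in features:
        raise KeyError(f"{featurename} not found in the feature definitions")
    return features[featurename]


def extract_features(features, *featurenames):
    """Return (found, missing) without mutating any global state."""
    found = {name: features[name] for name in featurenames if name in features}
    missing = [name for name in featurenames if name not in features]
    return found, missing


one = get_feature(features, "journey_email_been_sent_flag")
found, missing = extract_features(
    features, "journey_opened_flag", "not_a_feature"
)
print(one)
print(found, missing)
```

With real Column values in the dictionary, the call would then read .withColumn('journey_email_been_sent_flag', get_feature(features, 'journey_email_been_sent_flag')).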