How to create a tf.feature_column by multiplying two other tf.feature_columns?
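TensorFlow already provides tf.feature_column.crossed_column for creating features across columns, but it is aimed mainly at categorical data. What about numeric data?
For example, suppose there are already two columns: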
age = tf.feature_column.numeric_column("age")
education_num = tf.feature_column.numeric_column("education_num")
If I want to create a third and a fourth feature column from age and education_num, like this:
my_feature = age * education_num
my_another_feature = age * age
How can this be done?
You can declare a custom numeric column and add the corresponding values to the DataFrame inside the input function:
# Existing features
age = tf.feature_column.numeric_column("age")
education_num = tf.feature_column.numeric_column("education_num")
# Declare a custom column just like other columns
my_feature = tf.feature_column.numeric_column("my_feature")
...
# Add to the list of features
feature_columns = { ... age, education_num, my_feature, ... }
...
def input_fn():
    df_data = pd.read_csv("input.csv")
    df_data = df_data.dropna(how="any", axis=0)
    # Manually update the dataframe
    df_data["my_feature"] = df_data["age"] * df_data["education_num"]
    return tf.estimator.inputs.pandas_input_fn(x=df_data,
                                               y=labels,
                                               batch_size=100,
                                               num_epochs=10)
...
model.train(input_fn=input_fn())
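The answer above only computes my_feature. A minimal sketch of the same idea covering both products from the question might look like the following. It uses only pandas, the helper name add_product_features is hypothetical, and the two toy rows stand in for input.csv, so treat it as an illustration under those assumptions rather than the answerer's exact code.

```python
import pandas as pd


def add_product_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the two product columns from the question added.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    return df.assign(
        my_feature=df["age"] * df["education_num"],   # age * education_num
        my_another_feature=df["age"] * df["age"],     # age * age
    )


# Toy rows standing in for input.csv (values are made up for illustration).
df_data = pd.DataFrame({"age": [25, 40], "education_num": [10, 13]})
df_data = add_product_features(df_data)
print(df_data)
```

Each derived column still needs its own declaration, for example tf.feature_column.numeric_column("my_another_feature"), so that the model picks it up by name from the DataFrame.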