How to make FeatureUnion return a DataFrame
I currently have a pipeline with a lot of custom transformers:
from sklearn.pipeline import Pipeline

p = Pipeline([
    ("GetTimeFromDate", TimeTransformer("Date")),  # custom transformer that adds a "time" column
    ("GetZipFromAddress", ZipTransformer("Address")),  # custom transformer that adds a "zip" column
    ("GroupByTimeandZip", GroupByTransformer(["time", "zip"])),  # custom transformer that adds one-hot columns
])
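Each transformer takes a pandas DataFrame and returns the same DataFrame with one or more new columns. This actually works quite well, but how can I run the "GetTimeFromDate" and "GetZipFromAddress" steps in parallel?
I would like to use FeatureUnion: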
from sklearn.pipeline import FeatureUnion

f = FeatureUnion([
    ("GetTimeFromDate", TimeTransformer("Date")),  # custom transformer that adds a "time" column
    ("GetZipFromAddress", ZipTransformer("Address")),  # custom transformer that adds a "zip" column
])
p = Pipeline([
    ("FeatureUnionStep", f),
    ("GroupByTimeandZip", GroupByTransformer(["time", "zip"])),  # custom transformer that adds one-hot columns
])
The problem is that FeatureUnion returns a numpy.ndarray, but the "GroupByTimeandZip" step needs a DataFrame.
Is there a way to make FeatureUnion return a pandas DataFrame?
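To make the goal concrete, here is a minimal sketch of one possible approach: a small union-like step that hands the same input frame to every transformer and joins the columns each one adds with pd.concat, so the next step still receives a DataFrame. DataFrameUnion and AddColumn are hypothetical names, not part of scikit-learn, and AddColumn is only a toy stand-in for TimeTransformer and ZipTransformer, whose code is not shown here. Note that this sketch runs the transformers one after another; it does not reproduce the parallel execution that FeatureUnion offers through n_jobs.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class AddColumn(BaseEstimator, TransformerMixin):
    """Toy stand-in (hypothetical) for a custom transformer that adds one column."""

    def __init__(self, name, func):
        self.name = name
        self.func = func

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        X = X.copy()
        X[self.name] = self.func(X)
        return X


class DataFrameUnion(BaseEstimator, TransformerMixin):
    """Hypothetical helper: apply each transformer to the same input frame
    and return the input plus every newly added column as a DataFrame."""

    def __init__(self, transformer_list):
        self.transformer_list = transformer_list

    def fit(self, X, y=None):
        for _, transformer in self.transformer_list:
            transformer.fit(X, y)
        return self

    def transform(self, X):
        pieces = [X]
        for _, transformer in self.transformer_list:
            # Pass a copy so a transformer that mutates in place cannot affect the others.
            out = transformer.transform(X.copy())
            # Keep only the columns this transformer added to the input.
            new_cols = [c for c in out.columns if c not in X.columns]
            pieces.append(out[new_cols])
        return pd.concat(pieces, axis=1)


df = pd.DataFrame({
    "Date": ["2020-01-01 10:00", "2020-01-02 11:30"],
    "Address": ["1 Main St 12345", "2 Oak Ave 54321"],
})
union = DataFrameUnion([
    ("GetTimeFromDate", AddColumn("time", lambda d: pd.to_datetime(d["Date"]).dt.hour)),
    ("GetZipFromAddress", AddColumn("zip", lambda d: d["Address"].str[-5:])),
])
result = union.fit_transform(df)
print(type(result).__name__, list(result.columns))
```

Under these assumptions, a step like this could sit where "FeatureUnionStep" is in the pipeline above, and "GroupByTimeandZip" would then find the "time" and "zip" columns by name.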