How to use get_offline_features() in mlrun.feature_store?
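I'm trying to retrieve features from an existing feature store.
The documentation at https://docs.mlrun.org/en/latest/api/mlrun.feature_store.html says you can pass either a feature vector URI or a FeatureVector object to mlrun.feature_store.get_offline_features().
What is the URI for a feature store?
Where can I find an example?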
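In MLRun, a feature set is a group of features that are ingested together. A feature vector is a selection of features from one or more feature sets (a few columns here, a few columns there, and so on). This works well for joining multiple data sources together on a common entity/key.
A full example of creating and querying a feature set in MLRun can be found below: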
import mlrun.feature_store as fs
from mlrun import set_environment
import pandas as pd
# Set project - for retrieving features later
set_environment(project="my-project")
# Feature set to ingest
df = pd.DataFrame({
    "key": [0, 1, 2, 3],
    "value": ["A", "B", "C", "D"],
})
# Create feature set with desired name and entity/key
fset = fs.FeatureSet("my-feature-set", entities=[fs.Entity("key")])
# Ingest
fs.ingest(featureset=fset, source=df)
# Create feature vector (allows for joining multiple feature sets together)
features = ["my-feature-set.*"]  # can also list features explicitly, e.g. ["my-feature-set.value"]
vector = fs.FeatureVector("my-feature-vector", features)
# Retrieve offline features (vector object)
fs.get_offline_features(vector)
# Retrieve offline features (project + name)
fs.get_offline_features("my-project/my-feature-vector")
# Retrieve offline features as pandas dataframe
fs.get_offline_features("my-project/my-feature-vector").to_dataframe()
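On the URI question specifically: the string passed above is just the project name and the feature vector name joined by a slash. The small helper below is a hypothetical convenience function (it is not part of MLRun) that builds that string. The optional tag suffix and the store://feature-vectors/ prefix are assumptions based on my reading of the MLRun docs, so check them against your MLRun version before relying on them.

```python
from typing import Optional


def feature_vector_uri(
    project: str,
    name: str,
    tag: Optional[str] = None,
    store_prefix: bool = False,
) -> str:
    """Build a feature vector URI string.

    Hypothetical helper, not an MLRun function. The "project/name" form is
    the one used in the example above; the ":tag" suffix and the
    "store://feature-vectors/" prefix are assumptions to verify.
    """
    if not project or not name:
        raise ValueError("project and name must be non-empty")
    uri = f"{project}/{name}"
    if tag:
        uri = f"{uri}:{tag}"
    if store_prefix:
        uri = f"store://feature-vectors/{uri}"
    return uri


# Short form, as passed to get_offline_features() in the example above
print(feature_vector_uri("my-project", "my-feature-vector"))
```

The returned string can then be passed to fs.get_offline_features() in place of the hard-coded literal.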
You can find more feature store examples in the documentation here: https://docs.mlrun.org/en/latest/feature-store/feature-store.html
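If the idea of a feature vector as "a few columns here, a few columns there" joined on a common key is still abstract, here is a plain-Python sketch of the concept. It does not use MLRun and is not how MLRun implements the join. The function name, the second feature set "other-set", and its "score" column are all made up for illustration.

```python
def join_feature_sets(feature_sets, selections, entity="key"):
    """Join selected columns from several feature sets on a shared entity.

    feature_sets: dict mapping a set name to a list of row dicts
    selections:   list of "set_name.column" or "set_name.*" strings
    Conceptual illustration only, not MLRun's implementation.
    """
    joined = {}
    for selection in selections:
        set_name, column = selection.split(".", 1)
        for row in feature_sets[set_name]:
            # One output row per entity value, created on first sight
            target = joined.setdefault(row[entity], {entity: row[entity]})
            for col, val in row.items():
                if col == entity:
                    continue
                if column == "*" or col == column:
                    target[col] = val
    return [joined[k] for k in sorted(joined)]


feature_sets = {
    # Same data as the DataFrame ingested in the example above
    "my-feature-set": [
        {"key": 0, "value": "A"},
        {"key": 1, "value": "B"},
        {"key": 2, "value": "C"},
        {"key": 3, "value": "D"},
    ],
    # Hypothetical second feature set sharing the same entity
    "other-set": [
        {"key": 0, "score": 10},
        {"key": 1, "score": 20},
        {"key": 2, "score": 30},
        {"key": 3, "score": 40},
    ],
}

rows = join_feature_sets(feature_sets, ["my-feature-set.*", "other-set.score"])
for row in rows:
    print(row)
```

Each output row carries the entity plus the selected columns from both sets, which is roughly the shape of result you would expect back from a feature vector that spans more than one feature set.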