Parameterized SQL query in Azure ML
Background: there seems to be a way to parameterize a DataPath with PipelineParameter:
https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-showcasing-datapath-and-pipelineparameter.ipynb
But I want to parameterize my SQL query with PipelineParameter instead. For example, given this query:
sql_query = """
SELECT id, foo, bar FROM baz
WHERE baz.id BETWEEN 10 AND 20
"""
dataset = Dataset.Tabular.from_sql_query((sql_datastore, sql_query))
I would like to use PipelineParameter to turn 10 and 20 into the parameters param_1 and param_2. Is this possible?
Found a way to do it.
Pass your parameters to a PythonScriptStep:
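from azureml.pipeline.core import PipelineParameter
from azureml.pipeline.steps import PythonScriptStep

param_1 = PipelineParameter(name='min_id', default_value=5)
param_2 = PipelineParameter(name='max_id', default_value=10)
# The datastore is passed by name and looked up again inside the script.
sql_datastore = "sql_datastore"
step = PythonScriptStep(script_name='script.py',
                        arguments=[param_1, param_2, sql_datastore])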
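Then, in script.py:
import sys
from azureml.core import Dataset, Datastore, Run

# Pipeline parameters arrive as command-line strings; cast them to int
# so that only numeric values are formatted into the SQL text.
min_id_param = int(sys.argv[1])
max_id_param = int(sys.argv[2])
sql_datastore_name = sys.argv[3]
query = """
SELECT id, foo, bar FROM baz
WHERE baz.id BETWEEN {} AND {}
""".format(min_id_param, max_id_param)
run = Run.get_context()
sql_datastore = Datastore.get(run.experiment.workspace, sql_datastore_name)
dataset = Dataset.Tabular.from_sql_query((sql_datastore, query))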
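Because the parameter values end up inside the SQL text itself, it may be worth validating them in one place before the query is built. The following is only a minimal sketch of that idea, not part of the Azure ML SDK: `build_query` is a hypothetical helper name, and it assumes both IDs are plain integers, as in the example query.

```python
def build_query(min_id, max_id):
    """Return the example query with an integer ID range filled in.

    `build_query` is a hypothetical helper for illustration. Arguments may be
    strings (as they are when read from sys.argv) or ints.
    """
    # int() raises ValueError for anything that is not an integer literal,
    # so arbitrary SQL fragments cannot reach the query text.
    low, high = int(min_id), int(max_id)
    if low > high:
        raise ValueError("min_id must not be greater than max_id")
    return (
        "SELECT id, foo, bar FROM baz\n"
        f"WHERE baz.id BETWEEN {low} AND {high}"
    )
```

Inside script.py this could be called as `query = build_query(sys.argv[1], sys.argv[2])`, with the result passed to `Dataset.Tabular.from_sql_query` in the same way as before.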