Declare a PySpark variable in Synapse and use it in a Kusto query

I want to declare a PySpark variable in Synapse and use that variable in a Kusto query.

The variables are declared in PySpark as follows:

s = "02-01-2022"
print(s)
e = "02-10-2022"
print(e)

I want to use the variables "s" and "e" in the Kusto query, as shown below:

%%pyspark

s = "02-01-2022" 
print(s)
e = "02-10-2022"
print(e)

# Read data from Azure Data Explorer table(s)
# Full Sample Code available at: https://github.com/Azure/azure-kusto-spark/blob/master/samples/src/main/python/SynapseSample.py

sales_data  = spark.read \
    .format("com.microsoft.kusto.spark.synapse.datasource") \
    .option("spark.synapse.linkedService", "LinkedServiceName") \
    .option("kustoDatabase", "DatabaseName") \
    .option("kustoQuery", "let starttime = startofday(todatetime('s')); let endtime = startofday(todatetime('e')); Table | where Time between (starttime .. endtime)  | summarize amount = count() by Date= bin(TIMESTAMP,5h) | project Date,amount | order by Date asc") \
    .load()

display(sales_data)

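Inside the query string, 's' and 'e' are just literal characters, so Kusto never sees the values of the Python variables. In PySpark you can concatenate the variables into the query string like this: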

option("kustoQuery", "let starttime = startofday(todatetime('" + s + "')); let endtime = startofday(todatetime('" + e + "')); Table | where Time between (starttime .. endtime)  | summarize amount = count() by Date= bin(TIMESTAMP,5h) | project Date,amount | order by Date asc")
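The same substitution can also be written with an f-string, which some find easier to read than repeated concatenation. The following is only a minimal sketch: the helper name build_kusto_query is hypothetical (it is not part of the connector), and the table and column names are copied unchanged from the question.

```python
def build_kusto_query(start: str, end: str) -> str:
    """Embed the two date strings into the Kusto query text as string literals.

    build_kusto_query is a hypothetical helper used for illustration only.
    """
    return (
        f"let starttime = startofday(todatetime('{start}')); "
        f"let endtime = startofday(todatetime('{end}')); "
        "Table | where Time between (starttime .. endtime) "
        "| summarize amount = count() by Date = bin(TIMESTAMP, 5h) "
        "| project Date, amount | order by Date asc"
    )


s = "02-01-2022"
e = "02-10-2022"

# Build the query once so it can be printed and checked before it is sent to Kusto.
query = build_kusto_query(s, e)
print(query)
```

The resulting string can then be passed to the reader in place of the inline literal, for example .option("kustoQuery", query). Printing the query first makes it easy to confirm that the dates were substituted as expected.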

Also, refer to the Azure Data Explorer (Kusto) connector for Apache Spark.