How do I print out a spark.sql object?
I have a spark.sql object that contains a couple of variables.
import com.github.nscala_time.time.Imports.LocalDate
val first_date = new LocalDate(2020, 4, 1)
val second_date = new LocalDate(2020, 4, 7)
val mydf = spark.sql(s"""
select *
from tempView
where timestamp between '{0}' and '{1}'
""".format(start_date.toString, end_date.toString))
I want to print out mydf, because when I run mydf.count the result is 0.
When I evaluate mydf, it comes back as mydf: org.apache.spark.sql.DataFrame = [column: type].
I also tried println(mydf), but it does not return the query.
There is this related question, but it has no answer.
How do I print the query?
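For background, a DataFrame prints only its schema ([column: type]); the original SQL string is not directly recoverable from it, but the plan Spark built can still be inspected. A minimal sketch, assuming the mydf defined above:

mydf.explain(true)                    // parsed, analyzed, optimized and physical plans
println(mydf.queryExecution.logical)  // just the logical plan

In this case the plan would still contain the literal '{0}' and '{1}' placeholders in the filter, which would explain the count of 0.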
The easiest way is to store the query in a variable and then print the variable to get the query.
- Use the same variable in spark.sql.
Example:
In Spark-scala:
val start_date = "2020-01-01"
val end_date = "2020-02-02"
val query = s"""select * from tempView where timestamp between '${start_date}' and '${end_date}'"""
println(query)
//select * from tempView where timestamp between '2020-01-01' and '2020-02-02'
spark.sql(query)
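Applied back to the question's code, a sketch could look like this (assuming the nscala_time LocalDate values from the question; their toString yields ISO dates such as 2020-04-01):

import com.github.nscala_time.time.Imports.LocalDate

val first_date = new LocalDate(2020, 4, 1)
val second_date = new LocalDate(2020, 4, 7)

// Interpolate the dates with s"..." so they are substituted before the
// string reaches spark.sql, then print the finished query.
val query = s"""
  select *
  from tempView
  where timestamp between '${first_date.toString}' and '${second_date.toString}'
"""
println(query)
val mydf = spark.sql(query)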
In PySpark:
start_date = "2020-01-01"
end_date = "2020-02-02"
query = """select * from tempView where timestamp between '{0}' and '{1}'""".format(start_date, end_date)
print(query)
#select * from tempView where timestamp between '2020-01-01' and '2020-02-02'
#use the same query in spark.sql
spark.sql(query)
In PySpark:
start_date = "2020-01-01"
end_date = "2020-02-02"
q = "select * from tempView where timestamp between '{0}' and '{1}'".format(start_date, end_date)
print(q)
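As a side note on why the original count was 0: Scala's .format (java.lang.String.format) only substitutes %-style specifiers, so the {0} and {1} placeholders are left untouched and the query ends up filtering on the literal strings '{0}' and '{1}'. A small sketch with hypothetical date strings shows this:

val start_date = "2020-04-01"
val end_date = "2020-04-07"
// No %-specifiers in the template, so the arguments are ignored and the
// placeholders survive unchanged.
val fragment = "where timestamp between '{0}' and '{1}'".format(start_date, end_date)
println(fragment)  // where timestamp between '{0}' and '{1}'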