pyspark-雪花无法从 table 加载数据
pyspark- snowflake unable to load data from table
我正在尝试使用以下代码在胶水中使用 pyspark 从雪花中查询数据
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from py4j.java_gateway import java_import
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"
## @params: [JOB_NAME, URL, ACCOUNT, WAREHOUSE, DB, SCHEMA, USERNAME, PASSWORD]
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'URL', 'ACCOUNT', 'WAREHOUSE', 'DB', 'SCHEMA', 'USERNAME', 'PASSWORD'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
java_import(spark._jvm, "net.snowflake.spark.snowflake")
## uj = sc._jvm.net.snowflake.spark.snowflake
spark._jvm.net.snowflake.spark.snowflake.SnowflakeConnectorUtils.enablePushdownSession(spark._jvm.org.apache.spark.sql.SparkSession.builder().getOrCreate())
table="crx--54gfg--hyg65hghg76768_t6ghh75y"
options = {
"sfURL" : args['URL'],
"sfAccount" : args['ACCOUNT'],
"sfUser" : args['USERNAME'],
"sfPassword" : args['PASSWORD'],
"sfDatabase" : args['DB'],
"sfSchema" : args['SCHEMA'],
"sfWarehouse" : args['WAREHOUSE'],
}
query=f"select * from {table}"
df = spark.read \
.format(SNOWFLAKE_SOURCE_NAME ) \
.options(**options) \
.option("query", query) \
.load()
display(df)
但我遇到了以下错误
net.snowflake.client.jdbc.snowflakesqlexception sql compilation error syntax error line 1 at position 111 unexpected '<EOF>'
我认为这主要是因为 db table 名称有一些特殊字符。
如何解决这个错误?
使用:
query=f'''select * from "{table}"'''
或:
table='''"crx--54gfg--hyg65hghg76768_t6ghh75y"'''
...
query=f"select * from {table}"
table 名称包含 -
,应该用 "
括起来。
我正在尝试使用以下代码在胶水中使用 pyspark 从雪花中查询数据
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from py4j.java_gateway import java_import
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"
## @params: [JOB_NAME, URL, ACCOUNT, WAREHOUSE, DB, SCHEMA, USERNAME, PASSWORD]
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'URL', 'ACCOUNT', 'WAREHOUSE', 'DB', 'SCHEMA', 'USERNAME', 'PASSWORD'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
java_import(spark._jvm, "net.snowflake.spark.snowflake")
## uj = sc._jvm.net.snowflake.spark.snowflake
spark._jvm.net.snowflake.spark.snowflake.SnowflakeConnectorUtils.enablePushdownSession(spark._jvm.org.apache.spark.sql.SparkSession.builder().getOrCreate())
table="crx--54gfg--hyg65hghg76768_t6ghh75y"
options = {
"sfURL" : args['URL'],
"sfAccount" : args['ACCOUNT'],
"sfUser" : args['USERNAME'],
"sfPassword" : args['PASSWORD'],
"sfDatabase" : args['DB'],
"sfSchema" : args['SCHEMA'],
"sfWarehouse" : args['WAREHOUSE'],
}
query=f"select * from {table}"
df = spark.read \
.format(SNOWFLAKE_SOURCE_NAME ) \
.options(**options) \
.option("query", query) \
.load()
display(df)
但我遇到了以下错误
net.snowflake.client.jdbc.snowflakesqlexception sql compilation error syntax error line 1 at position 111 unexpected '<EOF>'
我认为这主要是因为 db table 名称有一些特殊字符。 如何解决这个错误?
使用:
query=f'''select * from "{table}"'''
或:
table='''"crx--54gfg--hyg65hghg76768_t6ghh75y"'''
...
query=f"select * from {table}"
table 名称包含 -
,应该用 "
括起来。