Snowflake COPY INTO rejects a file without raising an error in the Python Snowflake Connector
I am using the Python Snowflake Connector to PUT a JSON file into a Snowflake stage and then COPY INTO to insert the JSON into a table.
Here is my code:
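import snowflake.connector
snowflake_conn = snowflake.connector.connect(
    user=sf_user,
    password=sf_password,
    account=sf_account
)
# Create the cursor used by every execute() call below
snowflake_conn_cur = snowflake_conn.cursor()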
role_init = "USE ROLE ELT_ROLE"
wh_init = "USE WAREHOUSE TEST_WH"
db_init = "USE DATABASE TEST_DB"
schema_init = "USE SCHEMA TEST_SCHEMA"
snowflake_conn_cur.execute(role_init)
snowflake_conn_cur.execute(wh_init)
snowflake_conn_cur.execute(db_init)
snowflake_conn_cur.execute(schema_init)
remove_file_command = 'REMOVE @TEST_STAGE/test_file.json;'
put_file_command = 'PUT file://test_file.json @TEST_STAGE;'
truncate_existing_table_data_command = 'TRUNCATE TABLE OUTPUT_TABLE;'
copy_file_command = 'COPY INTO OUTPUT_TABLE FROM @TEST_STAGE/test_file.json file_format=(TYPE=JSON) on_error=CONTINUE;'
snowflake_conn_cur.execute(remove_file_command)
snowflake_conn_cur.execute(put_file_command)
snowflake_conn_cur.execute(truncate_existing_table_data_command)
snowflake_conn_cur.execute(copy_file_command)
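My code runs successfully, but I noticed in Snowflake that the file was rejected (a separate issue).
With the Snowflake Python connector, is there a way to make the cursor's execute call return an error, so that I can use it to verify that the statement completed successfully?
Without that, it basically fails silently. The only other approach I can think of is to query the table afterwards to see whether it contains data, but that won't always help if the table wasn't truncated beforehand.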
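For PUT/GET, an error should be raised by default. In your example, the command
PUT file://test_file.json
is not correct on a Mac/Linux machine (it should be PUT file:///test_file.json) and will produce a stack trace by default, as in this example: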
cs = ctx.cursor()
cs.execute("PUT file://Users/<user>/Downloads/result_00XXX.csv @~")
cs.close()
ctx.close()
which gives me:
$ python basic_test.py
Traceback (most recent call last):
File "basic_test.py", line 37, in <module>
cs.execute("PUT file://Users/<user>/Downloads/result_00XXX.csv @~")
File "/Users/<user>/Documents/Connectors/python/snow/lib/python3.8/site-packages/snowflake/connector/cursor.py", line 763, in execute
sf_file_transfer_agent.execute()
File "/Users/<user>/Documents/Connectors/python/snow/lib/python3.8/site-packages/snowflake/connector/file_transfer_agent.py", line 366, in execute
self._init_file_metadata()
File "/Users/<user>/Documents/Connectors/python/snow/lib/python3.8/site-packages/snowflake/connector/file_transfer_agent.py", line 966, in _init_file_metadata
Error.errorhandler_wrapper(
File "/Users/<user>/Documents/Connectors/python/snow/lib/python3.8/site-packages/snowflake/connector/errors.py", line 272, in errorhandler_wrapper
handed_over = Error.hand_to_other_handler(
File "/Users/<user>/Documents/Connectors/python/snow/lib/python3.8/site-packages/snowflake/connector/errors.py", line 327, in hand_to_other_handler
cursor.errorhandler(connection, cursor, error_class, error_value)
File "/Users/<user>/Documents/Connectors/python/snow/lib/python3.8/site-packages/snowflake/connector/errors.py", line 206, in default_errorhandler
raise error_class(
snowflake.connector.errors.ProgrammingError: 253006: 253006: File doesn't exist: ['Users/<user>/Downloads/result_00XXXX.csv']
You can also use a try/except block to capture the error:
cs = ctx.cursor()
try:
    cs.execute("PUT file://Users/<user>/Downloads/result_00XXX.csv @~")
except Exception as err:
    print(err)
finally:
    cs.close()
ctx.close()
which gives me:
$ python basic_test.py
253006: 253006: File doesn't exist: ['Users/<user>/Downloads/result_00XXX.csv']
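The same try/except idea can be applied to a whole sequence of statements, so that the first failure stops the run and reports which statement caused it. This is only a sketch: run_statements and StatementFailed are hypothetical names, not part of the connector, and the only thing assumed about the cursor is that it has an execute() method that raises on failure.

```python
from typing import Any, Iterable


class StatementFailed(RuntimeError):
    """Hypothetical wrapper that records which statement failed and why."""

    def __init__(self, statement: str, cause: Exception) -> None:
        super().__init__(f"statement failed: {statement!r}: {cause}")
        self.statement = statement
        self.cause = cause


def run_statements(cursor: Any, statements: Iterable[str]) -> int:
    """Execute statements in order; stop and raise on the first failure.

    Returns the number of statements that completed.
    """
    completed = 0
    for statement in statements:
        try:
            cursor.execute(statement)
        except Exception as err:
            # Re-raise with the offending statement attached for easier debugging
            raise StatementFailed(statement, err) from err
        completed += 1
    return completed
```

With the question's code, this would be called as run_statements(snowflake_conn_cur, [remove_file_command, put_file_command, truncate_existing_table_data_command, copy_file_command]).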
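Remove the on_error=CONTINUE option from the COPY INTO statement so that an error is raised. The problem that caused the error when loading the file into the table was that the file was too large.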
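If on_error=CONTINUE has to stay, one alternative is to inspect the rows that COPY INTO returns through the cursor. The following is a minimal sketch, assuming the result set contains one row per file with a status column whose value is LOADED on success; find_rejected_files is a hypothetical helper, not part of the connector.

```python
from typing import Any, Dict, List, Sequence


def find_rejected_files(
    column_names: Sequence[str], rows: Sequence[Sequence[Any]]
) -> List[Dict[str, Any]]:
    """Return one dict per result row whose status is not 'LOADED'.

    Assumption: the COPY INTO result has a 'status' column. Rows without
    one are reported too, so nothing is silently treated as a success.
    """
    lowered = [name.lower() for name in column_names]
    records = [dict(zip(lowered, row)) for row in rows]
    return [
        rec for rec in records
        if str(rec.get("status") or "").upper() != "LOADED"
    ]


# Possible usage with the cursor from the question (not executed here):
#   snowflake_conn_cur.execute(copy_file_command)
#   names = [col[0] for col in snowflake_conn_cur.description]
#   problems = find_rejected_files(names, snowflake_conn_cur.fetchall())
#   if problems:
#       raise RuntimeError(f"COPY INTO did not fully load: {problems}")
```

Unlike querying the table afterwards, this check does not depend on whether the table was truncated first.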
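To solve the file-size problem: since my JSON was wrapped in an array, setting STRIP_OUTER_ARRAY=TRUE in the file_format of the COPY INTO command removes the outer array and loads each JSON node into its own row in the target table.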
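As a sketch of what that change looks like, the statement below reuses the table and stage path from the question and only adds STRIP_OUTER_ARRAY. The rows_after_strip function is a purely local, hypothetical illustration of the effect (one row per array element); it does not call Snowflake.

```python
import json
from typing import Any, List

# Same table and stage path as in the question; only STRIP_OUTER_ARRAY is new.
# on_error is left out so that a load failure raises an error.
COPY_WITH_STRIP = (
    "COPY INTO OUTPUT_TABLE "
    "FROM @TEST_STAGE/test_file.json "
    "FILE_FORMAT = (TYPE = JSON STRIP_OUTER_ARRAY = TRUE)"
)


def rows_after_strip(json_text: str) -> List[Any]:
    """Local illustration only: mimic how an outer array is split into rows."""
    parsed = json.loads(json_text)
    # With an outer array, each element becomes its own row;
    # any other document stays a single row.
    return parsed if isinstance(parsed, list) else [parsed]
```

For example, rows_after_strip('[{"id": 1}, {"id": 2}]') yields two separate objects, which is why each row ends up much smaller than the original single-array document.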