How do I handle errors and retry in PyAthena?
I have an Athena query that I run every day from my local Ubuntu machine. Most of the time it runs fine.
from datetime import datetime

import pandas as pd
from pyathena import connect

def get_athena_data(**kwargs):
    athena_conn = connect(aws_access_key_id = access_key, aws_secret_access_key = s_key, s3_staging_dir = path, region_name = region)
    print(f"{datetime.today().strftime('%Y-%m-%d %H:%M.%S')} Athena connection established; starting to query data using pd-sql integration")
    load_data = pd.read_sql(sql,athena_conn)
    return load_data
However, the other day I got this (it is miles long, so I used SNIP a few times):
Traceback (most recent call last):
  File "/home/ken/anaconda3/lib/python3.7/site-packages/pandas/io/sql.py", line 1586, in execute
    cur.execute(*args, **kwargs)
  File "/home/ken/anaconda3/lib/python3.7/site-packages/pyathena/util.py", line 306, in _wrapper
    return wrapped(*args, **kwargs)
  File "/home/ken/anaconda3/lib/python3.7/site-packages/pyathena/cursor.py", line 79, in execute
    raise OperationalError(query_execution.state_change_reason)
pyathena.error.OperationalError: GENERIC_INTERNAL_ERROR: Unable to create class ... SNIP ...
]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ken/anaconda3/lib/python3.7/site-packages/pandas/io/sql.py", line 1590, in execute
    self.con.rollback()
  File "/home/ken/anaconda3/lib/python3.7/site-packages/pyathena/connection.py", line 184, in rollback
    raise NotSupportedError
pyathena.error.NotSupportedError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ken/Documents/projects/site_alerts/code/Site_alerts_v18.py", line 174, in <module>
    sql_results = get_athena_data(SNIP)
  File "/home/ken/Documents/projects/site_alerts/code/site_alert_functions_v18.py", line 351, in get_athena_data
    load_data = pd.read_sql(sql,athena_conn)
  File "/home/ken/anaconda3/lib/python3.7/site-packages/pandas/io/sql.py", line 412, in read_sql
    chunksize=chunksize,
  File "/home/ken/anaconda3/lib/python3.7/site-packages/pandas/io/sql.py", line 1633, in read_query
    cursor = self.execute(*args)
  File "/home/ken/anaconda3/lib/python3.7/site-packages/pandas/io/sql.py", line 1595, in execute
    raise ex from inner_exc
pandas.io.sql.DatabaseError: Execution failed on sql: ... SNIP ...
GENERIC_INTERNAL_ERROR: Unable to create class ... SNIP ...
]
unable to rollback
OK, so I need to handle the error, and I want to retry if the query fails. So I tried:
def retry(func, max_tries=5):
    for i in range(max_tries):
        try:
            func()
            print('completed successfully')
            break
        except Exception:
            print('error')
            continue

retry(get_athena_data(ARGS))
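However, this doesn't work. Execution still stops when Athena fails (I fed it a faulty SQL query to simulate a failure).
How can I handle the exception and retry?
I found this in the PyAthena issues, but it makes no sense to me and has no usage instructions.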
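You are calling get_athena_data and passing its return value to retry, not the function itself. The call is evaluated before retry ever starts, so the exception is raised outside the try block.
Try it this way: retry(get_athena_data)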
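To make the difference concrete, here is a minimal, self-contained sketch. `flaky_query` is a hypothetical stand-in for get_athena_data (it fails twice, then succeeds) so the example runs without an Athena connection, and this `retry` returns the result instead of only printing. Binding the arguments with `functools.partial` (or a lambda) is one way to hand retry a callable it can invoke on each attempt.

```python
import functools


def retry(func, max_tries=5):
    """Call func() up to max_tries times and return its first successful result."""
    for attempt in range(1, max_tries + 1):
        try:
            return func()
        except Exception as exc:
            print(f"attempt {attempt} failed: {exc}")
    return None  # every attempt failed


calls = {"count": 0}


def flaky_query(sql):
    """Hypothetical stand-in for get_athena_data: fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated Athena failure")
    return f"rows for {sql!r}"


# Wrong: flaky_query("SELECT 1") would run *before* retry is entered,
# so its exception would escape without ever reaching the try block.
# retry(flaky_query("SELECT 1"))

# Right: pass a callable; retry invokes it inside the try block on each attempt.
result = retry(functools.partial(flaky_query, "SELECT 1"))
# Equivalent: retry(lambda: flaky_query("SELECT 1"))
```

With the callable form, the first two simulated failures are caught and reported, and the third attempt returns the rows.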
(Update)
To pass some arguments as well:
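def retry(func, *args, max_tries=5, **kwargs):
    # max_tries is keyword-only, so positional arguments are forwarded to func
    # instead of being mistaken for the retry count.
    for i in range(max_tries):
        try:
            result = func(*args, **kwargs)
            print('completed successfully')
            return result  # hand the query result back to the caller
        except Exception as exc:
            print(f'error on attempt {i + 1}: {exc}')
    return None  # every attempt failed

sql_results = retry(get_athena_data, arg1, arg2, kwarg1="foo", kwarg2="bar")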
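If the failures are transient, it may help to wait between attempts and to retry only on the exception types you expect. The sketch below is one possible variant, not part of PyAthena: the name `retry_with_backoff` and the parameters `base_delay`, `retry_on`, and `sleep` are assumptions made for illustration. Judging from the traceback in the question, the exception that surfaces from pd.read_sql is pandas.io.sql.DatabaseError, so that is probably what you would pass as `retry_on`. Keep in mind that a genuinely malformed SQL query will fail on every attempt, so retrying only helps with intermittent errors.

```python
import time


def retry_with_backoff(func, *args, max_tries=5, base_delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying with exponential backoff.

    Only exceptions listed in retry_on trigger a retry; anything else
    propagates immediately. The last failure is re-raised once max_tries
    attempts have been used up.
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(max_tries):
        try:
            return func(*args, **kwargs)
        except retry_on as exc:
            if attempt == max_tries - 1:
                raise  # out of attempts: let the caller see the real error
            delay = base_delay * 2 ** attempt  # 1x, 2x, 4x, ... base_delay
            print(f"attempt {attempt + 1} failed ({exc}); retrying in {delay}s")
            sleep(delay)
```

Usage might look like `retry_with_backoff(get_athena_data, max_tries=3, retry_on=(pd.io.sql.DatabaseError,))`. The `sleep` parameter exists only so the delay can be swapped out when testing.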