How to connect to and query a MySQL DB from a Python shell job in AWS Glue

I am using sqlalchemy to create a connection and query a MySQL database, but Glue does not seem to support "sqlalchemy" or even "pymysql". Is there a way to do this in a Glue Python shell job?

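I think you need to install sqlalchemy and pymysql. If you are using the Spark runtime, Glue makes it easy to install additional Python libraries, but the Python shell runtime seems to work a bit differently.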

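The only way I got it to work was to download (or build) the whl files. Fortunately, you can download both sqlalchemy and pymysql from PyPI. Note: sqlalchemy publishes many whl files for each release (one per Python version and platform), so if you need a specific version, pick the file that matches your job.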
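To help pick the right file, a wheel's filename already states which Python version and platform it targets. The sketch below splits a filename into its tags following the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper name parse_wheel_filename and the WheelTags type are hypothetical, introduced only for illustration.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    name: str
    version: str
    build: Optional[str]
    python: str
    abi: str
    platform: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its components (hypothetical helper)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    # Components are separated by hyphens; the build tag is optional,
    # so a valid name has either 5 or 6 parts.
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    build = parts[2] if len(parts) == 6 else None
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], build, python_tag, abi_tag, platform_tag)
```

For example, parse_wheel_filename("PyMySQL-1.0.2-py3-none-any.whl") reports the Python tag py3 and the platform any, meaning a pure-Python wheel that is not tied to one interpreter version. A tag such as cp310 means the wheel is built for CPython 3.10 only.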

Put the two whl files in an S3 bucket that your Glue job can access. Then add the two paths (comma separated) to the Python library path in the job.

Example:

s3://my-bucket/SQLAlchemy-1.4.36-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl,s3://my-bucket/PyMySQL-1.0.2-py3-none-any.whl
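If you set up jobs from a script rather than the console, the same comma-separated value can be assembled programmatically. This is a minimal sketch: the helper name join_library_paths is hypothetical, and my-bucket is the placeholder bucket from the example. As far as I know, the console's Python library path field corresponds to the --extra-py-files job parameter, but treat that as an assumption and verify it for your Glue version.

```python
from typing import Iterable


def join_library_paths(bucket: str, wheel_names: Iterable[str]) -> str:
    """Build the comma-separated list of S3 wheel paths (hypothetical helper)."""
    paths = [f"s3://{bucket}/{name}" for name in wheel_names]
    # Joined with commas and no spaces, matching the example path format.
    return ",".join(paths)


wheels = [
    "SQLAlchemy-1.4.36-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "PyMySQL-1.0.2-py3-none-any.whl",
]
library_path = join_library_paths("my-bucket", wheels)
print(library_path)
```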

Then you should be able to import them like this:

import sqlalchemy
import pymysql


print('sqlalchemy', sqlalchemy.__version__)
print('pymysql', pymysql.__version__)

May 7, 2022, 9:38:03 AM Pending execution
Processing ./glue-python-libs-ox4yhv_1/SQLAlchemy-1.4.36-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Collecting greenlet!=0.4.17; python_version >= "3" and (platform_machine == "aarch64" or (platform_machine == "ppc64le" or (platform_machine == "x86_64" or (platform_machine == "amd64" or (platform_machine == "AMD64" or (platform_machine == "win32" or platform_machine == "WIN32"))))))
Downloading greenlet-1.1.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (147 kB)
Collecting importlib-metadata; python_version < "3.8"
Downloading importlib_metadata-4.8.3-py3-none-any.whl (17 kB)
Collecting zipp>=0.5
Downloading zipp-3.6.0-py3-none-any.whl (5.3 kB)
Collecting typing-extensions>=3.6.4; python_version < "3.8"
Downloading typing_extensions-4.1.1-py3-none-any.whl (26 kB)
Installing collected packages: greenlet, zipp, typing-extensions, importlib-metadata, SQLAlchemy
Successfully installed SQLAlchemy-1.4.36 greenlet-1.1.2 importlib-metadata-4.8.3 typing-extensions-4.1.1 zipp-3.6.0
Processing ./glue-python-libs-ox4yhv_1/PyMySQL-1.0.2-py3-none-any.whl
Installing collected packages: PyMySQL
Successfully installed PyMySQL-1.0.2
sqlalchemy 1.4.36
pymysql 1.0.2
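
Once both packages import cleanly, connecting and querying works the same way as outside Glue. The following is a minimal sketch, not a tested Glue job: the function names build_mysql_url and run_query are hypothetical, and the host, user, password, and database in the usage comment are placeholders. It assumes the job has network access to the MySQL server.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, database, port=3306):
    """Return a SQLAlchemy URL that selects the PyMySQL driver."""
    # quote_plus escapes characters such as '@' or '/' in the credentials
    # so they cannot be mistaken for URL delimiters.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def run_query(url, sql):
    """Open a connection, run one statement, and return all rows."""
    # Imported lazily so this module still loads where the wheels are missing.
    from sqlalchemy import create_engine, text

    engine = create_engine(url)
    with engine.connect() as conn:
        return conn.execute(text(sql)).fetchall()


# Usage inside the job (placeholder values):
# url = build_mysql_url("glue_user", "secret", "db.example.internal", "sales")
# rows = run_query(url, "SELECT 1")
```

In a real job you would probably read the credentials from job parameters or a secrets store rather than hard-coding them.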