java.lang.ClassNotFoundException: com.mysql.jdbc.Driver on AWS EMR cluster

My code is:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

APP_NAME = "mysql_query"

if __name__ == "__main__":
    conf = SparkConf().setAppName(APP_NAME)
    conf = conf.setMaster("local[*]")
    sc = SparkContext(conf=conf)
    sqlContext = SQLContext(sc)

hostname = "hostname"
dbname = "database_name"
jdbcPort = 3306
username = "username"
password = "password"
jdbc_url = "jdbc:mysql://{}:{}/{}?user={}&password={}".format(hostname, jdbcPort, dbname, username, password)

query = "(SELECT * XXXXXXX_XXXX_XXX_XX) t1_alias"

df = sqlContext.read.format('jdbc').options(driver='com.mysql.jdbc.Driver', url=jdbc_url, dbtable=query).load()

This code currently lives in an S3 bucket. I have SSHed into the EMR master node, and every time I submit the code with spark-submit --master yarn --deploy-mode cluster mysql_spark.py, I get the error java.lang.ClassNotFoundException: com.mysql.jdbc.Driver.

I have already installed the required JDBC driver. What is wrong here? Any help is appreciated!

Try the following:

spark-submit --master yarn \
  --deploy-mode cluster \
  --jars mysql-connector-java-8.0.19.jar \
  --driver-class-path mysql-connector-java-8.0.19.jar \
  --conf spark.executor.extraClassPath=mysql-connector-java-8.0.19.jar \
  mysql_spark.py
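
If the submission is scripted, the same command can be assembled in Python. This is only a minimal sketch: the helper name build_spark_submit is hypothetical, and it assumes the connector jar sits in the directory where spark-submit is run. It builds the argument list and does not launch anything.

```python
import shlex


def build_spark_submit(script, connector_jar, master="yarn", deploy_mode="cluster"):
    """Return a spark-submit argument list that ships a JDBC driver jar.

    Hypothetical helper for illustration; it only builds the command.
    """
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        # --jars takes a comma-separated list of jars to include on the
        # driver and executor classpaths.
        "--jars", connector_jar,
        # Extra classpath entry for the driver JVM.
        "--driver-class-path", connector_jar,
        # Extra classpath entry for each executor JVM.
        "--conf", "spark.executor.extraClassPath=" + connector_jar,
        script,
    ]


if __name__ == "__main__":
    cmd = build_spark_submit("mysql_spark.py", "mysql-connector-java-8.0.19.jar")
    # Print the command in a form that can be pasted into a shell.
    print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on the master node; keeping the jar name in one variable avoids the three flags drifting apart.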
