OSS supported by Google Cloud Dataproc
When I go to https://cloud.google.com/dataproc, I see this:
"Dataproc is a fully managed and highly scalable service for running Apache Spark, Apache Flink, Presto, and 30+ open source tools and frameworks."
But gcloud dataproc jobs submit doesn't list all of them. It lists only 8 (hadoop, hive, pig, presto, pyspark, spark, spark-r, spark-sql). Any idea why?
~ gcloud dataproc jobs submit
ERROR: (gcloud.dataproc.jobs.submit) Command name argument expected.

Available commands for gcloud dataproc jobs submit:

    hadoop      Submit a Hadoop job to a cluster.
    hive        Submit a Hive job to a cluster.
    pig         Submit a Pig job to a cluster.
    presto      Submit a Presto job to a cluster.
    pyspark     Submit a PySpark job to a cluster.
    spark       Submit a Spark job to a cluster.
    spark-r     Submit a SparkR job to a cluster.
    spark-sql   Submit a Spark SQL job to a cluster.

For detailed information on this command and its flags, run:
    gcloud dataproc jobs submit --help
Some OSS components are provided as Dataproc Optional Components. Not all of them have a job submit API: some (e.g., Anaconda, Jupyter) don't need one, and some (e.g., Flink, Druid) might be added in the future.
Some other OSS components are provided as libraries, e.g., the GCS connector, the BigQuery connector, and Apache Parquet.
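
To illustrate the Optional Components point, here is a minimal sketch of enabling such components when the cluster is created, using the --optional-components flag of gcloud dataproc clusters create. The cluster name, region, and component list are placeholders I made up, not values from the question, and which components are available depends on the cluster image version. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: build a cluster-creation command that enables optional components.
# All three values below are hypothetical placeholders; override via env vars.
CLUSTER="${CLUSTER:-example-cluster}"
REGION="${REGION:-us-central1}"
COMPONENTS="${COMPONENTS:-FLINK,JUPYTER}"   # comma-separated component names

# --enable-component-gateway exposes component web UIs (e.g., Jupyter).
CMD="gcloud dataproc clusters create $CLUSTER --region=$REGION --optional-components=$COMPONENTS --enable-component-gateway"

# Dry run by default: show the command instead of executing it.
echo "$CMD"

# Set RUN=1 to actually create the cluster (requires an authenticated gcloud).
# Unquoted expansion is intentional: the values above contain no spaces.
if [ "${RUN:-0}" = "1" ]; then
  $CMD
fi
```

Components enabled this way are installed on the cluster, but they do not gain a gcloud dataproc jobs submit subcommand, which is why the list above stays at 8.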