Can Apache Oozie run Docker containers?
I am currently comparing DAG-based workflow tools such as Airflow and Luigi for scheduling generic Docker containers and Spark jobs.
Can Apache Oozie run a generic Docker container through its shell action, or is Oozie strictly limited to Hadoop tools such as Pig and Hive?
Oozie is integrated with the rest of the Hadoop stack supporting
several types of Hadoop jobs out of the box (such as Java map-reduce,
Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system
specific jobs (such as Java programs and shell scripts).
I have tried running a Docker container through a Shell action and it works. Since a Shell action can execute on any node of the cluster, Docker must be installed on every node.
workflow.xml, created from Hue:
<workflow-app name="Test docker" xmlns="uri:oozie:workflow:0.5">
    <start to="shell-5c29"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="shell-5c29">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>test_docker.sh</exec>
            <file>/test_docker.sh#test_docker.sh</file>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
test_docker.sh
#!/bin/bash
# Run the container on this NodeManager host and capture its stdout
docker run hello-world > output.txt
# Copy the result into HDFS so it is still accessible after the YARN container exits
hdfs dfs -put -f output.txt /output.txt
echo 'done'
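For reference, deploying and submitting a workflow like this looks roughly as follows. This is a sketch only: the HDFS paths, the nameNode/jobTracker addresses, and the Oozie server URL are placeholders you would replace with your cluster's values.
# Upload the script to the HDFS location referenced by the <file> element
hdfs dfs -put -f test_docker.sh /test_docker.sh

# Upload the workflow definition to an application directory
hdfs dfs -mkdir -p /user/$(whoami)/test_docker
hdfs dfs -put -f workflow.xml /user/$(whoami)/test_docker/

# Minimal job.properties (hostnames and ports are placeholders)
cat > job.properties <<'EOF'
nameNode=hdfs://namenode-host:8020
jobTracker=resourcemanager-host:8032
oozie.wf.application.path=${nameNode}/user/${user.name}/test_docker
oozie.use.system.libpath=true
EOF

# Submit and start the workflow with the Oozie CLI
oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run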
Contents of the generated output.txt:
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub (amd64)
3. The Docker daemon created a new container from that image which runs the executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/