Creating a directory in HDFS using Oozie
In an Oozie workflow, how can we create a directory in HDFS and copy files from Linux to HDFS?
I want to perform the following in the workflow:
hdfs dfs -mkdir -p /user/$USER/logging/`date "+%Y-%m-%d"`/logs
hdfs dfs -put /home/$USER/logs/"${table}" /user/$USER/logging/`date "+%Y-%m-%d"`/logs/
How can I achieve this?
I tried the following, but it did not work:
<action name="Copy_to_HDFS">
<fs>
<mkdir path='/user/$USER/logging/`date "+%Y-%m-%d"`/logs'/>
<move source='/home/$USER/logs${table}' target='/user/$USER/logging/`date "+%Y-%m-%d"`/logs/'/>
</fs>
<ok to="end"/>
<error to="end"/>
</action>
How can we create a folder named after a specific date?
Full workflow:
<workflow-app name="Shell_hive" xmlns="uri:oozie:workflow:0.5">
<start to="shell-b8e7"/>
<kill name="Kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="shell-b8e7">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>shell.sh</exec>
<argument>${table}</argument>
<env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
<env-var>HADOOP_CONF_DIR=/etc/hadoop/conf</env-var>
<file>/user/$USER/oozie/scripts/lib/shell.sh#shell.sh</file>
</shell>
<ok to="End"/>
<error to="Kill"/>
</action>
<action name="Copy_to_HDFS">
<fs>
<mkdir path="/user/$USER/logging/2017-04-24/logs"/>
<move source="/tmp/logging/${table}" target="/user/$USER/logging/$(date +%Y-%m-%d)/logs/"/>
</fs>
<ok to="end"/>
<error to="end"/>
</action>
<end name="End"/>
</workflow-app>
The problem is with the quoting: single quotes (') prevent variable expansion.
<action name="Copy_to_HDFS">
<fs>
<mkdir path="/user/$USER/logging/$(date +%Y-%m-%d)/logs"/>
<move source="/home/$USER/logs${table}" target="/user/$USER/logging/$(date +%Y-%m-%d)/logs/"/>
</fs>
<ok to="End"/>
<error to="End"/>
</action>
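The quoting difference can be reproduced in a plain shell session, independent of Oozie. A minimal illustration (the table name "weblogs" is just a placeholder):

#!/bin/bash
# Placeholder value, only for illustration.
table="weblogs"

# Single quotes: nothing is expanded, the path is taken literally.
echo '/user/$USER/logging/`date "+%Y-%m-%d"`/logs/${table}'
# -> /user/$USER/logging/`date "+%Y-%m-%d"`/logs/${table}

# Double quotes: $USER, ${table} and the command substitution are all expanded.
echo "/user/$USER/logging/$(date +%Y-%m-%d)/logs/${table}"
# -> /user/alice/logging/2017-04-24/logs/weblogs   (for example)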
Update:
The Copy_to_HDFS action is never invoked. On success, the shell-b8e7 action transitions straight to End. Modify the workflow so that the Copy_to_HDFS action is called. For example,
</shell>
<ok to="Copy_to_HDFS"/>
The prepare tag may also help:
<prepare>
<delete path="[PATH]"/>
...
<mkdir path="[PATH]"/>
...
</prepare>
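For completeness: since the workflow already ships shell.sh into the shell action and passes ${table} as its first argument, the two hdfs commands from the question could also be run from that script, where $USER and $(...)/backtick expansion work natively. A minimal sketch, assuming the table name arrives as the first positional argument (this is not the poster's actual shell.sh):

#!/bin/bash
# Sketch of a shell.sh that creates the dated HDFS directory and uploads the
# table's log file. Assumes the table name is passed as the first argument,
# matching <argument>${table}</argument> in the workflow above.
set -euo pipefail

table="$1"
target_dir="/user/${USER}/logging/$(date +%Y-%m-%d)/logs"

hdfs dfs -mkdir -p "${target_dir}"
hdfs dfs -put "/home/${USER}/logs/${table}" "${target_dir}/"

Keep in mind that a shell action runs on an arbitrary cluster node, so the local path /home/$USER/logs must exist on that node for the put to succeed.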