HDFS: run hdfs commands on remote Linux machines

We want to do the following simple steps:

log in to the $hadoop_machine machine using ssh

and run the hdfs CLI command hdfs fsck / as the user hdfs.

So we run the following:

ssh $hadoop_machine su hdfs -c 'hdfs fsck /' 

But we get:

Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND
       where COMMAND is one of:
  dfs                  run a filesystem command on the file systems supported in Hadoop.
  classpath            prints the classpath
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  journalnode          run the DFS journalnode
  zkfc                 run the ZK Failover Controller daemon
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  envvars              display computed Hadoop environment variables
  haadmin              run a DFS HA admin client
  fsck                 run a DFS filesystem checking utility
  balancer             run a cluster balancing utility
  jmxget               get JMX exported values from NameNode or DataNode.
  mover                run a utility to move block replicas across
                       storage types
  oiv                  apply the offline fsimage viewer to an fsimage
  oiv_legacy           apply the offline fsimage viewer to an legacy fsimage
  oev                  apply the offline edits viewer to an edits file
  fetchdt              fetch a delegation token from the NameNode
  getconf              get config values from configuration
  groups               get the groups which users belong to
  snapshotDiff         diff two snapshots of a directory or diff the
                       current directory contents with a snapshot
  lsSnapshottableDir   list all snapshottable dirs owned by the current user
                                                Use -help to see options
  portmap              run a portmap service
  nfs3                 run an NFS version 3 gateway
  cacheadmin           configure the HDFS cache
  crypto               configure HDFS encryption zones
  storagepolicies      list/get/set block storage policies
  version              print the version

Most commands print help when invoked w/o parameters.

Why can't we run the hdfs command on the remote machine as the user hdfs?

ssh $hadoop_machine su hdfs -c 'hdfs fsck /' 

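When you run this, the single quotes are processed by your local shell instance, so they never reach ssh. ssh joins its remaining arguments with spaces, and the command it asks the remote system to run is: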

su hdfs -c hdfs fsck /

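su interprets the argument following "-c" as the command to run. That argument is just "hdfs"; the words "fsck" and "/" are no longer part of it. So su invokes hdfs without any arguments, which is why hdfs prints its usage message.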
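The quote stripping can be reproduced locally without ssh or Hadoop. This is only a rough sketch: join_like_ssh is a hypothetical helper that mimics how ssh joins its command arguments with spaces, and the unquoted expansion stands in for the word splitting done by the remote shell.

```shell
#!/bin/sh
# Hypothetical helper (not part of ssh): joins its arguments with single
# spaces, which is how ssh builds the command string it sends.
join_like_ssh() {
    printf '%s' "$*"
}

# The local shell removes the single quotes, so the helper (like ssh)
# receives four arguments: su, hdfs, -c, and "hdfs fsck /".
remote_command=$(join_like_ssh su hdfs -c 'hdfs fsck /')
echo "$remote_command"        # prints: su hdfs -c hdfs fsck /

# Stand-in for the remote shell: split the string on whitespace again.
set -- $remote_command
argc=$#
after_c=$4
echo "argument count: $argc"  # prints: argument count: 6
echo "value for -c: $after_c" # prints: value for -c: hdfs
```

The value handed to -c is the single word hdfs, which matches the usage message in the question.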

You need to run ssh in a way that passes the quotes through to the remote system. Either of these should work:

ssh $hadoop_machine su hdfs -c '"hdfs fsck /"' 
or
ssh $hadoop_machine 'su hdfs -c "hdfs fsck /"'

Either of these causes ssh to request that the remote system run:

su hdfs -c "hdfs fsck /"
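Both quoting forms can be checked locally in the same spirit. Again a sketch under assumptions: join_like_ssh is a hypothetical helper standing in for ssh's space-joining of arguments, and eval stands in for the remote shell parsing the received string.

```shell
#!/bin/sh
# Hypothetical helper (not part of ssh): joins its arguments with single
# spaces, which is how ssh builds the command string it sends.
join_like_ssh() {
    printf '%s' "$*"
}

# Form A: inner double quotes protected by outer single quotes.
form_a=$(join_like_ssh su hdfs -c '"hdfs fsck /"')
# Form B: the whole remote command as one single-quoted argument.
form_b=$(join_like_ssh 'su hdfs -c "hdfs fsck /"')

echo "$form_a"                # prints: su hdfs -c "hdfs fsck /"
echo "$form_b"                # prints: su hdfs -c "hdfs fsck /"

# Stand-in for the remote shell: parse the string, honoring the quotes.
eval "set -- $form_a"
fixed_argc=$#
fixed_after_c=$4
echo "argument count: $fixed_argc"  # prints: argument count: 4
echo "value for -c: $fixed_after_c" # prints: value for -c: hdfs fsck /
```

With the double quotes preserved, -c receives the whole string hdfs fsck /, so su runs the intended command.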