Does 'hdfs dfs -cp' use /tmp as part of its implementation?

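I'm trying to investigate a problem where /tmp fills up and we don't know what is causing it. We do have a recent change that uses an HDFS command to copy to another host (hdfs dfs -cp /source/file hdfs://other.host:port/target/file). Although the copy operation does not directly touch or reference /tmp, it might use it as part of its implementation.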

But I can't find anything in the documentation to confirm or refute that theory. Does anyone else know the answer?

You can look at the code:

This is the code that performs the copy in HDFS. It uses its own internal CommandWithDestination class, which in turn uses another internal class that is really just java.io classes to complete the actual write. So it buffers bytes in memory and sends them along, which makes it an unlikely cause of the issue. You could check this by changing the temporary directory used by Java (java.io.tmpdir).

Set it for everything:

export _JAVA_OPTIONS=-Djava.io.tmpdir=/new/tmp/dir

According to the java.io.File Javadoc:

The default temporary-file directory is specified by the system property java.io.tmpdir. On UNIX systems the default value of this property is typically "/tmp" or "/var/tmp"; on Microsoft Windows systems it is typically "c:\temp". A different value may be given to this system property when the Java virtual machine is invoked, but programmatic changes to this property are not guaranteed to have any effect upon the temporary directory used by this method.
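To confirm that the override actually took effect, a small probe like the one below can be run with the same environment as the copy job. This is only a sketch: the class name TmpDirCheck and its methods are made up for illustration, and it only shows where the JVM's default temp files go, not what Hadoop itself does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hadoop: reports the JVM's temp directory.
public class TmpDirCheck {

    /** Returns the directory named by the java.io.tmpdir system property. */
    public static Path configuredTmpDir() {
        return Paths.get(System.getProperty("java.io.tmpdir"));
    }

    /** Creates a throwaway temp file and returns the directory it landed in. */
    public static Path actualTmpDir() {
        try {
            Path probe = Files.createTempFile("tmpdir-probe", ".tmp");
            try {
                return probe.getParent();
            } finally {
                // Remove the probe so the check itself leaves nothing behind.
                Files.deleteIfExists(probe);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.io.tmpdir     = " + configuredTmpDir());
        System.out.println("temp files land in = " + actualTmpDir());
    }
}
```

If both lines show the new directory after exporting _JAVA_OPTIONS and /tmp still fills up during the copy, the JVM's temp-file mechanism is probably not the culprit.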

The method used by the HDFS copy:

protected void copyStreamToTarget(InputStream in, PathData target)
  throws IOException {
    if (target.exists && (target.stat.isDirectory() || !overwrite)) {
      throw new PathExistsException(target.toString());
    }
    TargetFileSystem targetFs = new TargetFileSystem(target.fs);
    try {
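      // Debug tracing; not part of the copy logic itself.
      System.out.flush();
      System.out.println("Hello Copy Stream");
      // The temporary "._COPYING_" file is created next to the target on the
      // destination file system, not in the local /tmp directory.
      PathData tempTarget = direct ? target : target.suffix("._COPYING_");
      targetFs.setWriteChecksum(writeChecksum);
      // This is where java.io streams are used to write the file to HDFS.
      targetFs.writeStreamToFile(in, tempTarget, lazyPersist, direct);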
      if (!direct) {
        targetFs.rename(tempTarget, target);
      }
    } finally {
      targetFs.close(); // last ditch effort to ensure temp file is removed
    }
  }
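The pattern in that method (stream into a "._COPYING_" file beside the target, then rename) can be illustrated without a cluster. The sketch below is an assumption-laden stand-in: it uses java.nio.file on the local file system instead of Hadoop's PathData and TargetFileSystem, the class name CopyThenRenameSketch is hypothetical, and it always overwrites the target, whereas the method above refuses when overwrite is off. The demo uses a scratch directory under java.io.tmpdir only for convenience.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical local-file-system analogue of the HDFS copy method above.
public class CopyThenRenameSketch {

    /** Streams `in` to "<target>._COPYING_" beside the target, then renames it. */
    public static void copyStreamToTarget(InputStream in, Path target) {
        // The temporary file is a sibling of the target, not a file in /tmp.
        Path tempTarget = target.resolveSibling(target.getFileName() + "._COPYING_");
        try {
            try (OutputStream out = Files.newOutputStream(tempTarget)) {
                byte[] buffer = new byte[4096]; // only this buffer is held in memory
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            }
            // Unlike the Hadoop method, this sketch always replaces the target.
            Files.move(tempTarget, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try {
                // Last-ditch cleanup if the copy failed before the rename.
                Files.deleteIfExists(tempTarget);
            } catch (IOException ignored) {
                // Nothing more can be done here.
            }
        }
    }

    /** Copies "hello" into a scratch directory and returns what was written. */
    public static String demo() {
        try {
            Path dir = Files.createTempDirectory("copy-sketch");
            Path target = dir.resolve("file.txt");
            InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
            copyStreamToTarget(in, target);
            String content = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            Files.delete(target);
            Files.delete(dir); // succeeds only if no "._COPYING_" file was left behind
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the sketch is that the only intermediate storage is a fixed-size byte buffer plus a temporary file in the destination directory, which matches the reasoning that the copy is unlikely to be what fills the local /tmp.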