Why am I unable to get the file name and display it in the format (Word Filename Count) in Hadoop?

Question

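The input is a text file named Wiki-micro.txt. The word count program runs fine. I need to modify it so that the output format changes from (Word count) to (Word####Filename count). I used InputSplit to get the file name, but the output does not come out in that format. Can you tell me where I went wrong? Please help.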

  public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    // Backslashes must be doubled inside a Java string literal.
    private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

    public void map(LongWritable offset, Text lineText, Context context)
        throws IOException, InterruptedException {

      String line = lineText.toString();
      Text currentWord = new Text();
      // Name of the file that the current input split belongs to.
      InputSplit input_split = context.getInputSplit();
      String FName = ((FileSplit) input_split).getPath().getName();

      for (String word : WORD_BOUNDARY.split(line)) {
        if (word.isEmpty()) {
          continue;
        }
        currentWord = new Text(word);
        context.write(currentWord, one);
        context.write(new Text(FName), one);
      }
    }

  }

Answer 1

Not sure, but what happens if you replace the last 3 lines:

        currentWord  = new Text(word);
        context.write(currentWord, one);
        context.write(new Text(FName), one);

with

        currentWord  = new Text(word + "####" + FName);
        context.write(currentWord, one);
        context.write(new Text(FName), one);
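
To see what the composite key produces without running a Hadoop job, here is a minimal plain-Java sketch. It is not the actual mapper: the class `CompositeKeySketch` and the methods `compositeKey` and `count` are hypothetical names used only for illustration, and a `TreeMap` stands in for the shuffle and reduce steps. It reuses the same `WORD_BOUNDARY` pattern and the `word + "####" + FName` idea from the replacement above.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical stand-in for the map + reduce steps; no Hadoop classes involved.
final class CompositeKeySketch {
    // Same tokenizer as the mapper in the question.
    private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");
    // Separator placed between the word and the file name.
    private static final String SEPARATOR = "####";

    // Builds the key the mapper would emit: word + "####" + fileName.
    static String compositeKey(String word, String fileName) {
        return word + SEPARATOR + fileName;
    }

    // Emits one count per non-empty token and sums the counts per composite key.
    static Map<String, Integer> count(String fileName, String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : WORD_BOUNDARY.split(line)) {
                if (word.isEmpty()) {
                    continue; // the pattern also yields empty tokens; skip them
                }
                counts.merge(compositeKey(word, fileName), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints each composite key and its count, separated by a tab.
        count("Wiki-micro.txt", "hello world", "hello hadoop")
            .forEach((key, n) -> System.out.println(key + "\t" + n));
    }
}
```

Every key here already carries the file name, so nothing else needs to be emitted per token. In the mapper, the remaining `context.write(new Text(FName), one);` call would add a separate record keyed by the bare file name for every word, so the output would also contain a line counting the file name itself. If only (Word####Filename count) lines are wanted, that call can probably be dropped.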

java

hadoop

mapreduce