Why am I unable to get the file name and display the output in the format (Word####Filename count) in Hadoop?
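The input is a text file named Wiki-micro.txt. The word count program runs fine, but I need to modify it so that the output format changes from (Word count) to (Word####Filename count).
Can you tell me where I am going wrong? I used the InputSplit to get the file name, but it does not produce the output I want. Please help.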
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    // Backslashes must be doubled inside a Java string literal.
    private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");

    public void map(LongWritable offset, Text lineText, Context context)
            throws IOException, InterruptedException {
        String line = lineText.toString();
        Text currentWord = new Text();
        // Name of the file that the current split belongs to.
        InputSplit input_split = context.getInputSplit();
        String FName = ((FileSplit) input_split).getPath().getName();
        for (String word : WORD_BOUNDARY.split(line)) {
            if (word.isEmpty()) {
                continue;
            }
            // The word and the file name are written as two separate keys.
            currentWord = new Text(word);
            context.write(currentWord, one);
            context.write(new Text(FName), one);
        }
    }
}
Not sure, but what happens if you replace the last 3 lines:
    currentWord = new Text(word);
    context.write(currentWord, one);
    context.write(new Text(FName), one);
with
    currentWord = new Text(word + "####" + FName);
    context.write(currentWord, one);
    context.write(new Text(FName), one);
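To see what the composite key does to the final counts, here is a minimal plain-Java sketch that can be run without a Hadoop cluster. It is not the mapper itself: the class name WordFileKeySketch and the methods compositeKey and count are hypothetical names made up for this illustration, and the in-memory map only stands in for the shuffle and reduce steps. It reuses the word-boundary pattern from the question and joins each word to the file name with "####", the same way the suggested new Text(word + "####" + FName) would.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class WordFileKeySketch {
    // Same pattern as in the question, with backslashes escaped for Java.
    private static final Pattern WORD_BOUNDARY = Pattern.compile("\\s*\\b\\s*");
    private static final String DELIMITER = "####";

    // Builds the key that the mapper would wrap in a Text object.
    public static String compositeKey(String word, String fileName) {
        return word + DELIMITER + fileName;
    }

    // Stand-in for map + reduce: counts each (word, file) pair in memory.
    public static Map<String, Integer> count(String fileName, List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            for (String word : WORD_BOUNDARY.split(line)) {
                if (word.isEmpty()) {
                    continue; // split() yields empty tokens at zero-width boundaries
                }
                counts.merge(compositeKey(word, fileName), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                count("Wiki-micro.txt", List.of("hadoop is fast", "hadoop scales"));
        // One line per composite key, tab-separated like a text output file.
        counts.forEach((key, value) -> System.out.println(key + "\t" + value));
    }
}
```

Note that the sketch emits only the composite key. If the mapper also keeps context.write(new Text(FName), one), the job output will most likely contain an extra record keyed by the bare file name, so that line can probably be dropped when only (Word####Filename count) records are wanted.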