Failed to ".add(StringTokenizer.nextToken())" to an ArrayList<String> inside Hadoop MapReduce code
I'm trying to add StringTokenizer.nextToken() to an ArrayList in my Hadoop MapReduce code. The code worked fine and produced an output file on one run, but as soon as I added the StringTokenizer line it suddenly broke.
Here is my code:
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
        List<String> texts = new ArrayList<String>();
        StringTokenizer itr = new StringTokenizer(value.toString(), "P");
        while (itr.hasMoreTokens()) {
            System.out.println(itr.nextToken());
            texts.add(itr.nextToken()); //The code broke here
        }
    }
Note that I haven't yet added Hadoop's Text class to write output in this code, but it worked with my previous code.
Here is my Reducer:
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context
                           ) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }
Here is main:
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(JobCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
Note: I've also tried using a plain array and it still broke.
The project runs on the Java 8 JDK and imports Maven's HadoopCommon version 3.3.0 and HadoopCore version 1.2.0 (macOS).
Here is my error log:
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/Users/domesama/.m2/repository/org/apache/hadoop/hadoop-core/1.2.1/hadoop-core-1.2.1.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20/09/15 14:18:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/09/15 14:18:07 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
20/09/15 14:18:07 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
20/09/15 14:18:07 INFO input.FileInputFormat: Total input paths to process : 1
20/09/15 14:18:07 WARN snappy.LoadSnappy: Snappy native library not loaded
20/09/15 14:18:07 INFO mapred.JobClient: Running job: job_local1465674096_0001
20/09/15 14:18:07 INFO mapred.LocalJobRunner: Waiting for map tasks
20/09/15 14:18:07 INFO mapred.LocalJobRunner: Starting task: attempt_local1465674096_0001_m_000000_0
20/09/15 14:18:07 INFO mapred.Task: Using ResourceCalculatorPlugin : null
20/09/15 14:18:07 INFO mapred.MapTask: Processing split: file:/Users/domesama/Desktop/Github Respositories/HadoopMapReduce/input/SampleFile.txt:0+1891
20/09/15 14:18:07 INFO mapred.MapTask: io.sort.mb = 100
20/09/15 14:18:07 INFO mapred.MapTask: data buffer = 79691776/99614720
20/09/15 14:18:07 INFO mapred.MapTask: record buffer = 262144/327680
20/09/15 14:18:07 INFO mapred.MapTask: Starting flush of map output
20/09/15 14:18:07 INFO mapred.LocalJobRunner: Map task executor complete.
20/09/15 14:18:07 WARN mapred.LocalJobRunner: job_local1465674096_0001
java.lang.Exception: java.util.NoSuchElementException
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.util.NoSuchElementException
at java.base/java.util.StringTokenizer.nextToken(StringTokenizer.java:349)
at JobCount$TokenizerMapper.map(JobCount.java:50)
at JobCount$TokenizerMapper.map(JobCount.java:20)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:830)
,84,01,02600,01,1007549,00065,19,1,,,2,2,2,2,2,,,2,,2,,,,1,2,2,2,2,2,2,0000000,,,,2,5,,,,,,1,4,,,,,,,,,,2,5,2,2,3,000000,00000,17,000000,2,15,19,0000000,2,00000,00000,0000000,,3,,2,,4,999,999,,2,,,6,,,1,01,,,,,,,6,,1,,0,,,000000000,000000000,028,,,,1,2,1,1,01,001,0,0,0,0,1,0,0,1,0,,,,,,,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,0,,0,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,00005,00127,00065,00066,00069,00005,00120,00066,00063,00005,00067,00006,00005,00137,00124,00065,00066,00064,00063,00006,00131,00006,00062,00063,00060,00126,00006,00066,00068,00120,00066,00126,00115,00005,00005,00063,00066,00066,00062,00005,00118,00006,00064,00066,00062,00124,00006,00063,00068,00132,00062,00119,00126,00006,00005,00068,00072,00065,00066,00125,00005,00123,00062,00064,00065,00006,00123,00065,00067,00006,00068,00006,00005,00127,00119,00063,00068,00067,00064,00122
20/09/15 14:18:08 INFO mapred.JobClient: map 0% reduce 0%
20/09/15 14:18:08 INFO mapred.JobClient: Job complete: job_local1465674096_0001
20/09/15 14:18:08 INFO mapred.JobClient: Counters: 0
The System.out.println(itr.nextToken()); call did print as well, but the failure seems to happen when this line executes:
    texts.add(itr.nextToken()); //The code broke here
Maybe I need something like async/await (as in JS) in my code?
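If you use StringTokenizer, you always need to call hasMoreTokens() to check that a token remains before each call to nextToken(). In your code you call nextToken() twice per check, so the second call throws NoSuchElementException as soon as the tokens run out.
The fix is simply to call nextToken() once per loop iteration.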
    while (itr.hasMoreTokens()) {
        String token = itr.nextToken(); // one call for each hasMoreTokens
        System.out.println(token);
        texts.add(token);
    }
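To see the failure outside Hadoop, here is a minimal standalone sketch. The class name TokenizerDemo, both method names, and the sample input "aPbPc" are made up for illustration and are not part of the original job. It suggests that nothing asynchronous is involved: the statements in the loop run one after another, and the second nextToken() simply finds no token left when the token count is odd.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

// Hypothetical demo class (an assumption for illustration, not from the question).
class TokenizerDemo {

    // Mirrors the loop from the question: two nextToken() calls per hasMoreTokens() check.
    static List<String> collectEveryOtherToken(String line, String delimiters) {
        List<String> texts = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line, delimiters);
        while (itr.hasMoreTokens()) {
            itr.nextToken();            // first call consumes a token (the question printed it)
            texts.add(itr.nextToken()); // second call is unguarded and throws if nothing is left
        }
        return texts;
    }

    // Mirrors the fix: exactly one nextToken() call per hasMoreTokens() check.
    static List<String> collectAllTokens(String line, String delimiters) {
        List<String> texts = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line, delimiters);
        while (itr.hasMoreTokens()) {
            String token = itr.nextToken();
            texts.add(token);
        }
        return texts;
    }

    public static void main(String[] args) {
        String line = "aPbPc"; // splits on 'P' into three tokens: a, b, c

        System.out.println("fixed loop: " + collectAllTokens(line, "P"));

        try {
            collectEveryOtherToken(line, "P");
        } catch (NoSuchElementException e) {
            // Reached because the third token is consumed by the first call
            // and the second call has nothing left to return.
            System.out.println("double nextToken() failed: " + e);
        }
    }
}
```

Note that even when the token count is even and no exception is thrown, the double-call loop stores only every second token in the list, so the single-call version is the one to keep either way.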