Unable to find the cause of String Index Out of Bound Exception in my MapReduce code

Question

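I am working on a MapReduce job written in Java. I need to count, for each department, the employees who were promoted but still left the organization. I am trying to pass the concatenation of department and promotion as the key, and the "left" (resigned) flag as the value.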

Sample data

left  promotion_last_5years  dept
1     0                      sales
1     1                      sales
1     1                      hr
1     0                      sales

Mapper code:

public void map(LongWritable key, Text text, Context context) throws IOException, InterruptedException
{
    String row = text.toString();
    String[] values = row.trim().split(",");
    int left = 0;
    int promotion = 0;
    String dept = "";
    String DeptPromoted = "";
    try
    {
        if (values.length == 10 && !header.equals(row))
        {
            left = Integer.parseInt(values[6]);
            promotion = Integer.parseInt(values[7]);
            dept = values[8];
            DeptPromoted = dept + "-" + values[7];  // sales-0
        }
    }
    catch (Exception e)
    {
        e.printStackTrace();
    }
    context.write(new Text(DeptPromoted), new IntWritable(left)); // sales-0 1
}

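Below is my reducer code, where I pull the promotion value off the end of the key string and then use it to count the employees who were promoted but resigned.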

Reducer code:

public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException
{
    //sales-0   1
    int count = 0;
    String str = "";
    str = key.toString();   //sales-0
    int len = str.length(); //7
    char L = str.charAt(len - 1);
    if (L == '1')
    {
        for (IntWritable val: values)
        {
            if(val.get() == 1)
            {
                count++;
            }
        }
    }
    context.write(key, new IntWritable(count));
}

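I believe the StringIndexOutOfBoundsException comes from the reducer, where I try to read the character at the end of the string. Can someone help me resolve the following error?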

Error: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.charAt(String.java:658)
at com.df.hra_promleft.PromLeftReducer.reduce(PromLeftReducer.java:18)
at com.df.hra_promleft.PromLeftReducer.reduce(PromLeftReducer.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.YarnChild.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

Answer 1

The StringIndexOutOfBoundsException is most likely caused by this line:

char L = str.charAt(len - 1);

The reason is as follows.

Your data looks like "1 0 sales", but your code splits each row on "," (a comma):

String row = text.toString();
String [] values = row.trim().split(",");

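Assuming you are using the plain TextInputFormat, your values array will always contain just one element.

In that case the values.length == 10 check fails, DeptPromoted keeps its initial value, and because context.write sits outside the if block, the reducer's key will always be empty, i.e. "".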
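To see the split behaviour outside Hadoop, here is a minimal sketch. It assumes the input line really is whitespace-separated, as in the sample data; the class name SplitDemo and the helper fieldCount are made up for illustration.

```java
class SplitDemo {
    // Mirrors the mapper's split: trim the row, then split on commas.
    static int fieldCount(String row) {
        return row.trim().split(",").length;
    }

    public static void main(String[] args) {
        // No comma in the row, so split returns the whole row as one element.
        System.out.println(fieldCount("1 0 sales"));
        // A comma-separated row yields one element per column.
        System.out.println(fieldCount("1,0,sales"));
    }
}
```

With a single element in the array, the length check in the mapper can never pass, so the key stays at its initial empty value.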

So calling

int len = str.length(); //7
char L = str.charAt(len - 1);

should throw a StringIndexOutOfBoundsException.

Sample code:

String s = "";
int length = s.length();        // 0
char c = s.charAt(length - 1);  // charAt(-1) throws StringIndexOutOfBoundsException

My suggestion is to make the appropriate changes to the code and add the necessary checks.
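One possible shape for those checks on the mapper side is sketched below, kept free of Hadoop types so it can be run on its own. The class RowParser, its parse method, and the header string in the usage note are hypothetical; the column positions (6, 7, 8) and the 10-column check are taken from the question's mapper. The idea is to return nothing for the header or a malformed row, instead of falling through with an empty key.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.Optional;

class RowParser {
    // Returns the (key, value) pair to emit, or empty if the row should be skipped.
    static Optional<Map.Entry<String, Integer>> parse(String row, String header) {
        String[] values = row.trim().split(",");
        if (values.length != 10 || header.equals(row)) {
            return Optional.empty();               // header or malformed row: emit nothing
        }
        try {
            int left = Integer.parseInt(values[6].trim());
            int promotion = Integer.parseInt(values[7].trim());
            String key = values[8].trim() + "-" + promotion;   // e.g. sales-0
            Map.Entry<String, Integer> pair = new SimpleEntry<>(key, left);
            return Optional.of(pair);
        } catch (NumberFormatException e) {
            return Optional.empty();               // non-numeric flag: skip the row
        }
    }
}
```

In the mapper, context.write(new Text(pair.getKey()), new IntWritable(pair.getValue())) would then be called only when parse returns a value, so the reducer never receives an empty key from skipped rows.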

Answer 2

String str = "";
str = key.toString();   //sales-0    
int len = str.length(); //7   
char L = str.charAt(len - 1);

If key is an empty Text, then len = 0, so str.charAt(0 - 1) is str.charAt(-1), which throws a StringIndexOutOfBoundsException. So please check whether the Text key is empty.
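A minimal sketch of such a check on the reducer side, written as plain Java so it runs without Hadoop. The class KeyCheck and the method isPromoted are illustrative names, not part of the original code; in the reducer the argument would be key.toString().

```java
class KeyCheck {
    // True only for non-empty keys whose last character is '1', e.g. "sales-1".
    static boolean isPromoted(String key) {
        return !key.isEmpty() && key.charAt(key.length() - 1) == '1';
    }

    public static void main(String[] args) {
        System.out.println(isPromoted("sales-1"));
        System.out.println(isPromoted("sales-0"));
        System.out.println(isPromoted(""));   // empty key is rejected before charAt is reached
    }
}
```

Because the emptiness test comes first and && short-circuits, charAt is never called with index -1.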


java

mapreduce