Spark custom Hadoop input format Java generics error
While evaluating Spark for reuse of our existing custom input formats from the MapReduce era, I have run into a Java generics problem.
import com.google.protobuf.Message;
import com.twitter.elephantbird.mapreduce.io.ProtobufWritable;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
public abstract class AbstractInputFormat<K extends Message, V> extends FileInputFormat<ProtobufWritable<K>, V>
...
import com.example.MyProto; // this extends Message
import org.apache.hadoop.io.Text;
public class MyInputFormat extends AbstractInputFormat<MyProto, Text>
...
SparkConf conf = new SparkConf().setAppName("Test");
SparkContext sc = new SparkContext(conf);
JavaSparkContext jsc = JavaSparkContext.fromSparkContext(sc);
JavaPairRDD myRdd = jsc.newAPIHadoopFile(logFile, MyInputFormat.class, ProtobufWritable.class, Text.class,
Job.getInstance().getConfiguration());
The above leads to the following error on myRdd:
Bound mismatch: The generic method newAPIHadoopFile(String, Class<F>, Class<K>, Class<V>, Configuration) of type JavaSparkContext is not applicable for the arguments (String, Class<MyInputFormat>, Class<ProtobufWritable>, Class<Text>, Configuration). The inferred type MyInputFormat is not a valid substitute for the bounded parameter <F extends InputFormat<K,V>>
Not sure what is happening. To me it looks like I am satisfying the bounds, yet I cannot spot the problem.
This is the Scala code that is being called.
The following change worked for me:
public class MyInputFormat<K extends Message> extends AbstractInputFormat<MyProto, Text>
public abstract class AbstractInputFormat<K extends Message, V> extends FileInputFormat<ProtobufWritable<K>, V>
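My understanding of why the original declaration trips the bound: ProtobufWritable.class is a raw class literal, so K is inferred as the raw type ProtobufWritable, and since generics are invariant, an InputFormat<ProtobufWritable<MyProto>, Text> is not an InputFormat<ProtobufWritable, Text>. The following Spark-free sketch shows the same mismatch; Box, Proto, Format, and open are invented names used purely for illustration.

class Box<T> {}                                       // stands in for ProtobufWritable<K>
class Proto {}                                        // stands in for MyProto
abstract class Format<K, V> {}                        // stands in for InputFormat<K, V>
class MyFormat extends Format<Box<Proto>, String> {}  // shaped like the original MyInputFormat

public class BoundMismatchRepro {
    // Same shape as newAPIHadoopFile's type parameters: F extends Format<K, V>.
    static <K, V, F extends Format<K, V>> void open(Class<F> f, Class<K> k, Class<V> v) {
    }

    public static void main(String[] args) {
        // Does not compile: Box.class is Class<Box> (raw), so K is inferred as the
        // raw type Box, and MyFormat is a Format<Box<Proto>, String>, which is not
        // a Format<Box, String> because type arguments are invariant.
        // open(MyFormat.class, Box.class, String.class);
    }
}

As far as I can tell, giving the format class its own type parameter (as in the change above) makes its class literal raw as well, so the bound is accepted with an unchecked warning instead of being rejected; treat this as my reading of the behaviour rather than a definitive explanation.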
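For completeness, here is a minimal sketch of how the changed declarations line up with the call site from the question. com.example.MyProto, the stubbed record reader, and the input path are placeholders carried over from the question, so treat this as an illustration of the generics fix rather than a working input format.

import com.example.MyProto;                       // placeholder protobuf message from the question
import com.google.protobuf.Message;
import com.twitter.elephantbird.mapreduce.io.ProtobufWritable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Unchanged base class from the question (package-private here only so the sketch fits in one file).
abstract class AbstractInputFormat<K extends Message, V>
        extends FileInputFormat<ProtobufWritable<K>, V> {
}

// The fix: the (otherwise unused) type parameter K makes MyInputFormat.class a raw
// class literal, so the F extends InputFormat<K, V> bound no longer has to reconcile
// ProtobufWritable<MyProto> with the raw ProtobufWritable.
class MyInputFormat<K extends Message> extends AbstractInputFormat<MyProto, Text> {
    @Override
    public RecordReader<ProtobufWritable<MyProto>, Text> createRecordReader(
            InputSplit split, TaskAttemptContext context) {
        throw new UnsupportedOperationException("reader omitted -- not relevant to the generics issue");
    }
}

public class Test {
    public static void main(String[] args) throws Exception {
        String logFile = args[0];                 // input path, as in the question

        SparkConf conf = new SparkConf().setAppName("Test");
        JavaSparkContext jsc = new JavaSparkContext(conf);
        Configuration hadoopConf = Job.getInstance().getConfiguration();

        // This now compiles; myRdd is left raw, exactly as in the question.
        JavaPairRDD myRdd = jsc.newAPIHadoopFile(
                logFile, MyInputFormat.class, ProtobufWritable.class, Text.class, hadoopConf);

        System.out.println(myRdd.count());
        jsc.stop();
    }
}

The key and value class literals stay raw, as in the original call; the only change needed was the extra type parameter on MyInputFormat.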