Compare two Spark schemas in Java: unable to cast Seq<StructField> to List<StructField>
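Problem: I want to get the fields common to two schemas, in DDL format.
I have the following working Scala code that computes the intersection of the schemas: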
val diff = df1.schema.intersect(df2.schema)
val sb = new StringBuilder();
diff.toStream.foreach(x => sb.append( x.toDDL + ", "))
But I am running into casting issues when converting this piece to Java:
StructType s1 = new StructType().add("col1",StringType)
.add("col2",StringType)
.add("col3",StringType)
.add("col4",StringType);
StructType s2 = new StructType().add("col1",StringType)
.add("col4",StringType);
System.out.println("Output :" + s1.toList().intersect(s2.toList()));
Output :List(StructField(col1,StringType,true), StructField(col4,StringType,true))
I am unable to convert this output to DDL. I tried reading the object above as a Seq, but it fails with a compilation error:
Seq<StructField> result = s1.toList().intersect(s2.toList());
Error: java: incompatible types: java.lang.Object cannot be converted to scala.collection.Seq<org.apache.spark.sql.types.StructField>
Another attempt:
StringBuilder sb = new StringBuilder();
s1.toList().intersect(s2.toList()).foreach( (schema) -> sb.append(schema.toDDL() + ","));
Error:(81, 39) java: cannot find symbol
symbol: method foreach((schema)->[...] ","))
location: class java.lang.Object
Any pointers on how to read this as a List<StructField> so that I can convert it to DDL?
The only way I know of is to use JavaConversions, like so:
Object something = s1.toList().intersect(s2.toList());
List<StructField> result = JavaConversions.seqAsJavaList((Seq<StructField>)something);
System.out.println("Output :" + result);
...which prints
Output :[StructField(col1,StringType,true), StructField(col4,StringType,true)]
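To go from there to the DDL string the question asks for, one option is to skip the Scala collection API altogether and work on the StructField[] array that StructType.fields() returns. This also avoids JavaConversions, which is deprecated since Scala 2.12. The following is only a sketch under those assumptions: SchemaDdl and commonDdl are hypothetical names, and the helper is kept generic so it compiles without Spark on the classpath. It relies on StructField being a case class, so equals and hashCode compare by value.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Spark.
final class SchemaDdl {
    /**
     * Renders every element of `left` that also occurs in `right`
     * (compared with equals/hashCode), in `left` order, joined with ", ".
     */
    static <T> String commonDdl(T[] left, T[] right, Function<? super T, String> toDdl) {
        Set<T> rightSet = new HashSet<>(Arrays.asList(right));
        return Arrays.stream(left)
                .filter(rightSet::contains)
                .map(toDdl)
                .collect(Collectors.joining(", "));
    }

    // Assumed Spark usage (needs spark-sql on the classpath):
    //   String ddl = SchemaDdl.commonDdl(s1.fields(), s2.fields(), StructField::toDDL);
}
```

With s1 and s2 from the question, this should keep col1 and col4 in the order they appear in s1. The exact text of each entry depends on what StructField.toDDL produces in your Spark version. Unlike the Scala snippet, no trailing ", " is left at the end of the string.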