How do I save a Java bean using the Spark Cassandra Connector?
I've read the Spark documentation, but I'm not sure how to save a Java bean to a table using the Spark Cassandra Connector.
public class NewImageMetadataRow implements Serializable {

    private final String merchant;
    private final String productId;
    private final String url;
    private final int width;
    private final int height;

    public NewImageMetadataRow(String merchant, String productId, String url,
                               int width, int height) {
        this.merchant = merchant;
        this.productId = productId;
        this.url = url;
        this.width = width;
        this.height = height;
    }

    public NewImageMetadataRow(NewImageMetadataRow row) {
        this.merchant = row.getMerchant();
        this.productId = row.getProductId();
        this.url = row.getUrl();
        this.width = row.getWidth();
        this.height = row.getHeight();
    }

    public String getMerchant() {
        return merchant;
    }

    public String getProductId() {
        return productId;
    }

    public String getUrl() {
        return url;
    }

    public int getWidth() {
        return width;
    }

    public int getHeight() {
        return height;
    }
}
I have an RDD of type RDD[NewImageMetadataRow] and I'm trying to save it like this:
myRDD.saveToCassandra(keyspace, "imagemetadatav3", SomeColumns("merchant", "productid", "url"))
This results in the following error:
java.lang.IllegalArgumentException: requirement failed: Columns not found in com.mridang.image.NewImageMetadataRow: [merchant, productid, url]
at scala.Predef$.require(Predef.scala:281)
at com.datastax.spark.connector.mapper.DefaultColumnMapper.columnMapForWriting(DefaultColumnMapper.scala:106)
at com.datastax.spark.connector.mapper.MappedToGettableDataConverter$$anon.<init>(MappedToGettableDataConverter.scala:35)
at com.datastax.spark.connector.mapper.MappedToGettableDataConverter$.apply(MappedToGettableDataConverter.scala:26)
at com.datastax.spark.connector.writer.DefaultRowWriter.<init>(DefaultRowWriter.scala:16)
at com.datastax.spark.connector.writer.DefaultRowWriter$$anon.rowWriter(DefaultRowWriter.scala:30)
at com.datastax.spark.connector.writer.DefaultRowWriter$$anon.rowWriter(DefaultRowWriter.scala:28)
at com.datastax.spark.connector.writer.TableWriter$.apply(TableWriter.scala:423)
at com.datastax.spark.connector.RDDFunctions.saveToCassandra(RDDFunctions.scala:35)
From what I understand (and with my poor Scala foo), the connector can't infer the property names from the Java bean.
The other problem is that the column names in my table are all lowercase, with spaces and hyphens removed, i.e. the getter getProductId corresponds to the Cassandra column productid.
(If I were using Jackson, I could simply add a @JsonProperty annotation. I'm wondering whether I can do the same with the Cassandra mapper.)
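If I understand the connector's DefaultColumnMapper (the class in the stack trace) correctly, it matches a bean property against a column named either exactly like the property or like its underscore-separated form. Here is a rough, self-contained sketch of that conversion — an illustration of the naming convention, not the connector's actual code:

```java
public class ColumnNameDemo {

    // Approximation of the camelCase -> snake_case variant the default
    // mapper tries when the property name itself doesn't match a column.
    public static String camelToUnderscore(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (Character.isUpperCase(c)) {
                sb.append('_').append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "productId" would match a column named "productId" or
        // "product_id" -- but my table's column is "productid", which
        // matches neither, hence "Columns not found".
        System.out.println(camelToUnderscore("productId")); // product_id
    }
}
```

This would explain why "merchant" and "url" aren't found either once the mapper fails the requirement check for the whole column list.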
It took a while, but this is what I ended up with:
val columns: RowWriterFactory[NewImageMetadataRow] =
  CassandraJavaUtil.mapToRow(classOf[NewImageMetadataRow])

myRDD.saveToCassandra(keyspace, "imagemetadatav3")(CassandraConnector(sc), columns)
The fields in the bean need to be public and annotated with @CqlName:
@CqlName("merchant")
public final String merchant;
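To see why public annotated fields work where getters didn't, here is a self-contained sketch of annotation-driven column mapping. CqlNameDemo is a stand-in annotation defined locally so the example compiles without the DataStax driver on the classpath — it is not the real @CqlName, and the real mapper's internals differ; this only shows the mechanism of reading the column name from the annotation instead of guessing it from a getter:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

public class MappingDemo {

    // Stand-in for @CqlName, defined here only for the sketch.
    @Retention(RetentionPolicy.RUNTIME)
    @interface CqlNameDemo {
        String value();
    }

    public static class Row {
        @CqlNameDemo("merchant")
        public final String merchant;

        @CqlNameDemo("productid")
        public final String productId;

        public Row(String merchant, String productId) {
            this.merchant = merchant;
            this.productId = productId;
        }
    }

    // Roughly what an annotation-based mapper does: take the column name
    // from the annotation on each public field, so the field name and the
    // column name no longer have to follow any naming convention.
    public static Map<String, Object> columnsOf(Object bean)
            throws IllegalAccessException {
        Map<String, Object> columns = new LinkedHashMap<>();
        for (Field f : bean.getClass().getFields()) {
            CqlNameDemo ann = f.getAnnotation(CqlNameDemo.class);
            if (ann != null) {
                columns.put(ann.value(), f.get(bean));
            }
        }
        return columns;
    }

    public static void main(String[] args) throws IllegalAccessException {
        System.out.println(columnsOf(new Row("acme", "123")));
    }
}
```

The fields must be public because this style of mapping reflects over getClass().getFields(), which only sees public members — which matches the requirement above that the bean's fields be public.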