如何在使用 Google Cloud Dataflow 清除 Cloud Memorystore 中的缓存后插入数据?
How to insert data after clearing cache in Cloud Memorystore using Google Cloud Dataflow?
我正在做一个任务,如果dataflow要处理的输入文件有数据,就清除memorystore的缓存。这意味着,如果输入文件没有记录,则不会刷新 memorystore,但输入文件即使有一条记录,也应刷新 memorystore,然后处理输入文件。
我的数据流应用程序是一个多管道应用程序,它读取、处理数据,然后将数据存储在内存库中。管道正在成功执行。然而,内存存储的刷新工作正常,但刷新后,插入没有发生。
我写了一个函数,在检查输入文件是否有记录后刷新 memorystore。
FlushingMemorystore.java
package com.click.example.functions;
import afu.org.checkerframework.checker.nullness.qual.Nullable;
import com.google.auto.value.AutoValue;
import org.apache.beam.sdk.io.redis.RedisConnectionConfiguration;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PDone;
import org.apache.beam.vendor.grpc.v1p26p0.com.google.common.base.Preconditions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
public class FlushingMemorystore {
private static final Logger LOGGER = LoggerFactory.getLogger(FlushingMemorystore.class);
public static FlushingMemorystore.Read read() {
return (new AutoValue_FlushingMemorystore_Read.Builder())
.setConnectionConfiguration(RedisConnectionConfiguration.create()).build();
}
@AutoValue
public abstract static class Read extends PTransform<PCollection<Long>, PDone> {
public Read() {
}
@Nullable
abstract RedisConnectionConfiguration connectionConfiguration();
@Nullable
abstract Long expireTime();
abstract FlushingMemorystore.Read.Builder toBuilder();
public FlushingMemorystore.Read withEndpoint(String host, int port) {
Preconditions.checkArgument(host != null, "host cannot be null");
Preconditions.checkArgument(port > 0, "port cannot be negative or 0");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withHost(host).withPort(port)).build();
}
public FlushingMemorystore.Read withAuth(String auth) {
Preconditions.checkArgument(auth != null, "auth cannot be null");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withAuth(auth)).build();
}
public FlushingMemorystore.Read withTimeout(int timeout) {
Preconditions.checkArgument(timeout >= 0, "timeout cannot be negative");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withTimeout(timeout)).build();
}
public FlushingMemorystore.Read withConnectionConfiguration(RedisConnectionConfiguration connectionConfiguration) {
Preconditions.checkArgument(connectionConfiguration != null, "connection cannot be null");
return this.toBuilder().setConnectionConfiguration(connectionConfiguration).build();
}
public FlushingMemorystore.Read withExpireTime(Long expireTimeMillis) {
Preconditions.checkArgument(expireTimeMillis != null, "expireTimeMillis cannot be null");
Preconditions.checkArgument(expireTimeMillis > 0L, "expireTimeMillis cannot be negative or 0");
return this.toBuilder().setExpireTime(expireTimeMillis).build();
}
public PDone expand(PCollection<Long> input) {
Preconditions.checkArgument(this.connectionConfiguration() != null, "withConnectionConfiguration() is required");
input.apply(ParDo.of(new FlushingMemorystore.Read.ReadFn(this)));
return PDone.in(input.getPipeline());
}
private static class ReadFn extends DoFn<Long, String> {
private static final int DEFAULT_BATCH_SIZE = 1000;
private final FlushingMemorystore.Read spec;
private transient Jedis jedis;
private transient Pipeline pipeline;
private int batchCount;
public ReadFn(FlushingMemorystore.Read spec) {
this.spec = spec;
}
@Setup
public void setup() {
this.jedis = this.spec.connectionConfiguration().connect();
}
@StartBundle
public void startBundle() {
this.pipeline = this.jedis.pipelined();
this.pipeline.multi();
this.batchCount = 0;
}
@ProcessElement
public void processElement(DoFn<Long, String>.ProcessContext c) {
Long count = c.element();
batchCount++;
if(count==null && count < 0) {
LOGGER.info("No Records are there in the input file");
} else {
if (pipeline.isInMulti()) {
pipeline.exec();
pipeline.sync();
jedis.flushDB();
}
LOGGER.info("*****The memorystore is flushed*****");
}
}
@FinishBundle
public void finishBundle() {
if (this.pipeline.isInMulti()) {
this.pipeline.exec();
this.pipeline.sync();
}
this.batchCount=0;
}
@Teardown
public void teardown() {
this.jedis.close();
}
}
@AutoValue.Builder
abstract static class Builder {
Builder() {
}
abstract FlushingMemorystore.Read.Builder setExpireTime(Long expireTimeMillis);
abstract FlushingMemorystore.Read build();
abstract FlushingMemorystore.Read.Builder setConnectionConfiguration(RedisConnectionConfiguration connectionConfiguration);
}
}
}
我在 Starter Pipeline 代码中使用该函数。
正在使用函数的起始管道代码片段:
StorageToRedisOptions options = PipelineOptionsFactory.fromArgs(args)
.withValidation()
.as(StorageToRedisOptions.class);
Pipeline p = Pipeline.create(options);
PCollection<String> lines = p.apply(
"ReadLines", TextIO.read().from(options.getInputFile()));
/**
* Flushing the Memorystore if there are records in the input file
*/
lines.apply("Checking Data in input file", Count.globally())
.apply("Flushing the data store", FlushingMemorystore.read()
.withConnectionConfiguration(RedisConnectionConfiguration
.create(options.getRedisHost(), options.getRedisPort())));
清除缓存后插入处理数据的代码片段:
dataset.apply(SOME_DATASET_TRANSFORMATION, RedisIO.write()
.withMethod(RedisIO.Write.Method.SADD)
.withConnectionConfiguration(RedisConnectionConfiguration
.create(options.getRedisHost(), options.getRedisPort())));
数据流执行正常,它也刷新内存存储,但之后插入不起作用。你能指出我哪里出错了吗?
真正感谢任何解决问题的解决方案。提前致谢!
编辑:
根据评论中的要求提供额外信息
使用的运行时是 Java11,它使用 Apache Beam SDK for 2.24.0
如果输入文件有记录,它会用一些逻辑处理数据。例如,如果输入文件包含如下数据:
abcabc|Bruce|Wayne|2000
abbabb|Tony|Stark|3423
数据流会统计本例中记录的条数为2,并根据逻辑处理id、first name等,然后存储到memorystore中。该输入文件每天都会出现,因此,如果输入文件有记录,则应清除(或刷新)内存。
虽然管道没有中断,但我想我漏掉了一些东西。
我怀疑这里的问题是您需要确保“刷新”步骤在 RedisIO.write 步骤发生之前运行(并完成)。 Beam 有一个 Wait.on 转换,您可以使用它。
为此,我们可以使用刷新 PTransform 的输出作为我们已刷新数据库的信号 - 我们仅在完成刷新后写入数据库。 process
调用您的冲洗 DoFn 将如下所示:
@ProcessElement
public void processElement(DoFn<Long, String>.ProcessContext c) {
Long count = c.element();
if(count==null && count < 0) {
LOGGER.info("No Records are there in the input file");
} else {
if (pipeline.isInMulti()) {
pipeline.exec();
pipeline.sync();
jedis.flushDB();
}
LOGGER.info("*****The memorystore is flushed*****");
}
c.output("READY");
}
一旦我们收到指示数据库已刷新的信号,我们就可以在向其写入新数据之前使用它来等待:
Pipeline p = Pipeline.create(options);
PCollection<String> lines = p.apply(
"ReadLines", TextIO.read().from(options.getInputFile()));
/**
* Flushing the Memorystore if there are records in the input file
*/
PCollection<String> flushedSignal = lines
.apply("Checking Data in input file", Count.globally())
.apply("Flushing the data store", FlushingMemorystore.read()
.withConnectionConfiguration(RedisConnectionConfiguration
.create(options.getRedisHost(), options.getRedisPort())));
// Then we use the flushing signal to start writing to Redis:
dataset
.apply(Wait.on(flushedSignal))
.apply(SOME_DATASET_TRANSFORMATION, RedisIO.write()
.withMethod(RedisIO.Write.Method.SADD)
.withConnectionConfiguration(RedisConnectionConfiguration
.create(options.getRedisHost(), options.getRedisPort())));
在我应用 Wait.on 转换后问题得到解决,因为 Pablo 的回答已经解释过了。但是,我不得不将我的 FlushingMemorystore.java 稍微重写为 PCollection 以获取 flushSignal 标志。
函数如下:
package com.click.example.functions;
import afu.org.checkerframework.checker.nullness.qual.Nullable;
import com.google.auto.value.AutoValue;
import org.apache.beam.sdk.io.redis.RedisConnectionConfiguration;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.vendor.grpc.v1p26p0.com.google.common.base.Preconditions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
public class FlushingMemorystore extends DoFn<Long, String> {
private static final Logger LOGGER = LoggerFactory.getLogger(FlushingMemorystore.class);
public static FlushingMemorystore.Read read() {
return (new AutoValue_FlushingMemorystore_Read.Builder())
.setConnectionConfiguration(RedisConnectionConfiguration.create()).build();
}
@AutoValue
public abstract static class Read extends PTransform<PCollection<Long>, PCollection<String>> {
public Read() {
}
@Nullable
abstract RedisConnectionConfiguration connectionConfiguration();
@Nullable
abstract Long expireTime();
abstract FlushingMemorystore.Read.Builder toBuilder();
public FlushingMemorystore.Read withEndpoint(String host, int port) {
Preconditions.checkArgument(host != null, "host cannot be null");
Preconditions.checkArgument(port > 0, "port cannot be negative or 0");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withHost(host).withPort(port)).build();
}
public FlushingMemorystore.Read withAuth(String auth) {
Preconditions.checkArgument(auth != null, "auth cannot be null");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withAuth(auth)).build();
}
public FlushingMemorystore.Read withTimeout(int timeout) {
Preconditions.checkArgument(timeout >= 0, "timeout cannot be negative");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withTimeout(timeout)).build();
}
public FlushingMemorystore.Read withConnectionConfiguration(RedisConnectionConfiguration connectionConfiguration) {
Preconditions.checkArgument(connectionConfiguration != null, "connection cannot be null");
return this.toBuilder().setConnectionConfiguration(connectionConfiguration).build();
}
public FlushingMemorystore.Read withExpireTime(Long expireTimeMillis) {
Preconditions.checkArgument(expireTimeMillis != null, "expireTimeMillis cannot be null");
Preconditions.checkArgument(expireTimeMillis > 0L, "expireTimeMillis cannot be negative or 0");
return this.toBuilder().setExpireTime(expireTimeMillis).build();
}
public PCollection<String> expand(PCollection<Long> input) {
Preconditions.checkArgument(this.connectionConfiguration() != null, "withConnectionConfiguration() is required");
return input.apply(ParDo.of(new FlushingMemorystore.Read.ReadFn(this)));
}
@Setup
public Jedis setup() {
return this.connectionConfiguration().connect();
}
private static class ReadFn extends DoFn<Long, String> {
private static final int DEFAULT_BATCH_SIZE = 1000;
private final FlushingMemorystore.Read spec;
private transient Jedis jedis;
private transient Pipeline pipeline;
private int batchCount;
public ReadFn(FlushingMemorystore.Read spec) {
this.spec = spec;
}
@Setup
public void setup() {
this.jedis = this.spec.connectionConfiguration().connect();
}
@StartBundle
public void startBundle() {
this.pipeline = this.jedis.pipelined();
this.pipeline.multi();
this.batchCount = 0;
}
@ProcessElement
public void processElement(@Element Long count, OutputReceiver<String> out) {
batchCount++;
if(count!=null && count > 0) {
if (pipeline.isInMulti()) {
pipeline.exec();
pipeline.sync();
jedis.flushDB();
LOGGER.info("*****The memorystore is flushed*****");
}
out.output("SUCCESS");
} else {
LOGGER.info("No Records are there in the input file");
out.output("FAILURE");
}
}
@FinishBundle
public void finishBundle() {
if (this.pipeline.isInMulti()) {
this.pipeline.exec();
this.pipeline.sync();
}
this.batchCount=0;
}
@Teardown
public void teardown() {
this.jedis.close();
}
}
@AutoValue.Builder
abstract static class Builder {
Builder() {
}
abstract FlushingMemorystore.Read.Builder setExpireTime(Long expireTimeMillis);
abstract FlushingMemorystore.Read build();
abstract FlushingMemorystore.Read.Builder setConnectionConfiguration(RedisConnectionConfiguration connectionConfiguration);
}
}
}
我正在做一个任务,如果dataflow要处理的输入文件有数据,就清除memorystore的缓存。这意味着,如果输入文件没有记录,则不会刷新 memorystore,但输入文件即使有一条记录,也应刷新 memorystore,然后处理输入文件。
我的数据流应用程序是一个多管道应用程序,它读取、处理数据,然后将数据存储在内存库中。管道正在成功执行。然而,内存存储的刷新工作正常,但刷新后,插入没有发生。
我写了一个函数,在检查输入文件是否有记录后刷新 memorystore。
FlushingMemorystore.java
package com.click.example.functions;
import afu.org.checkerframework.checker.nullness.qual.Nullable;
import com.google.auto.value.AutoValue;
import org.apache.beam.sdk.io.redis.RedisConnectionConfiguration;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PDone;
import org.apache.beam.vendor.grpc.v1p26p0.com.google.common.base.Preconditions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
public class FlushingMemorystore {
private static final Logger LOGGER = LoggerFactory.getLogger(FlushingMemorystore.class);
public static FlushingMemorystore.Read read() {
return (new AutoValue_FlushingMemorystore_Read.Builder())
.setConnectionConfiguration(RedisConnectionConfiguration.create()).build();
}
@AutoValue
public abstract static class Read extends PTransform<PCollection<Long>, PDone> {
public Read() {
}
@Nullable
abstract RedisConnectionConfiguration connectionConfiguration();
@Nullable
abstract Long expireTime();
abstract FlushingMemorystore.Read.Builder toBuilder();
public FlushingMemorystore.Read withEndpoint(String host, int port) {
Preconditions.checkArgument(host != null, "host cannot be null");
Preconditions.checkArgument(port > 0, "port cannot be negative or 0");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withHost(host).withPort(port)).build();
}
public FlushingMemorystore.Read withAuth(String auth) {
Preconditions.checkArgument(auth != null, "auth cannot be null");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withAuth(auth)).build();
}
public FlushingMemorystore.Read withTimeout(int timeout) {
Preconditions.checkArgument(timeout >= 0, "timeout cannot be negative");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withTimeout(timeout)).build();
}
public FlushingMemorystore.Read withConnectionConfiguration(RedisConnectionConfiguration connectionConfiguration) {
Preconditions.checkArgument(connectionConfiguration != null, "connection cannot be null");
return this.toBuilder().setConnectionConfiguration(connectionConfiguration).build();
}
public FlushingMemorystore.Read withExpireTime(Long expireTimeMillis) {
Preconditions.checkArgument(expireTimeMillis != null, "expireTimeMillis cannot be null");
Preconditions.checkArgument(expireTimeMillis > 0L, "expireTimeMillis cannot be negative or 0");
return this.toBuilder().setExpireTime(expireTimeMillis).build();
}
public PDone expand(PCollection<Long> input) {
Preconditions.checkArgument(this.connectionConfiguration() != null, "withConnectionConfiguration() is required");
input.apply(ParDo.of(new FlushingMemorystore.Read.ReadFn(this)));
return PDone.in(input.getPipeline());
}
private static class ReadFn extends DoFn<Long, String> {
private static final int DEFAULT_BATCH_SIZE = 1000;
private final FlushingMemorystore.Read spec;
private transient Jedis jedis;
private transient Pipeline pipeline;
private int batchCount;
public ReadFn(FlushingMemorystore.Read spec) {
this.spec = spec;
}
@Setup
public void setup() {
this.jedis = this.spec.connectionConfiguration().connect();
}
@StartBundle
public void startBundle() {
this.pipeline = this.jedis.pipelined();
this.pipeline.multi();
this.batchCount = 0;
}
@ProcessElement
public void processElement(DoFn<Long, String>.ProcessContext c) {
Long count = c.element();
batchCount++;
if(count==null && count < 0) {
LOGGER.info("No Records are there in the input file");
} else {
if (pipeline.isInMulti()) {
pipeline.exec();
pipeline.sync();
jedis.flushDB();
}
LOGGER.info("*****The memorystore is flushed*****");
}
}
@FinishBundle
public void finishBundle() {
if (this.pipeline.isInMulti()) {
this.pipeline.exec();
this.pipeline.sync();
}
this.batchCount=0;
}
@Teardown
public void teardown() {
this.jedis.close();
}
}
@AutoValue.Builder
abstract static class Builder {
Builder() {
}
abstract FlushingMemorystore.Read.Builder setExpireTime(Long expireTimeMillis);
abstract FlushingMemorystore.Read build();
abstract FlushingMemorystore.Read.Builder setConnectionConfiguration(RedisConnectionConfiguration connectionConfiguration);
}
}
}
我在 Starter Pipeline 代码中使用该函数。
正在使用函数的起始管道代码片段:
StorageToRedisOptions options = PipelineOptionsFactory.fromArgs(args)
.withValidation()
.as(StorageToRedisOptions.class);
Pipeline p = Pipeline.create(options);
PCollection<String> lines = p.apply(
"ReadLines", TextIO.read().from(options.getInputFile()));
/**
* Flushing the Memorystore if there are records in the input file
*/
lines.apply("Checking Data in input file", Count.globally())
.apply("Flushing the data store", FlushingMemorystore.read()
.withConnectionConfiguration(RedisConnectionConfiguration
.create(options.getRedisHost(), options.getRedisPort())));
清除缓存后插入处理数据的代码片段:
dataset.apply(SOME_DATASET_TRANSFORMATION, RedisIO.write()
.withMethod(RedisIO.Write.Method.SADD)
.withConnectionConfiguration(RedisConnectionConfiguration
.create(options.getRedisHost(), options.getRedisPort())));
数据流执行正常,它也刷新内存存储,但之后插入不起作用。你能指出我哪里出错了吗? 真正感谢任何解决问题的解决方案。提前致谢!
编辑:
根据评论中的要求提供额外信息
使用的运行时是 Java11,它使用 Apache Beam SDK for 2.24.0
如果输入文件有记录,它会用一些逻辑处理数据。例如,如果输入文件包含如下数据:
abcabc|Bruce|Wayne|2000
abbabb|Tony|Stark|3423
数据流会统计本例中记录的条数为2,并根据逻辑处理id、first name等,然后存储到memorystore中。该输入文件每天都会出现,因此,如果输入文件有记录,则应清除(或刷新)内存。
虽然管道没有中断,但我想我漏掉了一些东西。
我怀疑这里的问题是您需要确保“刷新”步骤在 RedisIO.write 步骤发生之前运行(并完成)。 Beam 有一个 Wait.on 转换,您可以使用它。
为此,我们可以使用刷新 PTransform 的输出作为我们已刷新数据库的信号 - 我们仅在完成刷新后写入数据库。 process
调用您的冲洗 DoFn 将如下所示:
@ProcessElement
public void processElement(DoFn<Long, String>.ProcessContext c) {
Long count = c.element();
if(count==null && count < 0) {
LOGGER.info("No Records are there in the input file");
} else {
if (pipeline.isInMulti()) {
pipeline.exec();
pipeline.sync();
jedis.flushDB();
}
LOGGER.info("*****The memorystore is flushed*****");
}
c.output("READY");
}
一旦我们收到指示数据库已刷新的信号,我们就可以在向其写入新数据之前使用它来等待:
Pipeline p = Pipeline.create(options);
PCollection<String> lines = p.apply(
"ReadLines", TextIO.read().from(options.getInputFile()));
/**
* Flushing the Memorystore if there are records in the input file
*/
PCollection<String> flushedSignal = lines
.apply("Checking Data in input file", Count.globally())
.apply("Flushing the data store", FlushingMemorystore.read()
.withConnectionConfiguration(RedisConnectionConfiguration
.create(options.getRedisHost(), options.getRedisPort())));
// Then we use the flushing signal to start writing to Redis:
dataset
.apply(Wait.on(flushedSignal))
.apply(SOME_DATASET_TRANSFORMATION, RedisIO.write()
.withMethod(RedisIO.Write.Method.SADD)
.withConnectionConfiguration(RedisConnectionConfiguration
.create(options.getRedisHost(), options.getRedisPort())));
在我应用 Wait.on 转换后问题得到解决,因为 Pablo 的回答已经解释过了。但是,我不得不将我的 FlushingMemorystore.java 稍微重写为 PCollection 以获取 flushSignal 标志。
函数如下:
package com.click.example.functions;
import afu.org.checkerframework.checker.nullness.qual.Nullable;
import com.google.auto.value.AutoValue;
import org.apache.beam.sdk.io.redis.RedisConnectionConfiguration;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.vendor.grpc.v1p26p0.com.google.common.base.Preconditions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
public class FlushingMemorystore extends DoFn<Long, String> {
private static final Logger LOGGER = LoggerFactory.getLogger(FlushingMemorystore.class);
public static FlushingMemorystore.Read read() {
return (new AutoValue_FlushingMemorystore_Read.Builder())
.setConnectionConfiguration(RedisConnectionConfiguration.create()).build();
}
@AutoValue
public abstract static class Read extends PTransform<PCollection<Long>, PCollection<String>> {
public Read() {
}
@Nullable
abstract RedisConnectionConfiguration connectionConfiguration();
@Nullable
abstract Long expireTime();
abstract FlushingMemorystore.Read.Builder toBuilder();
public FlushingMemorystore.Read withEndpoint(String host, int port) {
Preconditions.checkArgument(host != null, "host cannot be null");
Preconditions.checkArgument(port > 0, "port cannot be negative or 0");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withHost(host).withPort(port)).build();
}
public FlushingMemorystore.Read withAuth(String auth) {
Preconditions.checkArgument(auth != null, "auth cannot be null");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withAuth(auth)).build();
}
public FlushingMemorystore.Read withTimeout(int timeout) {
Preconditions.checkArgument(timeout >= 0, "timeout cannot be negative");
return this.toBuilder().setConnectionConfiguration(this.connectionConfiguration().withTimeout(timeout)).build();
}
public FlushingMemorystore.Read withConnectionConfiguration(RedisConnectionConfiguration connectionConfiguration) {
Preconditions.checkArgument(connectionConfiguration != null, "connection cannot be null");
return this.toBuilder().setConnectionConfiguration(connectionConfiguration).build();
}
public FlushingMemorystore.Read withExpireTime(Long expireTimeMillis) {
Preconditions.checkArgument(expireTimeMillis != null, "expireTimeMillis cannot be null");
Preconditions.checkArgument(expireTimeMillis > 0L, "expireTimeMillis cannot be negative or 0");
return this.toBuilder().setExpireTime(expireTimeMillis).build();
}
public PCollection<String> expand(PCollection<Long> input) {
Preconditions.checkArgument(this.connectionConfiguration() != null, "withConnectionConfiguration() is required");
return input.apply(ParDo.of(new FlushingMemorystore.Read.ReadFn(this)));
}
@Setup
public Jedis setup() {
return this.connectionConfiguration().connect();
}
private static class ReadFn extends DoFn<Long, String> {
private static final int DEFAULT_BATCH_SIZE = 1000;
private final FlushingMemorystore.Read spec;
private transient Jedis jedis;
private transient Pipeline pipeline;
private int batchCount;
public ReadFn(FlushingMemorystore.Read spec) {
this.spec = spec;
}
@Setup
public void setup() {
this.jedis = this.spec.connectionConfiguration().connect();
}
@StartBundle
public void startBundle() {
this.pipeline = this.jedis.pipelined();
this.pipeline.multi();
this.batchCount = 0;
}
@ProcessElement
public void processElement(@Element Long count, OutputReceiver<String> out) {
batchCount++;
if(count!=null && count > 0) {
if (pipeline.isInMulti()) {
pipeline.exec();
pipeline.sync();
jedis.flushDB();
LOGGER.info("*****The memorystore is flushed*****");
}
out.output("SUCCESS");
} else {
LOGGER.info("No Records are there in the input file");
out.output("FAILURE");
}
}
@FinishBundle
public void finishBundle() {
if (this.pipeline.isInMulti()) {
this.pipeline.exec();
this.pipeline.sync();
}
this.batchCount=0;
}
@Teardown
public void teardown() {
this.jedis.close();
}
}
@AutoValue.Builder
abstract static class Builder {
Builder() {
}
abstract FlushingMemorystore.Read.Builder setExpireTime(Long expireTimeMillis);
abstract FlushingMemorystore.Read build();
abstract FlushingMemorystore.Read.Builder setConnectionConfiguration(RedisConnectionConfiguration connectionConfiguration);
}
}
}