Kinesis GetShardIterator... invalid because it did not come from this stream
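I have built a KCL plus Spark application based on
https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html
I am running it on EMR (Spark installed via bootstrap). I created a stream named sparkTest and tested that it works fine. I observed that no DynamoDB table was created.
I deleted the stream and the cluster. The next day, I created a Kinesis stream with the same name again and deployed my code on a newly launched cluster.
Now I am getting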
5/06/12 08:17:28 ERROR worker.InitializeTask: Caught exception:
com.amazonaws.services.kinesis.model.InvalidArgumentException: StartingSequenceNumber 49551532098093284204238000035066183240246145871536717826 used in GetShardIterator on shard shardId-000000000000 in stream sparkTest under account 618673372431 is invalid because it did not come from this stream. (Service: AmazonKinesis; Status Code: 400; Error Code: InvalidArgumentException; Request ID: 770ef875-10db-11e5-b24b-af6f372168ae)
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1078)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:726)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClie
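I don't understand why this happens. If I create a new Kinesis stream with a different name and then start the job, it works again.
Is there a problem with Kinesis?
There is another thread about this: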
https://github.com/awslabs/amazon-kinesis-connectors/issues/8
However, I am not setting a Kinesis application name explicitly, and I create the stream with
KinesisUtils.createStream(
jssc, streamName, endpointUrl, kinesisCheckpointInterval, InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2());
// The Spark application name set here also becomes the Kinesis application name.
SparkConf sparkConfig = new SparkConf().setAppName("arbitraryName").setMaster("local[2]");
KinesisUtils.createStream(
jssc, streamName, endpointUrl, kinesisCheckpointInterval, InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2());
If I change the name "arbitraryName", it works fine. I found this in
https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html
key points:
The application name used in the streaming context becomes the Kinesis application name
The application name must be unique for a given account and region.
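A likely reading of these key points, not confirmed anywhere in this thread: the KCL keeps its checkpoints in a DynamoDB table named after the application, so a stream that is deleted and recreated under the same application name can be handed a sequence number that belonged to the old stream. That would match the "did not come from this stream" error. The sketch below is one way to keep the name from being reused across stream incarnations. `KinesisAppNames`, `forStream`, and the `streamCreatedAt` value are hypothetical names for illustration, not part of Spark or the KCL, and it assumes you record when you recreate the stream.

```java
import java.util.Objects;

/**
 * Hypothetical helper (not part of Spark or the KCL): builds an application
 * name that changes whenever the stream is recreated, so old checkpoint
 * state is not picked up by the new stream.
 */
final class KinesisAppNames {
    private KinesisAppNames() {}

    /**
     * @param baseName        logical job name, e.g. "arbitraryName"
     * @param streamName      Kinesis stream name, e.g. "sparkTest"
     * @param streamCreatedAt a value you record yourself each time you
     *                        (re)create the stream, e.g. epoch millis
     */
    static String forStream(String baseName, String streamName, long streamCreatedAt) {
        Objects.requireNonNull(baseName, "baseName");
        Objects.requireNonNull(streamName, "streamName");
        String raw = baseName + "-" + streamName + "-" + streamCreatedAt;
        // The application name is used as a DynamoDB table name, and DynamoDB
        // table names may contain only letters, digits, '_', '-' and '.'.
        String safe = raw.replaceAll("[^A-Za-z0-9_.-]", "_");
        if (safe.length() > 255) {
            throw new IllegalArgumentException("application name too long: " + safe.length());
        }
        return safe;
    }

    public static void main(String[] args) {
        // Illustration only: here the current time stands in for the recorded creation time.
        String appName = forStream("arbitraryName", "sparkTest", System.currentTimeMillis());
        System.out.println(appName);
    }
}
```

The result would be passed to `new SparkConf().setAppName(...)` in place of the fixed "arbitraryName". An alternative under the same assumption is to keep the name and delete the stale checkpoint table before restarting the job; if the table does not show up, it may be worth checking other regions.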