dsbulk 卸载在大型 table 上失败
dsbulk unload is failing on large table
试图从一个巨大的table中卸载数据,下面是使用的命令和输出。
$ /home/cassandra/dsbulk-1.8.0/bin/dsbulk 卸载 --driver.auth.provider PlainTextAuthProvider --driver.auth.username xxxx --driver.auth.password xxxx --datastax-java-driver.basic.contact-points 123.123.123.123 -query "select count(*) from sometable with where on clustering column and partial pk -- allow filtering" --connector.name json --driver.protocol.compression LZ4 --connector.json.mode MULTI_DOCUMENT -maxConcurrentFiles 1 -maxRecords -1 -url dsbulk --executor.continuousPaging.enabled false --executor.maxpersecond 2500 --driver.socket.timeout 240000
Setting dsbulk.driver.protocol.compression is deprecated and will be removed in a future release; please configure the driver directly using --datastax-java-driver.advanced.protocol.compression instead.
Setting dsbulk.driver.auth.* is deprecated and will be removed in a future release; please configure the driver directly using --datastax-java-driver.advanced.auth-provider.* instead.
Operation directory: /home/cassandra/logs/COUNT_20210423-070104-108326
total | failed | rows/s | p50ms | p99ms | p999ms
1 | 1 | 0 | 109,790.10 | 110,058.54 | 110,058.54
Operation COUNT_20210423-070104-108326 completed with 1 errors in 1 minute and 50 seconds.
这里是 dsbulk 记录 --
cassandra@somehost> cd logs
cassandra@somehost> cd COUNT_20210423-070104-108326/
cassandra@somehost> ls
operation.log unload-errors.log
cassandra@somehost> cat operation.log
2021-04-23 07:01:04 WARN Setting dsbulk.driver.protocol.compression is deprecated and will be removed in a future release; please configure the driver directly using --datastax-java-driver.advanced.protocol.compression instead.
2021-04-23 07:01:04 WARN Setting dsbulk.driver.auth.* is deprecated and will be removed in a future release; please configure the driver directly using --datastax-java-driver.advanced.auth-provider.* instead.
2021-04-23 07:01:04 INFO Operation directory: /home/cassandra/logs/COUNT_20210423-070104-108326
2021-04-23 07:02:55 WARN Operation COUNT_20210423-070104-108326 completed with 1 errors in 1 minute and 50 seconds.
2021-04-23 07:02:55 INFO Records: total: 1, successful: 0, failed: 1
2021-04-23 07:02:55 INFO Memory usage: used: 212 MB, free: 1,922 MB, allocated: 2,135 MB, available: 27,305 MB, total gc count: 4, total gc time: 98 ms
2021-04-23 07:02:55 INFO Reads: total: 1, successful: 0, failed: 1, in-flight: 0
2021-04-23 07:02:55 INFO Throughput: 0 reads/second
2021-04-23 07:02:55 INFO Latencies: mean 109,790.10, 75p 110,058.54, 99p 110,058.54, 999p 110,058.54 milliseconds
2021-04-23 07:02:58 INFO Final stats:
2021-04-23 07:02:58 INFO Records: total: 1, successful: 0, failed: 1
2021-04-23 07:02:58 INFO Memory usage: used: 251 MB, free: 1,883 MB, allocated: 2,135 MB, available: 27,305 MB, total gc count: 4, total gc time: 98 ms
2021-04-23 07:02:58 INFO Reads: total: 1, successful: 0, failed: 1, in-flight: 0
2021-04-23 07:02:58 INFO Throughput: 0 reads/second
2021-04-23 07:02:58 INFO Latencies: mean 109,790.10, 75p 110,058.54, 99p 110,058.54, 999p 110,058.54 milliseconds
cassandra@somehost> cat unload-errors.log
Statement: com.datastax.oss.driver.internal.core.cql.DefaultBoundStatement@1083fef9 [0 values, idempotence: <UNSET>, CL: <UNSET>, serial CL: <UNSET>, timestamp: <UNSET>, timeout: <UNSET>]
SELECT batch_id from .... allow filtering (Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded))
at com.datastax.oss.dsbulk.executor.api.subscription.ResultSubscription.toErrorPage(ResultSubscription.java:534)
at com.datastax.oss.dsbulk.executor.api.subscription.ResultSubscription.lambda$fetchNextPage(ResultSubscription.java:372)
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler.setFinalError(CqlRequestHandler.java:447) [4 skipped]
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler.access0(CqlRequestHandler.java:94)
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler$NodeResponseCallback.processRetryVerdict(CqlRequestHandler.java:859)
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler$NodeResponseCallback.processErrorResponse(CqlRequestHandler.java:828)
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler$NodeResponseCallback.onResponse(CqlRequestHandler.java:655)
at com.datastax.oss.driver.internal.core.channel.InFlightHandler.channelRead(InFlightHandler.java:257)
at java.lang.Thread.run(Thread.java:748) [24 skipped]
Caused by: com.datastax.oss.driver.api.core.servererrors.ReadTimeoutException: Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded)
Cassandra 的 system.log 片段 ----
DEBUG [ScheduledTasks:1] 2021-04-23 00:01:48,539 MonitoringTask.java:152 - 1 operations timed out in the last 5015 msecs:
<SELECT * FROM my query being run with limit - LIMIT 5000>, total time 10004 msec, timeout 10000 msec/cross-node
INFO [ScheduledTasks:1] 2021-04-23 00:02:38,540 MessagingService.java:1302 - RANGE_SLICE messages were dropped in last 5000 ms: 0 internal and 1 cross node
. Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 10299 ms
INFO [ScheduledTasks:1] 2021-04-23 00:02:38,551 StatusLogger.java:114 -
Pool Name Active Pending Completed Blocked All Time Blocked
ReadStage 1 0 1736872997 0 0
ContinuousPagingStage 0 0 586 0 0
RequestResponseStage 0 0 1483193130 0 0
ReadRepairStage 0 0 9079516 0 0
CounterMutationStage 0 0 0 0 0
MutationStage 0 0 351841038 0 0
ViewMutationStage 0 0 0 0 0
CommitLogArchiver 0 0 32961 0 0
MiscStage 0 0 0 0 0
CompactionExecutor 0 0 12034828 0 0
MemtableReclaimMemory 0 0 68612 0 0
PendingRangeCalculator 0 0 9 0 0
AntiCompactionExecutor 0 0 0 0 0
GossipStage 0 0 20137208 0 0
SecondaryIndexManagement 0 0 0 0 0
HintsDispatcher 0 0 3798 0 0
MigrationStage 0 0 8 0 0
MemtablePostFlush 0 0 338955 0 0
PerDiskMemtableFlushWriter_0 0 0 66297 0 0
ValidationExecutor 0 0 247600 0 0
Sampler 0 0 0 0 0
MemtableFlushWriter 0 0 41757 0 0
InternalResponseStage 0 0 525242 0 0
AntiEntropyStage 0 0 767527 0 0
CacheCleanupExecutor 0 0 0 0 0
Native-Transport-Requests 0 0 958717934 0 65
CompactionManager 0 0
MessagingService n/a 0/0
Cache Type Size Capacity KeysToSave
KeyCache 104857216 104857600 all
RowCache 0 0 all
使用令牌范围的附加条件扩展 select count(*) from sometable with where on clustering column and partial pk -- allow filtering
,如下所示:and partial pk token(full_pk) > :start and token(full_pk) <= :end
- 在这种情况下,DSBulk 将针对发送到多个节点的特定令牌范围执行许多查询,并且不会像您的情况那样在单个节点上创建负载。
调查documentation for -query option, and for 4th blog in this series of blog posts about DSBulk, that could provide more information & examples: 1, 2, 3, 4, 5, 6
问题是你是 运行 DSBulk 中的 unload
命令来执行 SELECT COUNT()
这意味着它必须执行完整的 table扫描到 return 一行。
此外,不推荐使用ALLOW FILTERING
,除非您将查询限制为单个分区。无论如何,即使在最佳情况下,ALLOW FILTERING
的性能也是非常不可预测的table。
我建议您改用 DSBulk count
命令,该命令针对 Cassandra 中的行数或分区数进行了优化。有关详细信息,请参阅 Counting data with DSBulk example.
此 DSBulk Counting blog post 中还有其他示例,Alex Ott 已在他的回答中链接了这些示例。干杯!
试图从一个巨大的table中卸载数据,下面是使用的命令和输出。
$ /home/cassandra/dsbulk-1.8.0/bin/dsbulk 卸载 --driver.auth.provider PlainTextAuthProvider --driver.auth.username xxxx --driver.auth.password xxxx --datastax-java-driver.basic.contact-points 123.123.123.123 -query "select count(*) from sometable with where on clustering column and partial pk -- allow filtering" --connector.name json --driver.protocol.compression LZ4 --connector.json.mode MULTI_DOCUMENT -maxConcurrentFiles 1 -maxRecords -1 -url dsbulk --executor.continuousPaging.enabled false --executor.maxpersecond 2500 --driver.socket.timeout 240000
Setting dsbulk.driver.protocol.compression is deprecated and will be removed in a future release; please configure the driver directly using --datastax-java-driver.advanced.protocol.compression instead.
Setting dsbulk.driver.auth.* is deprecated and will be removed in a future release; please configure the driver directly using --datastax-java-driver.advanced.auth-provider.* instead.
Operation directory: /home/cassandra/logs/COUNT_20210423-070104-108326
total | failed | rows/s | p50ms | p99ms | p999ms
1 | 1 | 0 | 109,790.10 | 110,058.54 | 110,058.54
Operation COUNT_20210423-070104-108326 completed with 1 errors in 1 minute and 50 seconds.
这里是 dsbulk 记录 --
cassandra@somehost> cd logs
cassandra@somehost> cd COUNT_20210423-070104-108326/
cassandra@somehost> ls
operation.log unload-errors.log
cassandra@somehost> cat operation.log
2021-04-23 07:01:04 WARN Setting dsbulk.driver.protocol.compression is deprecated and will be removed in a future release; please configure the driver directly using --datastax-java-driver.advanced.protocol.compression instead.
2021-04-23 07:01:04 WARN Setting dsbulk.driver.auth.* is deprecated and will be removed in a future release; please configure the driver directly using --datastax-java-driver.advanced.auth-provider.* instead.
2021-04-23 07:01:04 INFO Operation directory: /home/cassandra/logs/COUNT_20210423-070104-108326
2021-04-23 07:02:55 WARN Operation COUNT_20210423-070104-108326 completed with 1 errors in 1 minute and 50 seconds.
2021-04-23 07:02:55 INFO Records: total: 1, successful: 0, failed: 1
2021-04-23 07:02:55 INFO Memory usage: used: 212 MB, free: 1,922 MB, allocated: 2,135 MB, available: 27,305 MB, total gc count: 4, total gc time: 98 ms
2021-04-23 07:02:55 INFO Reads: total: 1, successful: 0, failed: 1, in-flight: 0
2021-04-23 07:02:55 INFO Throughput: 0 reads/second
2021-04-23 07:02:55 INFO Latencies: mean 109,790.10, 75p 110,058.54, 99p 110,058.54, 999p 110,058.54 milliseconds
2021-04-23 07:02:58 INFO Final stats:
2021-04-23 07:02:58 INFO Records: total: 1, successful: 0, failed: 1
2021-04-23 07:02:58 INFO Memory usage: used: 251 MB, free: 1,883 MB, allocated: 2,135 MB, available: 27,305 MB, total gc count: 4, total gc time: 98 ms
2021-04-23 07:02:58 INFO Reads: total: 1, successful: 0, failed: 1, in-flight: 0
2021-04-23 07:02:58 INFO Throughput: 0 reads/second
2021-04-23 07:02:58 INFO Latencies: mean 109,790.10, 75p 110,058.54, 99p 110,058.54, 999p 110,058.54 milliseconds
cassandra@somehost> cat unload-errors.log
Statement: com.datastax.oss.driver.internal.core.cql.DefaultBoundStatement@1083fef9 [0 values, idempotence: <UNSET>, CL: <UNSET>, serial CL: <UNSET>, timestamp: <UNSET>, timeout: <UNSET>]
SELECT batch_id from .... allow filtering (Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded))
at com.datastax.oss.dsbulk.executor.api.subscription.ResultSubscription.toErrorPage(ResultSubscription.java:534)
at com.datastax.oss.dsbulk.executor.api.subscription.ResultSubscription.lambda$fetchNextPage(ResultSubscription.java:372)
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler.setFinalError(CqlRequestHandler.java:447) [4 skipped]
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler.access0(CqlRequestHandler.java:94)
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler$NodeResponseCallback.processRetryVerdict(CqlRequestHandler.java:859)
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler$NodeResponseCallback.processErrorResponse(CqlRequestHandler.java:828)
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler$NodeResponseCallback.onResponse(CqlRequestHandler.java:655)
at com.datastax.oss.driver.internal.core.channel.InFlightHandler.channelRead(InFlightHandler.java:257)
at java.lang.Thread.run(Thread.java:748) [24 skipped]
Caused by: com.datastax.oss.driver.api.core.servererrors.ReadTimeoutException: Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but only 0 replica responded)
Cassandra 的 system.log 片段 ----
DEBUG [ScheduledTasks:1] 2021-04-23 00:01:48,539 MonitoringTask.java:152 - 1 operations timed out in the last 5015 msecs:
<SELECT * FROM my query being run with limit - LIMIT 5000>, total time 10004 msec, timeout 10000 msec/cross-node
INFO [ScheduledTasks:1] 2021-04-23 00:02:38,540 MessagingService.java:1302 - RANGE_SLICE messages were dropped in last 5000 ms: 0 internal and 1 cross node
. Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 10299 ms
INFO [ScheduledTasks:1] 2021-04-23 00:02:38,551 StatusLogger.java:114 -
Pool Name Active Pending Completed Blocked All Time Blocked
ReadStage 1 0 1736872997 0 0
ContinuousPagingStage 0 0 586 0 0
RequestResponseStage 0 0 1483193130 0 0
ReadRepairStage 0 0 9079516 0 0
CounterMutationStage 0 0 0 0 0
MutationStage 0 0 351841038 0 0
ViewMutationStage 0 0 0 0 0
CommitLogArchiver 0 0 32961 0 0
MiscStage 0 0 0 0 0
CompactionExecutor 0 0 12034828 0 0
MemtableReclaimMemory 0 0 68612 0 0
PendingRangeCalculator 0 0 9 0 0
AntiCompactionExecutor 0 0 0 0 0
GossipStage 0 0 20137208 0 0
SecondaryIndexManagement 0 0 0 0 0
HintsDispatcher 0 0 3798 0 0
MigrationStage 0 0 8 0 0
MemtablePostFlush 0 0 338955 0 0
PerDiskMemtableFlushWriter_0 0 0 66297 0 0
ValidationExecutor 0 0 247600 0 0
Sampler 0 0 0 0 0
MemtableFlushWriter 0 0 41757 0 0
InternalResponseStage 0 0 525242 0 0
AntiEntropyStage 0 0 767527 0 0
CacheCleanupExecutor 0 0 0 0 0
Native-Transport-Requests 0 0 958717934 0 65
CompactionManager 0 0
MessagingService n/a 0/0
Cache Type Size Capacity KeysToSave
KeyCache 104857216 104857600 all
RowCache 0 0 all
使用令牌范围的附加条件扩展 select count(*) from sometable with where on clustering column and partial pk -- allow filtering
,如下所示:and partial pk token(full_pk) > :start and token(full_pk) <= :end
- 在这种情况下,DSBulk 将针对发送到多个节点的特定令牌范围执行许多查询,并且不会像您的情况那样在单个节点上创建负载。
调查documentation for -query option, and for 4th blog in this series of blog posts about DSBulk, that could provide more information & examples: 1, 2, 3, 4, 5, 6
问题是你是 运行 DSBulk 中的 unload
命令来执行 SELECT COUNT()
这意味着它必须执行完整的 table扫描到 return 一行。
此外,不推荐使用ALLOW FILTERING
,除非您将查询限制为单个分区。无论如何,即使在最佳情况下,ALLOW FILTERING
的性能也是非常不可预测的table。
我建议您改用 DSBulk count
命令,该命令针对 Cassandra 中的行数或分区数进行了优化。有关详细信息,请参阅 Counting data with DSBulk example.
此 DSBulk Counting blog post 中还有其他示例,Alex Ott 已在他的回答中链接了这些示例。干杯!