What is the best way to clear unused Cassandra directories
Why doesn't Cassandra's GC remove a column family's unused directories during compaction? How can I delete them safely?
I have a 5-node Cassandra cluster:
# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.97.18.21 5.13 GiB 256 60.4% 8a6828d8-db43-4722-82fd-dd37ec1c25a1 rack1
UN 10.97.18.23 7.53 GiB 256 60.4% adb18dfd-3cef-4ae3-9766-1e3f17d68588 rack1
UN 10.97.18.22 8.3 GiB 256 62.8% 1d6c453a-e3fb-4b3b-b7c1-689e7c8fbbbb rack1
UN 10.97.18.25 5.1 GiB 256 60.1% c8e4a4dc-4a05-4bac-b4d2-669fae9282b0 rack1
UN 10.97.18.24 7.97 GiB 256 56.3% f2732a23-b70a-41a5-aaaa-1be95002ee8a rack1
I have a keyspace 'loan_products' with a single column family, 'events':
[cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>
cqlsh> DESCRIBE KEYSPACE loan_products ;
CREATE KEYSPACE loan_products WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
CREATE TABLE loan_products.events (
persistence_id text,
partition_nr bigint,
sequence_nr bigint,
timestamp timeuuid,
timebucket text,
event blob,
event_manifest text,
message blob,
meta blob,
meta_ser_id int,
meta_ser_manifest text,
ser_id int,
ser_manifest text,
tag1 text,
tag2 text,
tag3 text,
used boolean static,
writer_uuid text,
PRIMARY KEY ((persistence_id, partition_nr), sequence_nr, timestamp, timebucket)
) WITH CLUSTERING ORDER BY (sequence_nr ASC, timestamp ASC, timebucket ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
I have no snapshots at all:
# nodetool listsnapshots
Snapshot Details:
There are no snapshots
The column family has the default gc_grace_seconds = 864000 (10 days), so GC should have removed tombstones and the like by now, yet the directories are still on the filesystem. Parallel ssh shows:
[1] 11:50:34 [SUCCESS] 10.97.18.21
total 20
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 13:01 events-a83b3be0e61711e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 13:02 events-bbedb500e61c11e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 19:08 events-48c2b750e61d11e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 19:19 events-16c0b670e65011e7a2863103117dd196
drwxr-xr-x. 3 cassandra cassandra 4096 янв 15 11:46 events-c156cc40e65111e7a2863103117dd196
[2] 11:50:34 [SUCCESS] 10.97.18.22
total 20
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 13:00 events-a83b3be0e61711e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 13:01 events-bbedb500e61c11e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 19:08 events-48c2b750e61d11e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 19:19 events-16c0b670e65011e7a2863103117dd196
drwxr-xr-x. 3 cassandra cassandra 4096 янв 15 11:49 events-c156cc40e65111e7a2863103117dd196
[3] 11:50:34 [SUCCESS] 10.97.18.23
total 20
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 13:00 events-a83b3be0e61711e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 13:01 events-bbedb500e61c11e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 19:07 events-48c2b750e61d11e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 19:19 events-16c0b670e65011e7a2863103117dd196
drwxr-xr-x. 3 cassandra cassandra 4096 янв 15 11:48 events-c156cc40e65111e7a2863103117dd196
[4] 11:50:34 [SUCCESS] 10.97.18.25
total 20
drwxr-xr-x. 3 cassandra cassandra 4096 янв 9 15:08 events-a83b3be0e61711e7a2863103117dd196
drwxr-xr-x. 3 cassandra cassandra 4096 янв 9 15:08 events-bbedb500e61c11e7a2863103117dd196
drwxr-xr-x. 3 cassandra cassandra 4096 янв 9 15:08 events-48c2b750e61d11e7a2863103117dd196
drwxr-xr-x. 3 cassandra cassandra 4096 янв 9 15:08 events-16c0b670e65011e7a2863103117dd196
drwxr-xr-x. 3 cassandra cassandra 4096 янв 15 11:45 events-c156cc40e65111e7a2863103117dd196
[5] 11:50:34 [SUCCESS] 10.97.18.24
total 20
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 13:00 events-a83b3be0e61711e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 13:01 events-bbedb500e61c11e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 19:08 events-48c2b750e61d11e7a2863103117dd196
drwxr-xr-x. 4 cassandra cassandra 4096 дек 21 19:19 events-16c0b670e65011e7a2863103117dd196
drwxr-xr-x. 3 cassandra cassandra 4096 янв 15 11:50 events-c156cc40e65111e7a2863103117dd196
From this I can see that only the directory with ID c156cc40e65111e7a2863103117dd196 is in use; it was last updated on January 15.
By default, Cassandra takes a snapshot whenever a column family is dropped. This protects against accidentally truncating a table (deleting all of its records) or accidentally dropping it. The parameter in cassandra.yaml that controls this is auto_snapshot:
Whether or not a snapshot is taken of the data before keyspace truncation
or dropping of column families. The STRONGLY advised default of true
should be used to provide data safety. If you set this flag to false, you will
lose data on truncation or drop.
auto_snapshot: true
So based on the output you've shown, it looks like the "events" table was dropped and recreated at least 4 times. The correct way to clean this up is to first find out which UUID Cassandra is currently using for the table in that keyspace. In your case the query would be:
SELECT id FROM system_schema.tables WHERE keyspace_name = 'loan_products' AND table_name = 'events';
You can then manually remove the other table directories with "rm -rf", since their UUIDs do not match the one returned by the query above.
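That compare-and-remove step can be scripted. Below is a minimal sketch; the data directory path is an assumption (adjust it to your data_file_directories setting), and the id is hard-coded here in place of the cqlsh query result:

```shell
#!/bin/sh
# On a real node, obtain the active id first, e.g.:
#   cqlsh -e "SELECT id FROM system_schema.tables \
#             WHERE keyspace_name='loan_products' AND table_name='events';"
ACTIVE_ID="c156cc40-e651-11e7-a286-3103117dd196"   # paste the real id here

# cqlsh prints the UUID with dashes, but the directory suffix has none.
ACTIVE_DIR="events-$(echo "$ACTIVE_ID" | tr -d -)"

# Assumed default location; adjust for your install.
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data/loan_products}"

# Dry run: only print what would be removed.
for d in "$DATA_DIR"/events-*/; do
  [ -d "$d" ] || continue                 # glob matched nothing
  name=$(basename "$d")
  if [ "$name" != "$ACTIVE_DIR" ]; then
    echo "stale: $d"
    # rm -rf "$d"                         # uncomment only after verifying
  fi
done
```

Run it on every node, since each node keeps its own copies of the stale directories.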
This is also why "nodetool listsnapshots" reports no snapshots: the active table has none. However, if you go into any of the other four "events" table directories and run "ls -ltr", you should find snapshot directories inside them, created when the table was dropped.
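You can check for those leftover snapshot directories in one pass with find. The demo below builds a throwaway layout mimicking the listings above (the snapshot name is made up for illustration); on a real node you would point DATA_DIR at the keyspace's actual data directory instead:

```shell
#!/bin/sh
# Throwaway layout standing in for /var/lib/cassandra/data/loan_products.
DATA_DIR=$(mktemp -d)
# A dropped table keeps its data only under snapshots/ (name is illustrative) ...
mkdir -p "$DATA_DIR/events-a83b3be0e61711e7a2863103117dd196/snapshots/dropped-demo"
# ... while the active table has no snapshot directory at all.
mkdir -p "$DATA_DIR/events-c156cc40e65111e7a2863103117dd196"

# snapshots/ sits exactly one level below each table directory.
find "$DATA_DIR" -mindepth 2 -maxdepth 2 -type d -name snapshots
```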