Can't create table in YugabyteDB in a multi-region deployment with a preferred region

[Question posted by a user on the YugabyteDB Community Slack]

I deployed a cluster with 3 masters and 3 tservers across 3 different regions. I'm trying to set a preferred region.

Even though each master is in a different region, I had to use https://docs.yugabyte.com/latest/admin/yb-admin/#modify-placement-info for the placement to show up in the config. I then set the preferred zone using https://docs.yugabyte.com/latest/admin/yb-admin/#set-preferred-zones.

But the tablet leaders are not being rebalanced. Is there anything else I need to do?

Sample yb-tserver.conf

/home/yugabyte/bin/yb-tserver \
--fs_data_dirs=/mnt/disk0 \
--tserver_master_addrs={yb-master-0.yb-masters.yugabyte1.svc.cluster.local:7100,nyzks901i:29600},ldzks449i:29605,sgzks449i:29601 \
--placement_region=nyz-core-prod \
--use_private_ip=never \
--server_broadcast_addresses=nyzks902i:29500 \
--metric_node_name=yb-tserver-0 \
--memory_limit_hard_bytes=3649044480 \
--stderrthreshold=0 \
--num_cpus=0 \
--undefok=num_cpus,enable_ysql \
--rpc_bind_addresses=yb-tserver-0.yb-tservers.yugabyte1.svc.cluster.local \
--webserver_interface=0.0.0.0 \
--enable_ysql=true \
--pgsql_proxy_bind_address=0.0.0.0:5433 \
--cql_proxy_bind_address=yb-tserver-0.yb-tservers.yugabyte1.svc.cluster.local

Sample yb-master.conf

/home/yugabyte/bin/yb-tserver \
--fs_data_dirs=/mnt/disk0 \
--tserver_master_addrs={yb-master-0.yb-masters.yugabyte1.svc.cluster.local:7100,nyzks901i:29600},ldzks449i:29605,sgzks449i:29601 \
--placement_region=nyz-core-prod \
--use_private_ip=never \
--server_broadcast_addresses=nyzks902i:29500 \
--metric_node_name=yb-tserver-0 \
--memory_limit_hard_bytes=3649044480 \
--stderrthreshold=0 \
--num_cpus=0 \
--undefok=num_cpus,enable_ysql \
--rpc_bind_addresses=yb-tserver-0.yb-tservers.yugabyte1.svc.cluster.local \
--webserver_interface=0.0.0.0 \
--enable_ysql=true \
--pgsql_proxy_bind_address=0.0.0.0:5433 \
--cql_proxy_bind_address=yb-tserver-0.yb-tservers.yugabyte1.svc.cluster.local

Even creating a new table results in an error:

cur.execute(
...   """
...   CREATE TABLE employee (id int PRIMARY KEY,
...                          name varchar,
...                          age int,
...                          language varchar)
...   """)
Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
psycopg2.errors.InternalError_: Invalid argument: Invalid table definition: Timed out waiting for Table Creation

And here is the tail of the yb-tserver logs after the error above:

W0806 15:00:55.004124    63 catalog_manager.cc:7475] Aborting the current task due to error: Invalid argument (yb/master/catalog_manager.cc:7623): An error occurred while selecting replicas for tablet 0a503b0b9196425c956a8b9939b2c370: Invalid argument (yb/master/catalog_manager.cc:7623): Not enough tablet servers in the requested placements. Need at least 3, have 1: Not enough tablet servers in the requested placements. Need at least 3, have 1
E0806 15:00:55.004171    63 catalog_manager_bg_tasks.cc:142] Error processing pending assignments, aborting the current task: Invalid argument (yb/master/catalog_manager.cc:7623): An error occurred while selecting replicas for tablet 0a503b0b9196425c956a8b9939b2c370: Invalid argument (yb/master/catalog_manager.cc:7623): Not enough tablet servers in the requested placements. Need at least 3, have 1: Not enough tablet servers in the requested placements. Need at least 3, have 1
I0806 15:00:55.139647  2352 ysql_transaction_ddl.cc:46] Verifying Transaction { transaction_id: bb93b749-6b57-41b6-8f50-7461a07dc254 isolation: SNAPSHOT_ISOLATION status_tablet: 4069e18783a747bea31895b3ab6c69f6 priority: 1756854571847405073 start_time: { physical: 1628261964493502 } }
I0806 15:00:55.295994    53 ysql_transaction_ddl.cc:77] TransactionReceived: OK : status: PENDING
status_hybrid_time: 6669361378174894079
propagated_hybrid_time: 6669361378174910464
I0806 15:00:55.296051    53 ysql_transaction_ddl.cc:97] Got Response for { transaction_id: bb93b749-6b57-41b6-8f50-7461a07dc254 isolation: SNAPSHOT_ISOLATION status_tablet: 4069e18783a747bea31895b3ab6c69f6 priority: 1756854571847405073 start_time: { physical: 1628261964493502 } }: status: PENDING
status_hybrid_time: 6669361378174894079
propagated_hybrid_time: 6669361378174910464
W0806 15:00:55.310122    66 master_service_base-internal.h:39] Unknown master error in status: Invalid argument (yb/master/catalog_manager.cc:7623): An error occurred while selecting replicas for tablet 0a503b0b9196425c956a8b9939b2c370: Invalid argument (yb/master/catalog_manager.cc:7623): Not enough tablet servers in the requested placements. Need at least 3, have 1: Not enough tablet servers in the requested placements. Need at least 3, have 1
I0806 15:00:55.496171  2352 ysql_transaction_ddl.cc:46] Verifying Transaction { transaction_id: bb93b749-6b57-41b6-8f50-7461a07dc254 isolation: SNAPSHOT_ISOLATION status_tablet: 4069e18783a747bea31895b3ab6c69f6 priority: 1756854571847405073 start_time: { physical: 1628261964493502 } }
I0806 15:00:55.652542    67 ysql_transaction_ddl.cc:77] TransactionReceived: OK : status: PENDING

You also need the --placement_cloud=cloud and --placement_zone=rack1 arguments on the yb-tserver processes (matching what you passed in the modify_placement_info step), not just the --placement_region gflag.

Otherwise, the create table step cannot find matching TServers to satisfy the table's required placement.
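To make the matching rule concrete, here is a minimal Python sketch. It is not YugabyteDB's actual replica selection code, only an illustration of the idea that a tserver counts toward a placement only when cloud, region, and zone all match. It assumes the `cloud.region.zone` comma-separated format that modify_placement_info takes. The `ldz-core-prod` and `sgz-core-prod` region names and the `other-cloud` / `other-zone` values are hypothetical placeholders; only `nyz-core-prod`, `cloud`, and `rack1` appear in this thread.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    cloud: str
    region: str
    zone: str


def parse_placement_info(spec: str) -> list[Placement]:
    """Split a 'cloud.region.zone,cloud.region.zone,...' string into blocks."""
    blocks = []
    for item in spec.split(","):
        parts = item.strip().split(".")
        if len(parts) != 3:
            raise ValueError(f"expected cloud.region.zone, got {item!r}")
        blocks.append(Placement(*parts))
    return blocks


def tserver_flags(p: Placement) -> list[str]:
    """Return the three placement gflags a yb-tserver needs for this block."""
    return [
        f"--placement_cloud={p.cloud}",
        f"--placement_region={p.region}",
        f"--placement_zone={p.zone}",
    ]


def count_matching(tservers: list[Placement], required: list[Placement]) -> int:
    """Count tservers whose full (cloud, region, zone) triple is requested."""
    wanted = set(required)
    return sum(1 for ts in tservers if ts in wanted)


if __name__ == "__main__":
    # Hypothetical placement string; only nyz-core-prod comes from the thread.
    required = parse_placement_info(
        "cloud.nyz-core-prod.rack1,cloud.ldz-core-prod.rack1,cloud.sgz-core-prod.rack1"
    )
    # A tserver that sets only the region reports some other cloud and zone
    # (placeholder values here), so it does not match any requested block.
    region_only = [Placement("other-cloud", "nyz-core-prod", "other-zone")]
    print("matching tservers:", count_matching(region_only, required))
    for block in required:
        print(" ".join(tserver_flags(block)))
```

Running the script prints the number of matching tservers for the region-only case, followed by the flag triple each tserver would need so that all three nodes match.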

In addition to Dorian's answer: please display and verify the complete placement information. You can do this as follows:

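  • The master placement information can be found using curl http://MASTER-ADDRESS:7000/api/v1/masters | jq.
  • The tserver placement information can be found at http://MASTER-ADDRESS:7000/tablet-servers (via a browser).
  • An overview of the preferred zones can be found using curl http://MASTER-ADDRESS:7000/cluster-config | jq.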

That way, it should be easy to see and verify whether everything is set up as expected.
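The same check can be scripted. The sketch below is an assumption-laden helper, not an official tool: it walks whatever JSON the master returns and collects every object that carries a `placement_cloud`, `placement_region`, or `placement_zone` key. Those key names are assumed to mirror the gflag names, and the sample payload in the demo is a hypothetical shape, not a captured response. `MASTER-ADDRESS` is the same placeholder used in the list above.

```python
import json
from typing import Any, Iterator
from urllib.request import urlopen

PLACEMENT_KEYS = ("placement_cloud", "placement_region", "placement_zone")


def fetch_json(url: str, timeout: float = 5.0) -> Any:
    """Fetch a URL and parse the response body as JSON."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def iter_placements(node: Any) -> Iterator[dict]:
    """Yield the placement fields of every JSON object that has any of them."""
    if isinstance(node, dict):
        if any(key in node for key in PLACEMENT_KEYS):
            yield {key: node.get(key) for key in PLACEMENT_KEYS}
        for value in node.values():
            yield from iter_placements(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_placements(item)


def incomplete(placements: list[dict]) -> list[dict]:
    """Return the entries where cloud, region, or zone is missing or empty."""
    return [p for p in placements if not all(p.values())]


if __name__ == "__main__":
    # Hypothetical payload shape, used so the demo runs without a cluster.
    # Against a real cluster, replace it with:
    #   fetch_json("http://MASTER-ADDRESS:7000/api/v1/masters")
    sample = {
        "masters": [
            {"cloud_info": {"placement_cloud": "cloud",
                            "placement_region": "nyz-core-prod",
                            "placement_zone": "rack1"}},
            {"cloud_info": {"placement_region": "nyz-core-prod"}},
        ]
    }
    found = list(iter_placements(sample))
    for entry in incomplete(found):
        print("incomplete placement:", entry)
```

Because the walker does not depend on a fixed response layout, it can be pointed at any of the JSON endpoints above; an entry reported as incomplete is a candidate for the missing-flag problem described in the first answer.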