Memsql 正在花费大量时间来执行查询
Memsql is taking lot of time for a query to execute
当我 运行 获取 51M 记录的查询 returns 在 5 分钟内得到结果时,我的 memsql 集群出现问题
但是当数据插入与读取并行时,它过去需要超过 15 分钟。
我测量了磁盘io,没问题,磁盘是hdd磁盘。
没有其他连接到 memsql,cpu 64 核机器的利用率也是 15%
下面是我的变量
Variable_name | Value |
+----------------------------------------------+------------------------------------------------------------------------------+
| aggregator_failure_detection | ON |
| auto_replicate | OFF |
| autocommit | ON |
| basedir | /data/master-3306 |
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_sets_dir | /data/master-3306/share/charsets/ |
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
| columnar_segment_rows | 102400 |
| columnstore_window_size | 2147483648 |
| compile_only | OFF |
| connect_timeout | 10 |
| core_file | ON |
| core_file_mode | PARTIAL |
| critical_diagnostics | ON |
| datadir | /data/master-3306/data |
| default_partitions_per_leaf | 16 |
| enable_experimental_metrics | OFF |
| error_count | 0 |
| explain_expression_limit | 500 |
| external_user | |
| flush_before_replicate | OFF |
| general_log | OFF |
| geo_sphere_radius | 6367444.657120 |
| hostname | **** |
| identity | 0 |
| kerberos_server_keytab | |
| lc_messages | en_US |
| lc_messages_dir | /data/master-3306/share |
| leaf_failure_detection | ON |
| load_data_max_buffer_size | 1073741823 |
| load_data_read_size | 8192 |
| load_data_write_size | 8192 |
| lock_wait_timeout | 60 |
| master_aggregator | self |
| max_allowed_packet | 104857600 |
| max_connection_threads | 192 |
| max_connections | 100000 |
| max_pooled_connections | 4096 |
| max_prefetch_threads | 1 |
| max_prepared_stmt_count | 16382 |
| max_user_connections | 0 |
| maximum_memory | 506602 |
| maximum_table_memory | 455941 |
| memsql_id | ** |
| memsql_version | 5.7.2 |
| memsql_version_date | Thu Jan 26 12:34:22 2017 -0800 |
| memsql_version_hash | 03e5e3581e96d65caa30756f191323437a3840f0 |
| minimal_disk_space | 100 |
| multi_insert_tuple_count | 20000 |
| net_buffer_length | 102400 |
| net_read_timeout | 3600 |
| net_write_timeout | 3600 |
| pid_file | /data/master-3306/data/memsqld.pid |
| pipelines_batches_metadata_to_keep | 1000 |
| pipelines_extractor_debug_logging | OFF |
| pipelines_kafka_version | 0.8.2.2 |
| pipelines_max_errors_per_partition | 1000 |
| pipelines_max_offsets_per_batch_partition | 1000000 |
| pipelines_max_retries_per_batch_partition | 4 |
| pipelines_stderr_bufsize | 65535 |
| pipelines_stop_on_error | ON |
| plan_expiration_minutes | 720 |
| port | 3306 |
| protocol_version | 10 |
| proxy_user | |
| query_parallelism | 0 |
| redundancy_level | 1 |
| reported_hostname | |
| secure_file_priv | |
| show_query_parameters | ON |
| skip_name_resolve | AUTO |
| snapshot_trigger_size | 268435456 |
| snapshots_to_keep | 2 |
| socket | /data/master-3306/data/memsql.sock |
| sql_quote_show_create | ON |
| ssl_ca | |
| ssl_capath | |
| ssl_cert | |
| ssl_cipher | |
| ssl_key | |
| sync_slave_timeout | 20000 |
| system_time_zone | UTC |
| thread_cache_size | 0 |
| thread_handling | one-thread-per-connection |
| thread_stack | 1048576 |
| time_zone | SYSTEM |
| timestamp | 1504799067.127069 |
| tls_version | TLSv1,TLSv1.1,TLSv1.2 |
| tmpdir | . |
| transaction_buffer | 67108864 |
| tx_isolation | READ-COMMITTED |
| use_join_bucket_bit_vector | ON |
| use_vectorized_join | ON |
| version | 5.5.8 |
| version_comment | MemSQL source distribution (compatible; MySQL Enterprise & MySQL Commercial) |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
| warn_level | WARNINGS |
| warning_count | 0 |
| workload_management | ON |
| workload_management_expected_aggregators | 1 |
| workload_management_max_connections_per_leaf | 1024 |
| workload_management_max_queue_depth | 100 |
| workload_management_max_threads_per_leaf | 8192 |
| workload_management_queue_time_warning_ratio | 0.500000 |
| workload_management_queue_timeout | 3600
一些想法:
- 工作负载分析 (https://docs.memsql.com/concepts/v5.8/workload-profiling-overview/) 可以帮助您了解哪些资源限制了此查询的速度 - 如果不是 cpu 也不是磁盘 io,可能是网络 io 或其他东西
- 查询分析 (https://docs.memsql.com/sql-reference/v5.8/profile/) 将有助于指出查询中的哪些运算符是昂贵的
- 你提到你的机器有 64 个核心 - 你的集群是否正确地进行了 numa 优化? 运行
memsql-ops memsql-optimize
检查。
Jack,我做了工作量分析,发现当 NETWORK_LOGICAL_SEND_B 高
时 lock_time_ms 相当高
LAST_FINISHED_TIMESTAMP, LOCK_TIME_MS, LOCK_ROW_TIME_MS, ACTIVITY_NAME, NETWORK_LOGICAL_RECV_B, NETWORK_LOGICAL_SEND_B, activity_name, database_name, partition_id, 左(q.query_text, 50)
'2017-09-11 10:41:31', '39988', '0', 'InsertSelect_AggregatedHourly_temp_7aug_KING__et_al_c44dc7ab56d56280', '0', '538154753', 'InsertSelect_AggregatedHourly_temp_7aug_KING__et_al_c44dc7ab56d56280', 'datawarehouse', '7', 'SELECT \n combined
.day
AS day,\n `lineitem'
当我 运行 获取 51M 记录的查询 returns 在 5 分钟内得到结果时,我的 memsql 集群出现问题 但是当数据插入与读取并行时,它过去需要超过 15 分钟。
我测量了磁盘io,没问题,磁盘是hdd磁盘。
没有其他连接到 memsql,cpu 64 核机器的利用率也是 15%
下面是我的变量
Variable_name | Value |
+----------------------------------------------+------------------------------------------------------------------------------+
| aggregator_failure_detection | ON |
| auto_replicate | OFF |
| autocommit | ON |
| basedir | /data/master-3306 |
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_sets_dir | /data/master-3306/share/charsets/ |
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
| columnar_segment_rows | 102400 |
| columnstore_window_size | 2147483648 |
| compile_only | OFF |
| connect_timeout | 10 |
| core_file | ON |
| core_file_mode | PARTIAL |
| critical_diagnostics | ON |
| datadir | /data/master-3306/data |
| default_partitions_per_leaf | 16 |
| enable_experimental_metrics | OFF |
| error_count | 0 |
| explain_expression_limit | 500 |
| external_user | |
| flush_before_replicate | OFF |
| general_log | OFF |
| geo_sphere_radius | 6367444.657120 |
| hostname | **** |
| identity | 0 |
| kerberos_server_keytab | |
| lc_messages | en_US |
| lc_messages_dir | /data/master-3306/share |
| leaf_failure_detection | ON |
| load_data_max_buffer_size | 1073741823 |
| load_data_read_size | 8192 |
| load_data_write_size | 8192 |
| lock_wait_timeout | 60 |
| master_aggregator | self |
| max_allowed_packet | 104857600 |
| max_connection_threads | 192 |
| max_connections | 100000 |
| max_pooled_connections | 4096 |
| max_prefetch_threads | 1 |
| max_prepared_stmt_count | 16382 |
| max_user_connections | 0 |
| maximum_memory | 506602 |
| maximum_table_memory | 455941 |
| memsql_id | ** |
| memsql_version | 5.7.2 |
| memsql_version_date | Thu Jan 26 12:34:22 2017 -0800 |
| memsql_version_hash | 03e5e3581e96d65caa30756f191323437a3840f0 |
| minimal_disk_space | 100 |
| multi_insert_tuple_count | 20000 |
| net_buffer_length | 102400 |
| net_read_timeout | 3600 |
| net_write_timeout | 3600 |
| pid_file | /data/master-3306/data/memsqld.pid |
| pipelines_batches_metadata_to_keep | 1000 |
| pipelines_extractor_debug_logging | OFF |
| pipelines_kafka_version | 0.8.2.2 |
| pipelines_max_errors_per_partition | 1000 |
| pipelines_max_offsets_per_batch_partition | 1000000 |
| pipelines_max_retries_per_batch_partition | 4 |
| pipelines_stderr_bufsize | 65535 |
| pipelines_stop_on_error | ON |
| plan_expiration_minutes | 720 |
| port | 3306 |
| protocol_version | 10 |
| proxy_user | |
| query_parallelism | 0 |
| redundancy_level | 1 |
| reported_hostname | |
| secure_file_priv | |
| show_query_parameters | ON |
| skip_name_resolve | AUTO |
| snapshot_trigger_size | 268435456 |
| snapshots_to_keep | 2 |
| socket | /data/master-3306/data/memsql.sock |
| sql_quote_show_create | ON |
| ssl_ca | |
| ssl_capath | |
| ssl_cert | |
| ssl_cipher | |
| ssl_key | |
| sync_slave_timeout | 20000 |
| system_time_zone | UTC |
| thread_cache_size | 0 |
| thread_handling | one-thread-per-connection |
| thread_stack | 1048576 |
| time_zone | SYSTEM |
| timestamp | 1504799067.127069 |
| tls_version | TLSv1,TLSv1.1,TLSv1.2 |
| tmpdir | . |
| transaction_buffer | 67108864 |
| tx_isolation | READ-COMMITTED |
| use_join_bucket_bit_vector | ON |
| use_vectorized_join | ON |
| version | 5.5.8 |
| version_comment | MemSQL source distribution (compatible; MySQL Enterprise & MySQL Commercial) |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
| warn_level | WARNINGS |
| warning_count | 0 |
| workload_management | ON |
| workload_management_expected_aggregators | 1 |
| workload_management_max_connections_per_leaf | 1024 |
| workload_management_max_queue_depth | 100 |
| workload_management_max_threads_per_leaf | 8192 |
| workload_management_queue_time_warning_ratio | 0.500000 |
| workload_management_queue_timeout | 3600
一些想法:
- 工作负载分析 (https://docs.memsql.com/concepts/v5.8/workload-profiling-overview/) 可以帮助您了解哪些资源限制了此查询的速度 - 如果不是 cpu 也不是磁盘 io,可能是网络 io 或其他东西
- 查询分析 (https://docs.memsql.com/sql-reference/v5.8/profile/) 将有助于指出查询中的哪些运算符是昂贵的
- 你提到你的机器有 64 个核心 - 你的集群是否正确地进行了 numa 优化? 运行
memsql-ops memsql-optimize
检查。
Jack,我做了工作量分析,发现当 NETWORK_LOGICAL_SEND_B 高
时 lock_time_ms 相当高LAST_FINISHED_TIMESTAMP, LOCK_TIME_MS, LOCK_ROW_TIME_MS, ACTIVITY_NAME, NETWORK_LOGICAL_RECV_B, NETWORK_LOGICAL_SEND_B, activity_name, database_name, partition_id, 左(q.query_text, 50)
'2017-09-11 10:41:31', '39988', '0', 'InsertSelect_AggregatedHourly_temp_7aug_KING__et_al_c44dc7ab56d56280', '0', '538154753', 'InsertSelect_AggregatedHourly_temp_7aug_KING__et_al_c44dc7ab56d56280', 'datawarehouse', '7', 'SELECT \n combined
.day
AS day,\n `lineitem'