MySQL 5.7:微调太多连接错误

MySQL 5.7: fine tuning for too many connections error

我有一个旧的 django 1.9 应用程序连接到 mysql 5.7 数据库。 有时我会收到此错误:

[Fri Mar 20 10:27:19.482820 2020] [wsgi:error] [pid 15215] "
[Fri Mar 20 10:28:20.502889 2020] [wsgi:error] [pid 15215] [NXU ts_fw20] [2020-03-20 Fri 10:28:20.1584696500]ERROR   "Exception during GET request (user: AnonymousUser): <QueryDict: {u'udid': [u'B8663A17-C121-4691-A561-E0F698CDD4EA'], u'client_os_version': [u'13.3.1'], u'client_app_version': [u'11.30.1.201910161040'], u'client_app_type': [u'Jukebox'], u'client_app_name': [u'p_ts_w_fw20']}> ->
[Fri Mar 20 10:28:20.502976 2020] [wsgi:error] [pid 15215] Traceback (most recent call last):
[Fri Mar 20 10:28:20.502991 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/core/handlers/base.py", line 147, in get_response
[Fri Mar 20 10:28:20.503041 2020] [wsgi:error] [pid 15215]     response = wrapped_callback(request, *callback_args, **callback_kwargs)
[Fri Mar 20 10:28:20.503101 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/nxlib/core/services/utils/api.py", line 307, in wrapper
[Fri Mar 20 10:28:20.503114 2020] [wsgi:error] [pid 15215]     user = authenticator.check_auth(request)
[Fri Mar 20 10:28:20.503123 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/nxlib/core/networking/auth.py", line 21, in check_auth
[Fri Mar 20 10:28:20.503133 2020] [wsgi:error] [pid 15215]     user = self.authenticate(credentials)
[Fri Mar 20 10:28:20.503142 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/nxlib/core/services/utils/api.py", line 387, in authenticate
[Fri Mar 20 10:28:20.503151 2020] [wsgi:error] [pid 15215]     user = auth.authenticate(username=username, password=password)
[Fri Mar 20 10:28:20.503160 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/contrib/auth/__init__.py", line 74, in authenticate
[Fri Mar 20 10:28:20.503168 2020] [wsgi:error] [pid 15215]     user = backend.authenticate(**credentials)
[Fri Mar 20 10:28:20.503177 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/contrib/auth/backends.py", line 17, in authenticate
[Fri Mar 20 10:28:20.503186 2020] [wsgi:error] [pid 15215]     user = UserModel._default_manager.get_by_natural_key(username)
[Fri Mar 20 10:28:20.503194 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/contrib/auth/base_user.py", line 45, in get_by_natural_key
[Fri Mar 20 10:28:20.503203 2020] [wsgi:error] [pid 15215]     return self.get(**{self.model.USERNAME_FIELD: username})
[Fri Mar 20 10:28:20.503211 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/models/manager.py", line 122, in manager_method
[Fri Mar 20 10:28:20.503220 2020] [wsgi:error] [pid 15215]     return getattr(self.get_queryset(), name)(*args, **kwargs)
[Fri Mar 20 10:28:20.503228 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/models/query.py", line 381, in get
[Fri Mar 20 10:28:20.503237 2020] [wsgi:error] [pid 15215]     num = len(clone)
[Fri Mar 20 10:28:20.503244 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/models/query.py", line 240, in __len__
[Fri Mar 20 10:28:20.503253 2020] [wsgi:error] [pid 15215]     self._fetch_all()
[Fri Mar 20 10:28:20.503261 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/models/query.py", line 1074, in _fetch_all
[Fri Mar 20 10:28:20.503269 2020] [wsgi:error] [pid 15215]     self._result_cache = list(self.iterator())
[Fri Mar 20 10:28:20.503278 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/models/query.py", line 52, in __iter__
[Fri Mar 20 10:28:20.503286 2020] [wsgi:error] [pid 15215]     results = compiler.execute_sql()
[Fri Mar 20 10:28:20.503294 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 846, in execute_sql
[Fri Mar 20 10:28:20.503302 2020] [wsgi:error] [pid 15215]     cursor = self.connection.cursor()
[Fri Mar 20 10:28:20.503311 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/backends/base/base.py", line 233, in cursor
[Fri Mar 20 10:28:20.503319 2020] [wsgi:error] [pid 15215]     cursor = self.make_cursor(self._cursor())
[Fri Mar 20 10:28:20.503327 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/backends/base/base.py", line 204, in _cursor
[Fri Mar 20 10:28:20.503345 2020] [wsgi:error] [pid 15215]     self.ensure_connection()
[Fri Mar 20 10:28:20.503354 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/backends/base/base.py", line 199, in ensure_connection
[Fri Mar 20 10:28:20.503363 2020] [wsgi:error] [pid 15215]     self.connect()
[Fri Mar 20 10:28:20.503372 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/utils.py", line 95, in __exit__
[Fri Mar 20 10:28:20.503380 2020] [wsgi:error] [pid 15215]     six.reraise(dj_exc_type, dj_exc_value, traceback)
[Fri Mar 20 10:28:20.503389 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/backends/base/base.py", line 199, in ensure_connection
[Fri Mar 20 10:28:20.503398 2020] [wsgi:error] [pid 15215]     self.connect()
[Fri Mar 20 10:28:20.503406 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/backends/base/base.py", line 171, in connect
[Fri Mar 20 10:28:20.503415 2020] [wsgi:error] [pid 15215]     self.connection = self.get_new_connection(conn_params)
[Fri Mar 20 10:28:20.503423 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/django/db/backends/mysql/base.py", line 264, in get_new_connection
[Fri Mar 20 10:28:20.503433 2020] [wsgi:error] [pid 15215]     conn = Database.connect(**conn_params)
[Fri Mar 20 10:28:20.503441 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
[Fri Mar 20 10:28:20.503450 2020] [wsgi:error] [pid 15215]     return Connection(*args, **kwargs)
[Fri Mar 20 10:28:20.503458 2020] [wsgi:error] [pid 15215]   File "/home/nuxie/virtualenv/110-fw20/production/lib/python2.7/site-packages/MySQLdb/connections.py", line 193, in __init__
[Fri Mar 20 10:28:20.503467 2020] [wsgi:error] [pid 15215]     super(Connection, self).__init__(*args, **kwargs2)
[Fri Mar 20 10:28:20.503475 2020] [wsgi:error] [pid 15215] OperationalError: (1040, 'Too many connections')

我的初始 MySQL 设置:

max_connections                     150

connect_timeout                     10
delayed_insert_timeout              300
have_statement_timeout              YES
innodb_flush_log_at_timeout         1
innodb_lock_wait_timeout            50
innodb_rollback_on_timeout          OFF
interactive_timeout                 28800
lock_wait_timeout                   31536000
net_read_timeout                    30
net_write_timeout                   60
rpl_stop_slave_timeout              31536000
slave_net_timeout                   60
wait_timeout                        28800

bulk_insert_buffer_size             8388608
innodb_buffer_pool_chunk_size       134217728
innodb_buffer_pool_dump_at_shutdown ON
innodb_buffer_pool_dump_now         OFF
innodb_buffer_pool_dump_pct         25
innodb_buffer_pool_filename         ib_buffer_pool
innodb_buffer_pool_instances        1
innodb_buffer_pool_load_abort       OFF
innodb_buffer_pool_load_at_startup  ON
innodb_buffer_pool_load_now         OFF
innodb_buffer_pool_size             134217728
innodb_change_buffer_max_size       25
innodb_change_buffering             all
innodb_log_buffer_size              16777216
innodb_sort_buffer_size             1048576
join_buffer_size                    262144
key_buffer_size                     16777216
myisam_sort_buffer_size             8388608
net_buffer_length                   16384
preload_buffer_size                 32768
read_buffer_size                    131072
read_rnd_buffer_size                262144
sort_buffer_size                    262144
sql_buffer_result                   OFF

max_allowed_packet                  16777216
slave_max_allowed_packet            1073741824

这里还有一些关于服务器的额外硬件信息。我不知道 MySQL Host 服务器上是否有任何 SSD 或 NVME 设备,因为它是一个虚拟机,不是我管理的,我只管理 django 应用程序。 当然,现在我也想避免来自 mysql.

的错误
$ ps -ef|grep apache2|grep -v grep -c
11

Mysql服务状态:

$ sudo service mysql status
● mysql.service - MySQL Community Server
  Loaded: loaded (/lib/systemd/system/mysql.service; enabled; vendor preset: enabled)
  Active: active (running) since Fri 2020-03-20 10:28:33 CET; 3h 31min ago
 Process: 17847 ExecStartPost=/usr/share/mysql/mysql-systemd-start post (code=exited, status=0/SUCCESS)
 Process: 17834 ExecStartPre=/usr/share/mysql/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
 Main PID: 17846 (mysqld)
  Tasks: 166
  Memory: 494.0M
   CPU: 57min 24.058s
  CGroup: /system.slice/mysql.service
      └─17846 /usr/sbin/mysqld
Mar 20 10:28:31 _ systemd[1]: Starting MySQL Community Server...
Mar 20 10:28:33 _ systemd[1]: Started MySQL Community Server.

CPU信息:

$ lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Model name:            Intel(R) Xeon(R) CPU E5-2650L v2 @ 1.70GHz
Stepping:              4
CPU MHz:               1700.000
BogoMIPS:              3400.00
Hypervisor vendor:     VMware
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              25600K
NUMA node0 CPU(s):     0-3
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm ibrs ibpb stibp kaiser fsgsbase tsc_adjust smep arat arch_capabilities

可用内存:

$ free -m
              total        used        free      shared  buff/cache   available
Mem:          32175        2277        1465         369       28432       28317
Swap:           975           2         972

fs.file-最大当前值:

# sysctl -a |grep file-max
fs.file-max = 3282776

多亏了这个 article from Percona 我也在考虑更改 wait_timeout 变量,但我不确定要使用的值。

关于我的 mysql 服务器的一般微调有什么建议吗?


编辑:我也试过 MySQLTuner-perl,但我没有看到任何与我的实际问题相关的内容:

$ perl mysqltuner.pl --host 127.0.0.1
 >>  MySQLTuner 1.7.19 - Major Hayden <major@mhtx.net>
 >>  Bug reports, feature requests, and downloads at http://mysqltuner.com/
 >>  Run with '--help' for additional options and output filtering
[--] Skipped version check for MySQLTuner script
[--] Performing tests on 127.0.0.1:3306
Please enter your MySQL administrative login: root
Please enter your MySQL administrative password: [OK] Currently running supported MySQL version 5.7.25-0ubuntu0.16.04.2
[OK] Operating on 64-bit architecture
-------- Log file Recommendations ------------------------------------------------------------------
[OK] Log file /var/log/mysql/error.log exists
[--] Log file: /var/log/mysql/error.log(744B)
[OK] Log file /var/log/mysql/error.log is readable.
[OK] Log file /var/log/mysql/error.log is not empty
[OK] Log file /var/log/mysql/error.log is smaller than 32 Mb
[OK] /var/log/mysql/error.log doesn't contain any warning.
[!!] /var/log/mysql/error.log contains 2 error(s).
[--] 0 start(s) detected in /var/log/mysql/error.log
[--] 0 shutdown(s) detected in /var/log/mysql/error.log
-------- Storage Engine Statistics -----------------------------------------------------------------
[--] Status: +ARCHIVE +BLACKHOLE +CSV -FEDERATED +InnoDB +MEMORY +MRG_MYISAM +MyISAM +PERFORMANCE_SCHEMA 
[--] Data in MyISAM tables: 1.0K (Tables: 1)
[--] Data in InnoDB tables: 396.9M (Tables: 4650)
[OK] Total fragmented tables: 0
-------- Analysis Performance Metrics --------------------------------------------------------------
[--] innodb_stats_on_metadata: OFF
[OK] No stat updates during querying INFORMATION_SCHEMA.
-------- Security Recommendations ------------------------------------------------------------------
[OK] There are no anonymous accounts for any database users
[OK] All database users have passwords assigned
[!!] There is no basic password file list!
-------- CVE Security Recommendations --------------------------------------------------------------
[--] Skipped due to --cvefile option undefined
-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 302d 0h 54m 4s (34M q [1.320 qps], 19K conn, TX: 44G, RX: 19G)
[--] Reads / Writes: 99% / 1%
[--] Binary logging is disabled
[--] Physical Memory     : 31.4G
[--] Max MySQL memory    : 3.5G
[--] Other process memory: 0B
[--] Total buffers: 192.0M global + 17.1M per thread (201 max threads)
[--] P_S Max memory usage: 72B
[--] Galera GCache Max memory usage: 0B
[OK] Maximum reached memory usage: 1.4G (4.31% of installed RAM)
[OK] Maximum possible memory usage: 3.5G (11.26% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (0/34M)
[OK] Highest usage of available connections: 34% (70/201)
[OK] Aborted connections: 0.15%  (30/19766)
[!!] name resolution is active : a reverse name resolution is made for each new connection and can reduce performance
[!!] Query cache may be disabled by default due to mutex contention.
[!!] Query cache efficiency: 0.0% (0 cached / 34M selects)
[OK] Query cache prunes per day: 0
[OK] Sorts requiring temporary tables: 4% (10K temp sorts / 205K sorts)
[OK] No joins without indexes
[!!] Temporary tables created on disk: 59% (46K on disk / 77K total)
[OK] Thread cache hit rate: 96% (700 created / 19K connections)
[!!] Table cache hit rate: 0% (416 open / 411K opened)
[!!] table_definition_cache(615) is lower than number of tables(4930) 
[OK] Open file limit used: 0% (7/1K)
[OK] Table locks acquired immediately: 100% (2K immediate / 2K locks)
-------- Performance schema ------------------------------------------------------------------------
[--] Memory used by P_S: 72B
[--] Sys schema is installed.
-------- ThreadPool Metrics ------------------------------------------------------------------------
[--] ThreadPool stat is disabled.
-------- MyISAM Metrics ----------------------------------------------------------------------------
[!!] Key buffer used: 18.3% (3M used / 16M cache)
[OK] Key buffer size / total MyISAM indexes: 16.0M/48.0K
[OK] Read Key buffer hit rate: 99.6% (14K cached / 66 reads)
[OK] Write Key buffer hit rate: 100.0% (121 cached / 121 writes)
-------- InnoDB Metrics ----------------------------------------------------------------------------
[--] InnoDB is enabled.
[--] InnoDB Thread Concurrency: 0
[OK] InnoDB File per table is activated
[!!] InnoDB buffer pool / data size: 128.0M/396.9M
[!!] Ratio InnoDB log file size / InnoDB Buffer pool size (75 %): 48.0M * 2/128.0M should be equal to 25%
[OK] InnoDB buffer pool instances: 1
[--] Number of InnoDB Buffer Pool Chunk : 1 for 1 Buffer Pool Instance(s)
[OK] Innodb_buffer_pool_size aligned with Innodb_buffer_pool_chunk_size & Innodb_buffer_pool_instances
[OK] InnoDB Read buffer efficiency: 100.00% (18914488493 hits/ 18914570989 total)
[OK] InnoDB Write log efficiency: 96.81% (1249437 hits/ 1290575 total)
[OK] InnoDB log waits: 0.00% (0 waits / 41138 writes)
-------- AriaDB Metrics ----------------------------------------------------------------------------
[--] AriaDB is disabled.
-------- TokuDB Metrics ----------------------------------------------------------------------------
[--] TokuDB is disabled.
-------- XtraDB Metrics ----------------------------------------------------------------------------
[--] XtraDB is disabled.
-------- Galera Metrics ----------------------------------------------------------------------------
[--] Galera is disabled.
-------- Replication Metrics -----------------------------------------------------------------------
[--] Galera Synchronous replication: NO
[--] No replication slave(s) for this server.
[--] Binlog format: ROW
[--] XA support enabled: ON
[--] Semi synchronous replication Master: Not Activated
[--] Semi synchronous replication Slave: Not Activated
[--] This is a standalone server
-------- Recommendations ---------------------------------------------------------------------------
General recommendations:
    Control error line(s) into /var/log/mysql/error.log file
    Configure your accounts with ip or subnets only, then update your configuration with skip-name-resolve=1
    When making adjustments, make tmp_table_size/max_heap_table_size equal
    Reduce your SELECT DISTINCT queries which have no LIMIT clause
    Increase table_open_cache gradually to avoid file descriptor limits
    Read this before increasing table_open_cache over 64: https://zzz
    Read this before increasing for MariaDB https://mariadb.com/kb/en/library/optimizing-table_open_cache/
    This is MyISAM only table_cache scalability problem, InnoDB not affected.
    See more details here: https://bugs.mysql.com/bug.php?id=49177
    This bug already fixed in MySQL 5.7.9 and newer MySQL versions.
    Beware that open_files_limit (1024) variable 
    should be greater than table_open_cache (431)
    Before changing innodb_log_file_size and/or innodb_log_files_in_group read this: https://xxx
Variables to adjust:
    query_cache_size (=0)
    query_cache_type (=0)
    query_cache_limit (> 1M, or use smaller result sets)
    tmp_table_size (> 16M)
    max_heap_table_size (> 16M)
    table_open_cache (> 431)
    table_definition_cache(615) > 4930 or -1 (autosizing if supported)
    innodb_buffer_pool_size (>= 396.9M) if possible.
    innodb_log_file_size should be (=16M) if possible, so InnoDB total log files size equals to 25% of buffer pool size.

但是,我注意到一开始我有:

SELECT @@open_files_limit;
1024

为了处理这个限制,我遵循了(目前在我笔记本上的本地 Ubuntu 安装中)

我拆分了代码块,希望能提高可读性。周一我还会添加关于 mysql 全局状态、完整进程列表和 iowait 统计的完整信息(添加全局变量 link here)。

操作系统是Ubuntu16.04.

我现在正在使用:


更新 2020-03-30 14:58

此处 SHOW GLOBAL STATUS;SHOW FULL PROCESSLIST; 的实际输出:

Id,User,Host,db,Command,Time,State,Info
1810,root,localhost:60044,NULL,Sleep,33,,NULL
1811,root,localhost:60046,NULL,Query,0,starting,"show full processlist"

这里是 ulimit -a 的完整输出,更新于今天 (2020-03-30 14:42)

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 128535
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 24000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 128535
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

show global variables; 的完整输出(上周没有变化)。

还有默认的 wait_timeout 目前我没有任何重大缺点。

提前致谢。

我注意到您在 运行 ulimit -n 1024 时将 ubuntu 中的打开文件句柄设置为 1024。您的系统需要更多可用的打开文件句柄来支持您的 4650 innodb 表。建议 ulimit -n 24000 然后使用 ulimit -a 检查设置(列出当前限制)。我期待着下周从生产实例中分析您的数据。

建议你考虑一下 ulimit -a

ulimit -n 24000 # 从 1024 启用更多 OS 文件句柄处于活动状态。

以上是LinuxOS的动态。 Stop/Start 服务可以访问句柄。 要使此在 OS shutdown/restart 中持续存在,请查看此 url 以获取类似的 OS 说明。 这些说明将 file-max 设置为 500000,请暂时将容量设置为 24000。 ulimit 请设置为 24000,这将允许 MySQL 使用请求的限制 并为其他应用程序备件

https://glassonionblog.wordpress.com/2013/01/27/increase-ulimit-and-file-descriptors-limit/

每秒速率 = RPS

my.cnf [mysqld] 部分要考虑的建议

innodb_buffer_pool_size=4G  # from 128M to reduce innodb_buffer_pool_reads RPS of 1
innodb_io_capacity=800  # from 200 to use more of your HDD IOPS
read_rnd_buffer_size=128K  # from 256K to reduce handler_read_rnd_next RPS of 42,434
read_buffer_size=512K  # from 128K to reduce handler_read_next RPS of 5,541
open_files_limit=20000  # from 32000 to allow OS support of other apps handles.

您会发现这些更改会显着减少您的 CPU BUSY。

查看我的个人资料、联系信息的网络个人资料和可免费下载的实用程序脚本以协助性能调整,特别是 findfragtables.sql 和查找-redundant_indexes.sql 脚本。