Django ORM poor performance compared to raw sql
I'm using the Django ORM for data queries, and this table has almost 2 million rows. I tried:
app_count = App.objects.count()
and
from django.db import connection
cursor = connection.cursor()
cursor.execute('''SELECT count(*) FROM app''')
The MySQL slow query log gave me:
Time: 2017-04-27T09:18:38.809498Z
User@Host: www[www] @ [172.19.0.3] Id: 5
Query_time: 4.107433 Lock_time: 0.004405 Rows_sent: 1 Rows_examined: 0
use app_platform; SET timestamp=1493284718; SELECT count(*) FROM app;
This query takes more than 4 seconds on average, but when I run the same query from the mysql client (the mysql shell):
mysql> select count(*) from app;
+----------+
| count(*) |
+----------+
|  1870019 |
+----------+
1 row in set (0.41 sec)
it takes only 0.4 seconds, a 10x difference. Why, and how can I improve it?
Edit
Here is my model:
class AppMain(models.Model):
    """
    """
    store = models.ForeignKey("AppStore", related_name="main_store")
    name = models.CharField(max_length=256)
    version = models.CharField(max_length=256, blank=True)
    developer = models.CharField(db_index=True, max_length=256, blank=True)
    md5 = models.CharField(max_length=256, blank=True)
    type = models.CharField(max_length=256, blank=True)
    size = models.IntegerField(blank=True)
    download = models.CharField(max_length=1024, blank=True)
    download_md5 = models.CharField(max_length=256, blank=True)
    download_times = models.BigIntegerField(blank=True)
    snapshot = models.CharField(max_length=2048, blank=True)
    description = models.CharField(max_length=5000, blank=True)
    app_update_time = models.DateTimeField(blank=True)
    create_time = models.DateTimeField(db_index=True, auto_now_add=True)
    update_time = models.DateTimeField(auto_now=True)

    class Meta:
        unique_together = ("store", "name", "version")
Edit 2
I'm using Docker and docker-compose for my project:
version: '2'
services:
  mysqldb:
    restart: always
    image: mysql:latest
    ports:
      - "3306:3306"
    environment:
      MYSQL_ROOT_PASSWORD: just_for_test
      MYSQL_USER: www
      MYSQL_PASSWORD: www
      MYSQL_DATABASE: app_platform
    volumes:
      - mysqldata:/var/lib/mysql
      - ./config/:/etc/mysql/conf.d
      - ./log/mysql/:/var/log/mysql/
  web:
    restart: always
    build: ./app_platform/app_platform
    env_file: .env
    environment:
      PYTHONPATH: '/usr/src/app/app_platform'
    command: bash -c "gunicorn --chdir /usr/src/app/app_platform app_platform.wsgi:application -k gevent -w 6 -b :8000 --timeout 8000 --reload"
    volumes:
      - ./app_platform:/usr/src/app
      - ./sqldata:/usr/src/sqldata
      - /usr/src/app/static
    ports:
      - "8000"
    dns:
      - 114.114.114.114
      - 8.8.8.8
    links:
      - mysqldb
  nginx:
    restart: always
    build: ./nginx/
    ports:
      - "80:80"
    volumes:
      - ./app_platform:/usr/src/app
      - ./nginx/sites-enabled/:/etc/nginx/sites-enabled
    links:
      - web:web
volumes:
  mysqldata:
My Django settings look like this:
import os
from django.utils.translation import ugettext_lazy as _

LANGUAGES = (
    ('en', _('English')),
    ('zh-CN', _('Chinese')),
)
LANGUAGE_CODE = 'zh-CN'
BASE_DIR = os.path.dirname(
    os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
LOCALE_PATHS = (
    os.path.join(BASE_DIR, "locale"),
)

# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = 'just_for_test'

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'app_scrapy',
    'app_user',
    'app_api',
    'app_check',
    'common',
    'debug_toolbar',
]

MIDDLEWARE_CLASSES = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    'django.middleware.locale.LocaleMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.auth.middleware.SessionAuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

AUTH_USER_MODEL = 'app_user.MyUser'
AUTHENTICATION_BACKENDS = (
    'app_user.models.CustomAuth', 'django.contrib.auth.backends.ModelBackend')
ROOT_URLCONF = 'app_platform.urls'

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': ["/usr/src/app/app_platform/templates"],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.template.context_processors.i18n',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]

WSGI_APPLICATION = 'app_platform.wsgi.application'
LOGIN_REDIRECT_URL = '/'
LOGIN_URL = '/login/'

# Database
# https://docs.djangoproject.com/en/1.9/ref/settings/#databases

# Password validation
# https://docs.djangoproject.com/en/1.9/ref/settings/#auth-password-validators
AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
    },
]

STATICFILES_FINDERS = (
    'django.contrib.staticfiles.finders.FileSystemFinder',
    'django.contrib.staticfiles.finders.AppDirectoriesFinder'
)

# Internationalization
# https://docs.djangoproject.com/en/1.9/topics/i18n/
TIME_ZONE = 'Asia/Shanghai'
USE_I18N = True
USE_L10N = True
USE_TZ = True

# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/1.9/howto/static-files/
STATIC_ROOT = "/static/"
STATIC_URL = '/static/'
STATICFILES_DIRS = (
    'public/static/',
)

DEBUG = True
ALLOWED_HOSTS = []

REST_FRAMEWORK = {
    'DEFAULT_AUTHENTICATION_CLASSES': (
        'rest_framework.authentication.BasicAuthentication',
        'rest_framework.authentication.SessionAuthentication',
    ),
    'DEFAULT_PERMISSION_CLASSES': (
        'rest_framework.permissions.AllowAny',
    ),
    'DEFAULT_PAGINATION_CLASS':
        'rest_framework.pagination.LimitOffsetPagination',
    'PAGE_SIZE': 5,
}

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'app_platform',
        'USER': 'www',
        'PASSWORD': 'www',
        'HOST': 'mysqldb',  # Or an IP Address that your DB is hosted on
        'PORT': '3306',
    }
}

DEBUG_TOOLBAR_CONFIG = {
    "SHOW_TOOLBAR_CALLBACK": lambda request: True,
}
My app table info:
CREATE TABLE `app` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(256) NOT NULL,
  `version` varchar(256) NOT NULL,
  `developer` varchar(256) NOT NULL,
  `md5` varchar(256) NOT NULL,
  `type` varchar(256) NOT NULL,
  `size` int(11) NOT NULL,
  `download` varchar(1024) NOT NULL,
  `download_md5` varchar(256) NOT NULL,
  `download_times` bigint(20) NOT NULL,
  `snapshot` varchar(2048) NOT NULL,
  `description` varchar(5000) NOT NULL,
  `app_update_time` datetime(6) NOT NULL,
  `create_time` datetime(6) NOT NULL,
  `update_time` datetime(6) NOT NULL,
  `store_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `app_store_id_6822fab1_uniq` (`store_id`,`name`,`version`),
  KEY `app_7473547c` (`store_id`),
  KEY `app_developer_b74bcd8e_uniq` (`developer`),
  KEY `app_create_time_a071d977_uniq` (`create_time`),
  CONSTRAINT `app_store_id_aef091c6_fk_app_scrapy_appstore_id` FOREIGN KEY (`store_id`) REFERENCES `app_scrapy_appstore` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1870020 DEFAULT CHARSET=utf8;
Edit 3
Here is EXPLAIN SELECT COUNT(*) FROM app;
mysql> EXPLAIN SELECT COUNT(*) FROM `app`;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                        |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Select tables optimized away |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
1 row in set, 1 warning (0.00 sec)
Edit 4
Here is my mysql.cnf:
innodb_read_io_threads=12
innodb_write_io_threads=12
innodb_io_capacity=300
innodb_read_io_threads=12
innodb_write_io_threads=12 #To stress the double write buffer
innodb_buffer_pool_size=3G
innodb_log_file_size = 32M #Small log files, more page flush
innodb_log_buffer_size=8M
innodb_flush_method=O_DIRECT
My Docker setup has 2 CPUs and 4 GB of memory.
Edit 5
When I run the ORM query in the Django shell, it takes only 0.5-1 second. So is the problem in the Docker settings? Or in the gunicorn settings?
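10X -- I like it. That exactly matches my Rule of Thumb: "If the data is not cached, the query will take 10 times as long as if it is cached." (Rick's RoTs)
But let's move on to the real question: "4.1s is too slow, what can I do about it?"
Change your app so that you don't need the number of rows. Have you noticed that search engines no longer say "out of 12345678 hits"?
Keep an estimate, rather than recomputing it.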
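One way to "keep an estimate" is to recompute the count only every few minutes and serve the remembered value in between. The following is a minimal sketch, not part of the original answer: the class name `CachedCount`, its `ttl` parameter, and the injectable `clock` are all hypothetical, and the expensive query is passed in as a plain callable so the sketch does not depend on Django.

```python
import time
from typing import Callable, Optional


class CachedCount:
    """Serve a row count from memory, refreshing it at most once per `ttl` seconds."""

    def __init__(self, fetch: Callable[[], int], ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch    # the expensive call, e.g. a COUNT(*) query
        self._ttl = ttl        # how long a stale value is acceptable
        self._clock = clock    # injectable so the behaviour can be tested
        self._value: Optional[int] = None
        self._expires_at = 0.0

    def get(self) -> int:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # Cache is empty or stale: pay for the real query once.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

In the question's project this might be wired up as `app_count = CachedCount(lambda: App.objects.count(), ttl=600)`, with views calling `app_count.get()`. The value can be up to `ttl` seconds old, and with 6 gunicorn workers each process would hold its own copy unless a shared cache is used instead.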
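Let's see EXPLAIN SELECT COUNT(*) FROM app; it might give more clues. (In one place you say app, in another you say app_scrapy_appmain. Are they the same?)
As long as you never DELETE any rows, this will give you the same answer: SELECT MAX(id) FROM app, and it will run "instantly". (Once a DELETE, ROLLBACK, etc. occurs, ids will be lost, so COUNT will be less than MAX.)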
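The caveat about lost ids is easy to see on a toy table. This sketch uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere); the four-row `app` table and the helper `count_and_max` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO app (name) VALUES (?)", [("a",), ("b",), ("c",), ("d",)]
)


def count_and_max(connection):
    """Return (COUNT(*), MAX(id)) for the toy app table."""
    return connection.execute("SELECT COUNT(*), MAX(id) FROM app").fetchone()


before = count_and_max(conn)   # no gaps yet, so both numbers agree
conn.execute("DELETE FROM app WHERE id = 2")
after = count_and_max(conn)    # one row gone, but the highest id is unchanged

print(before, after)
```

After the delete, MAX(id) overstates the row count by one, which is why this shortcut only suits tables that are append-only or where an approximate number is acceptable.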
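More
innodb_buffer_pool_size=3G is probably too much on only 4GB of RAM. If MySQL swaps, performance will become really bad. I suggest only 2G, unless you can see that it is not swapping.
Note: Scanning 1.8M rows is destined to take at least 0.4s on that hardware, or possibly on any hardware. It takes time to do the task. Also, running a 'long' query interferes with other tasks in two ways: it consumes CPU and/or I/O while it runs, and it may bump other blocks out of the cache, causing other queries to slow down. So I really think the 'right' thing to do is to heed my hints on avoiding COUNT(*). Here's another one: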
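- Build and maintain a "Summary Table" of daily subtotals for this (and other) tables. Include in it the daily COUNT(*) and whatever else you might want. This will even shrink the 0.4s timing, by using SUM(subtotal) from this table. More on Summary Tables.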
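The summary-table idea can be sketched as follows. Again sqlite3 stands in for MySQL so the example is self-contained, and the table name `app_daily_summary`, its columns, and both helper functions are hypothetical. A production job would normally append only the newest day's subtotal rather than rebuild everything, as this toy version does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    create_time TEXT NOT NULL
);
CREATE TABLE app_daily_summary (
    day TEXT PRIMARY KEY,
    subtotal INTEGER NOT NULL
);
""")
conn.executemany(
    "INSERT INTO app (name, create_time) VALUES (?, ?)",
    [
        ("a", "2017-04-25 10:00:00"),
        ("b", "2017-04-25 18:30:00"),
        ("c", "2017-04-26 09:15:00"),
        ("d", "2017-04-27 09:18:38"),
    ],
)


def refresh_summary(connection):
    """Rebuild the per-day subtotals from the big table (the slow, offline step)."""
    connection.execute("DELETE FROM app_daily_summary")
    connection.execute("""
        INSERT INTO app_daily_summary (day, subtotal)
        SELECT date(create_time), COUNT(*) FROM app GROUP BY date(create_time)
    """)


def total_from_summary(connection):
    """Answer 'how many rows?' by summing a handful of subtotals (the fast step)."""
    return connection.execute(
        "SELECT COALESCE(SUM(subtotal), 0) FROM app_daily_summary"
    ).fetchone()[0]


refresh_summary(conn)
total = total_from_summary(conn)
per_day = conn.execute(
    "SELECT day, subtotal FROM app_daily_summary ORDER BY day"
).fetchall()
print(total, per_day)
```

The point of the design is that the page-serving query touches one row per day instead of 1.8M rows; the cost of counting is moved to a scheduled job.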