Combine two unrelated tables/models with same primary key in Django
I have two unrelated tables that share the same primary key.

    ip            mac
    11.11.11.11   48-C0-09-1F-9B-54
    33.33.33.33   4E-10-A3-BC-B8-9D
    44.44.44.44   CD-00-60-08-56-2A
    55.55.55.55   23-CE-D3-B1-39-A6

    ip            type      owner
    22.22.22.22   laptop    John Doe
    33.33.33.33   server    XYZ Department
    44.44.44.44   VM        Mary Smith
    66.66.66.66   printer   ZWV Department
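The first table is refreshed automatically every minute. I can't change the database structure or the script that populates it.
Both tables have ip as the primary key.
In a view, I would like to display a table like this:

    ip            mac                 type      owner            Alert
    11.11.11.11   48-C0-09-1F-9B-54                              Unauthorized
    55.55.55.55   23-CE-D3-B1-39-A6                              Unauthorized
    22.22.22.22                       laptop    John Doe         Down
    66.66.66.66                       printer   ZWV Department   Down
    33.33.33.33   4E-10-A3-BC-B8-9D   server    XYZ Department   OK
    44.44.44.44   CD-00-60-08-56-2A   VM        Mary Smith       OK

How should I model this? Should I make one of the two primary keys a foreign key to the other?
Once the code is in production there will be a lot of data, so I want to make sure it is fast enough.
What is the fastest way to retrieve the data?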
Update:
I tried using a OneToOneField on the second table.
That lets me get the records that are in both tables, plus the records for unauthorized devices (IPs missing from the second table):

    ip            mac                 type      owner            Alert
    11.11.11.11   48-C0-09-1F-9B-54                              Unauthorized
    55.55.55.55   23-CE-D3-B1-39-A6                              Unauthorized
    33.33.33.33   4E-10-A3-BC-B8-9D   server    XYZ Department   OK
    44.44.44.44   CD-00-60-08-56-2A   VM        Mary Smith       OK

But I can't get the devices that are down (IPs missing from the first table):

    22.22.22.22                       laptop    John Doe         Down
    66.66.66.66                       printer   ZWV Department   Down

I asked for help here, but it seems this can't be done with OneToOneField.
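Since ip is the primary key and the first table is updated frequently, I suggest changing the second table so that its ip becomes a OneToOneField pointing at the first table's ip.
Your models should look like this:

    from django.db import models

    class ModelA(models.Model):
        ip = models.GenericIPAddressField(unique=True)
        mac = models.CharField(max_length=17, null=True, blank=True)

    class ModelB(models.Model):
        # on_delete is required since Django 2.0; CASCADE was the implicit default before that
        ip = models.OneToOneField(ModelA, on_delete=models.CASCADE)
        # max_length values below are illustrative
        type = models.CharField(max_length=50)
        owner = models.CharField(max_length=100)

You can also set up the one-to-one relationship with a separate column:

    class ModelB(models.Model):
        ip = models.GenericIPAddressField(unique=True)
        type = models.CharField(max_length=50)
        owner = models.CharField(max_length=100)
        modelA = models.OneToOneField(ModelA, on_delete=models.CASCADE)

So now you can keep the ip address as the primary key and still reference the ModelA table through the modelA field.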
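Once you have a value from one of the two tables, just query the other one by its id. Since the two tables are separate, you have to run an extra query. You don't need to create an explicit relationship, because you are looking the row up by its "id/ip". So once you have the first value, called first_object, just look for its counterpart in the other table:

    # pk resolves to the primary key (ip here); get() raises ModelB.DoesNotExist if there is no match
    other_columns = ModelB.objects.get(pk=first_object.pk)

Then, if you just want to "add" the needed columns to the other model and pass a single object on to wherever you need it:

    first_object.attr1 = other_columns.attr1
    ...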
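The per-object lookup above costs one extra query per row. As a rough sketch of the same idea done in bulk, assume each table has already been loaded into a dict keyed by ip (in Django that might come from something like `in_bulk()` or `.values()`). The dict layout and the `merge_devices` helper are hypothetical names used only for illustration. The merge and the Alert column from the question could then be computed in plain Python:

```python
def merge_devices(operational, allowed):
    """Full outer merge of two dicts keyed by ip.

    operational: {ip: mac}            (first table, refreshed every minute)
    allowed:     {ip: (type, owner)}  (second table)
    """
    rows = []
    for ip in sorted(operational.keys() | allowed.keys()):
        mac = operational.get(ip)
        dev_type, owner = allowed.get(ip, (None, None))
        if ip not in allowed:
            alert = "Unauthorized"  # seen on the network but not registered
        elif ip not in operational:
            alert = "Down"          # registered but not currently seen
        else:
            alert = "OK"
        rows.append(
            {"ip": ip, "mac": mac, "type": dev_type, "owner": owner, "alert": alert}
        )
    return rows


# Sample data taken from the question
operational = {
    "11.11.11.11": "48-C0-09-1F-9B-54",
    "33.33.33.33": "4E-10-A3-BC-B8-9D",
    "44.44.44.44": "CD-00-60-08-56-2A",
    "55.55.55.55": "23-CE-D3-B1-39-A6",
}
allowed = {
    "22.22.22.22": ("laptop", "John Doe"),
    "33.33.33.33": ("server", "XYZ Department"),
    "44.44.44.44": ("VM", "Mary Smith"),
    "66.66.66.66": ("printer", "ZWV Department"),
}

merged = merge_devices(operational, allowed)
for row in merged:
    print(row)
```

This keeps the database work to two simple reads, at the cost of holding both tables in memory, so it may not suit very large tables.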
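General idea
You can use qs.union:
- Create 2 models without any relationship between them. Don't forget to use
  class Meta: managed = False
- Select from the first model, annotate it with a subquery, and union it with the second: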
    from django.db import models
    from django.db.models import F, OuterRef, Subquery, Value
    from django.db.models.functions import Coalesce

    # OperationalDevice fields: ip, mac
    # AllowedDevice fields: ip, type, owner

    USE_EMPTY_STR_AS_DEFAULT = True
    null_char_field = models.CharField(null=True)

    if USE_EMPTY_STR_AS_DEFAULT:
        default_value = ''
    else:
        default_value = None

    # By default Expressions treat strings as "field_name" so if you want to use
    # empty string as a second argument for Coalesce, then you should wrap it in
    # `Value()`.
    # `None` can be used there without wrapping in `Value()`, but in
    # `.annotate(type=NoneValue)` it still should be wrapped, so it's easier to
    # just "always wrap".
    default_value = Value(default_value, output_field=null_char_field)

    operational_devices_subquery = OperationalDevice.objects.filter(ip=OuterRef('ip'))

    qs1 = (
        AllowedDevice.objects
        .all()
        .annotate(
            mac=Coalesce(
                Subquery(operational_devices_subquery.values('mac')[:1]),
                default_value,
                output_field=null_char_field,
            ),
        )
    )

    qs2 = (
        OperationalDevice.objects
        .exclude(
            ip__in=qs1.values('ip'),
        )
        .annotate(
            type=default_value,
            owner=default_value,
        )
    )

    # NOTE: union() matches columns by position, not by name. Inspect
    # str(final_qs.query) to confirm that both querysets select their columns
    # in the same order before relying on the result.
    final_qs = qs1.union(qs2)
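To make the shape of the combined query concrete, here is a minimal sketch that uses Python's built-in sqlite3 module rather than the Django ORM. The table names `operational_device` and `allowed_device` are assumptions for illustration, and the `alert` column is added from the question's desired output. This is not the SQL Django would generate, only the same "left join plus union" idea with both halves selecting their columns in the same order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operational_device (ip TEXT PRIMARY KEY, mac TEXT);
    CREATE TABLE allowed_device (ip TEXT PRIMARY KEY, type TEXT, owner TEXT);
""")
conn.executemany(
    "INSERT INTO operational_device VALUES (?, ?)",
    [
        ("11.11.11.11", "48-C0-09-1F-9B-54"),
        ("33.33.33.33", "4E-10-A3-BC-B8-9D"),
        ("44.44.44.44", "CD-00-60-08-56-2A"),
        ("55.55.55.55", "23-CE-D3-B1-39-A6"),
    ],
)
conn.executemany(
    "INSERT INTO allowed_device VALUES (?, ?, ?)",
    [
        ("22.22.22.22", "laptop", "John Doe"),
        ("33.33.33.33", "server", "XYZ Department"),
        ("44.44.44.44", "VM", "Mary Smith"),
        ("66.66.66.66", "printer", "ZWV Department"),
    ],
)

COMBINED_SQL = """
    -- every allowed device, with its mac when it is currently seen
    SELECT a.ip, COALESCE(o.mac, '') AS mac, a.type, a.owner,
           CASE WHEN o.ip IS NULL THEN 'Down' ELSE 'OK' END AS alert
    FROM allowed_device AS a
    LEFT JOIN operational_device AS o ON o.ip = a.ip
    UNION ALL
    -- devices that are seen but not allowed
    SELECT o.ip, o.mac, '' AS type, '' AS owner, 'Unauthorized' AS alert
    FROM operational_device AS o
    WHERE o.ip NOT IN (SELECT ip FROM allowed_device)
    ORDER BY 1
"""

rows = conn.execute(COMBINED_SQL).fetchall()
for row in rows:
    print(row)
```

Both SELECT lists are written as (ip, mac, type, owner, alert) on purpose, since a UNION pairs columns by position.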
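Generic approach for multiple fields
A more complex but "universal" approach can use Model._meta.get_fields(). It is easier to use when the "second" model has more than one extra field (not only ip, mac). Example code (untested, but it gives the general idea):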
    # One more import:
    from django.db.models.fields import NOT_PROVIDED

    common_field_name = 'ip'
    # OperationalDevice fields: ip, mac, some_more_fields ...
    # AllowedDevice fields: ip, type, owner

    operational_device_fields = OperationalDevice._meta.get_fields()
    operational_device_fields_names = {_f.name for _f in operational_device_fields}  # or set((_f.name for ...))

    allowed_device_fields = AllowedDevice._meta.get_fields()
    allowed_device_fields_names = {_f.name for _f in allowed_device_fields}  # or set((_f.name for ...))

    operational_devices_subquery = OperationalDevice.objects.filter(ip=OuterRef(common_field_name))

    left_joined_qs = (  # "Kind-of". Assuming AllowedDevice to be "left" and OperationalDevice to be "right"
        AllowedDevice.objects
        .all()
        .annotate(
            **{
                _f.name: Coalesce(
                    # [:1] limits the subquery to a single row ([1] would index it instead)
                    Subquery(operational_devices_subquery.values(_f.name)[:1]),
                    Value(_f.get_default()),  # Use defaults from model definition
                    output_field=_f,
                )
                for _f in operational_device_fields
                if _f.name not in allowed_device_fields_names
                # NOTE: if fields other than `ip` "overlap", then you might consider
                # changing logic here. Current implementation keeps fields from the
                # AllowedDevice
            }
            # Unpacked dict is partially equivalent to this:
            # mac=Coalesce(
            #     Subquery(operational_devices_subquery.values('mac')[:1]),
            #     default_for_mac_eg_fallback_text_value,
            #     output_field=null_char_field,
            # ),
            # other_field = Coalesce(...),
            # ...
        )
    )

    lonely_right_rows_qs = (
        OperationalDevice.objects
        .exclude(
            ip__in=AllowedDevice.objects.all().values(common_field_name),
        )
        .annotate(
            **{
                _f.name: Value(_f.get_default(), output_field=_f)  # Use defaults from model definition
                for _f in allowed_device_fields
                if _f.name not in operational_device_fields_names
                # NOTE: See previous NOTE
            }
        )
    )

    final_qs = left_joined_qs.union(lonely_right_rows_qs)
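Because a union pairs columns by position, it can help to look at the set arithmetic behind the two `annotate(**{...})` calls in isolation. The following is a plain-Python sketch, not Django API: the field lists, the extra `last_seen` field, its sample value, and the `pad_row` helper are all hypothetical. It works out which columns each side is missing and pads rows to one shared column order:

```python
COMMON_FIELD = "ip"

# Hypothetical field lists; in Django the names could come from Model._meta.get_fields()
operational_fields = ["ip", "mac", "last_seen"]
allowed_fields = ["ip", "type", "owner"]

# Columns each side lacks and therefore has to fill with a default
missing_on_allowed = [f for f in operational_fields if f not in allowed_fields]
missing_on_operational = [f for f in allowed_fields if f not in operational_fields]

# One shared column order for both halves of the union
column_order = (
    [COMMON_FIELD]
    + missing_on_allowed
    + [f for f in allowed_fields if f != COMMON_FIELD]
)


def pad_row(row, default=""):
    """Return the row's values in the shared column order, filling gaps with default."""
    return tuple(row.get(name, default) for name in column_order)


allowed_row = {"ip": "22.22.22.22", "type": "laptop", "owner": "John Doe"}
operational_row = {
    "ip": "11.11.11.11",
    "mac": "48-C0-09-1F-9B-54",
    "last_seen": "2020-01-01",  # made-up value for the hypothetical field
}

print(column_order)
print(pad_row(allowed_row))
print(pad_row(operational_row))
```

The two comprehensions mirror the `if _f.name not in ..._fields_names` filters in the untested code above; the shared `column_order` is the part that the ORM version leaves implicit.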
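Using OneToOneField for "better" SQL
In theory you can use device_info = models.OneToOneField(OperationalDevice, db_column='ip', primary_key=True, related_name='status_info') in AllowedDevice. (On Django 2.0 and later, OneToOneField also requires an on_delete argument.) In that case your first QS can be defined without Subquery: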
    from django.db.models import F

    # Now 'ip' is not in field names ('device_info' is there), so add it:
    allowed_device_fields_names.add(common_field_name)

    # NOTE: I think this approach will result in a more compact SQL query without
    # multiple `(SELECT "some_field" FROM device_info_table ... ) as "some-field"`.
    # This also might result in better query performance.
    honest_join_qs = (
        AllowedDevice.objects
        .all()
        .annotate(
            **{
                _f.name: F(f'device_info__{_f.name}')
                for _f in operational_device_fields
                if _f.name not in allowed_device_fields_names
            }
        )
    )

    final_qs = honest_join_qs.union(lonely_right_rows_qs)
    # or:
    # final_qs = honest_join_qs.union(
    #     OperationalDevice.objects.filter(status_info__isnull=True).annotate(**missing_fields_annotation)
    # )
    # I'm not sure which approach is better performance-wise...
    # The commented one will use something like:
    # `SELECT ... FROM "device_info_table" LEFT OUTER JOIN "status_info_table" ON ("device_info_table"."ip" = "status_info_table"."ip") WHERE "status_info_table"."ip" IS NULL`
    #
    # So it might be a little better than the first one with `union(QS.exclude(ip__in=honest_join_qs.values('ip')))`,
    # because the latter uses SQL like this:
    # `SELECT ... FROM "device_info_table" WHERE NOT ip IN (SELECT ip FROM "status_info_table")`
    #
    # But it's better to measure timings of both approaches to be sure.
    # @GrannyAching, can you compare them and tell in the comments which one is better ?
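The comments above suggest measuring both anti-join forms. One minimal way to try that outside Django, assuming SQLite, synthetic data, and the table names `device_info_table` and `status_info_table` borrowed from those comments, is to check that both queries return the same rows and then time them. Timings on a toy in-memory table say little about a production database, so treat this only as a template for the measurement:

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE device_info_table (ip TEXT PRIMARY KEY, mac TEXT);
    CREATE TABLE status_info_table (ip TEXT PRIMARY KEY, type TEXT, owner TEXT);
""")
# Synthetic data: 2000 seen devices, every other one is also registered
conn.executemany(
    "INSERT INTO device_info_table VALUES (?, ?)",
    [(f"10.0.{i // 256}.{i % 256}", f"mac-{i}") for i in range(2000)],
)
conn.executemany(
    "INSERT INTO status_info_table VALUES (?, ?, ?)",
    [(f"10.0.{i // 256}.{i % 256}", "server", "owner") for i in range(0, 2000, 2)],
)

NOT_IN_SQL = """
    SELECT d.ip FROM device_info_table AS d
    WHERE d.ip NOT IN (SELECT ip FROM status_info_table)
"""
LEFT_JOIN_SQL = """
    SELECT d.ip FROM device_info_table AS d
    LEFT OUTER JOIN status_info_table AS s ON d.ip = s.ip
    WHERE s.ip IS NULL
"""


def run(sql):
    """Execute the query and return the matching ips in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))


not_in_rows = run(NOT_IN_SQL)
left_join_rows = run(LEFT_JOIN_SQL)

for name, sql in (("NOT IN", NOT_IN_SQL), ("LEFT JOIN", LEFT_JOIN_SQL)):
    seconds = timeit.timeit(lambda: run(sql), number=20)
    print(f"{name}: {seconds:.4f}s for 20 runs")
```

On a real database the same comparison is better done with that database's own query plan and timing tools, against tables of realistic size.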
P.S. To automate the model definitions you can use manage.py inspectdb.
P.P.S. Perhaps multi-table inheritance with a custom OneToOneField(..., parent_link=True) would be more helpful for you than using union.