Selective comparison of class objects
I need to compare class objects many times. However, only the values of selected fields should be compared, i.e.:
class Class:
    def __init__(self, value1, value2, value3, dummy_value):
        self.field1 = value1
        self.field2 = value2
        self.field3 = value3
        self.irrelevant_field = dummy_value

obj1 = Class(1, 2, 3, 'a')
obj2 = Class(1, 2, 3, 'b')  # compare(obj1, obj2) = True
obj3 = Class(1, 2, 4, 'a')  # compare(obj1, obj3) = False
Currently I do it like this:
def dumm_compare(obj1, obj2):
    if obj1.field1 != obj2.field1:
        return False
    if obj1.field2 != obj2.field2:
        return False
    if obj1.field3 != obj2.field3:
        return False
    return True
Since my real number of relevant fields is greater than 10, this approach leads to quite bulky code. That is why I tried something like this:
def cute_compare(obj1, obj2):
    for field in filter(lambda x: x.startswith('field'), dir(obj1)):
        if getattr(obj1, field) != getattr(obj2, field):
            return False
    return True
The code is compact; however, performance suffers greatly:
import time

starttime = time.time()
for i in range(100000):
    dumm_compare(obj1, obj2)
print('Dumm compare runtime: {:.3f} s'.format(time.time() - starttime))

starttime = time.time()
for i in range(100000):
    cute_compare(obj1, obj2)
print('Cute compare runtime: {:.3f} s'.format(time.time() - starttime))

# Dumm compare runtime: 0.046 s
# Cute compare runtime: 1.603 s
Is there a more efficient way to implement selective object comparison?
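Edit: In fact, I need several such functions (which compare objects by different, sometimes overlapping, sets of fields). That is why I do not want to override the built-in class methods.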
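dir() does not only list instance attributes; it also traverses the class hierarchy. As such, it does far more work than is needed here; dir() is really only suitable for debugging tasks.

Stick to using vars() instead, perhaps combined with all():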
def faster_compare(obj1, obj2):
    obj2_vars = vars(obj2)
    return all(value == obj2_vars[field]
               for field, value in vars(obj1).items() if field.startswith('field'))
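vars() returns a dictionary containing only the instance attributes; in the generator expression above, I use the dict.items() method to access both the attribute name and its value in one step.

I replaced the getattr() call for obj2 with the same dictionary approach. This saves a frame-stack push and pop each time, because the key lookup can be handled entirely in bytecode (C code). Note that this does assume you are not using properties; only actual instance attributes are listed.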
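To make the caveat about properties concrete, here is a minimal sketch. The Point class and its attribute names are made up for illustration and are not part of the question; the point is only which names vars() and dir() report.

```python
class Point:
    kind = "point"  # class attribute: reported by dir(), not by vars()

    def __init__(self, x):
        self.field_x = x  # instance attribute: reported by both

    @property
    def field_double(self):
        # Computed on access; stored on the class, not in the instance __dict__.
        return self.field_x * 2


p = Point(3)
instance_names = set(vars(p))
dir_names = set(dir(p))

print(instance_names)                    # {'field_x'}
print('field_double' in dir_names)       # True
print('field_double' in instance_names)  # False
```

So a vars()-based comparison would silently skip field_double, even though its name starts with 'field'.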
Compared with the hard-coded if branches, faster_compare still has to do more work, but at least it does not perform all that badly:
>>> from timeit import timeit
>>> timeit('compare(obj1, obj2)', 'from __main__ import obj1, obj2, dumm_compare as compare')
0.349234500026796
>>> timeit('compare(obj1, obj2)', 'from __main__ import obj1, obj2, cute_compare as compare')
16.48695448896615
>>> timeit('compare(obj1, obj2)', 'from __main__ import obj1, obj2, faster_compare as compare')
1.9555692840367556
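Since the question asks for several comparison functions over different, sometimes overlapping, field sets, one further option is to name the fields explicitly and fetch them with operator.attrgetter. This is a sketch of an alternative, not something benchmarked in this thread; make_comparer is a hypothetical helper name.

```python
from operator import attrgetter


class Class:
    def __init__(self, value1, value2, value3, dummy_value):
        self.field1 = value1
        self.field2 = value2
        self.field3 = value3
        self.irrelevant_field = dummy_value


def make_comparer(*field_names):
    """Build a comparison function for one fixed set of attribute names.

    make_comparer is a hypothetical helper, not part of the answers above.
    """
    getter = attrgetter(*field_names)  # resolved once, reused on every call

    def compare(a, b):
        # With two or more names, getter returns a tuple of values.
        return getter(a) == getter(b)

    return compare


compare_all = make_comparer('field1', 'field2', 'field3')
compare_first_two = make_comparer('field1', 'field2')

obj1 = Class(1, 2, 3, 'a')
obj2 = Class(1, 2, 3, 'b')
obj3 = Class(1, 2, 4, 'a')

print(compare_all(obj1, obj2))        # True
print(compare_all(obj1, obj3))        # False
print(compare_first_two(obj1, obj3))  # True
```

Unlike the loop-based versions, this fetches every listed attribute from both objects before comparing, so there is no early exit while fetching. On the other hand, attrgetter uses normal attribute lookup, so it also works with properties.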
If the fields exist on all instances in a particular comparison set, try saving the list of fields to compare on the class.
def prepped_compare(obj1, obj2):
    li_field = getattr(obj1, "li_field", None)
    if li_field is None:
        # grab the list from the compare object, but this assumes a
        # fixed field list per run.
        # mind you, getattr(obj, nonexistent_field) blows up anyway,
        # so you are making that assumption already
        li_field = [f for f in vars(obj1) if f.startswith('field')]
        obj1.__class__.li_field = li_field

    for field in li_field:
        if getattr(obj1, field) != getattr(obj2, field):
            return False
    return True
Or, better, precompute the list outside the function:
import time

def prepped_compare2(obj1, obj2, li_field):
    for field in li_field:
        if getattr(obj1, field) != getattr(obj2, field):
            return False
    return True

starttime = time.time()
# build the field list once, outside the timed loop body
li_field = [f for f in vars(obj1) if f.startswith('field')]
for i in range(100000):
    prepped_compare2(obj1, obj2, li_field)
print('prepped2 compare runtime: {:.3f} s'.format(time.time() - starttime))
Output:
Dumm compare runtime: 0.051 s
Cute compare runtime: 0.762 s
prepped compare runtime: 0.122 s
prepped2 compare runtime: 0.093 s
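The timings above come from hand-rolled time.time() loops. As a rough sketch of how the same measurement could be taken with the standard timeit module, the block below times prepped_compare2 over the same 100000 calls. The absolute numbers depend on the machine, so none are shown.

```python
from timeit import timeit


class Class:
    def __init__(self, value1, value2, value3, dummy_value):
        self.field1 = value1
        self.field2 = value2
        self.field3 = value3
        self.irrelevant_field = dummy_value


def prepped_compare2(obj1, obj2, li_field):
    for field in li_field:
        if getattr(obj1, field) != getattr(obj2, field):
            return False
    return True


obj1 = Class(1, 2, 3, 'a')
obj2 = Class(1, 2, 3, 'b')

# Field list computed once, outside the timed code.
li_field = [f for f in vars(obj1) if f.startswith('field')]

# timeit accepts a zero-argument callable; number sets how many calls are timed.
elapsed = timeit(lambda: prepped_compare2(obj1, obj2, li_field), number=100000)
print('prepped2 compare runtime: {:.3f} s'.format(elapsed))
```

The lambda adds one extra call per iteration, so the result is slightly pessimistic compared with calling the function directly in a loop.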
Re: overriding __eq__, I'm pretty sure you would end up with something like this:
def mycomp01(self, obj2): ...  # possibly with a saved field list01 on the class
def mycomp02(self, obj2): ...  # possibly with a saved field list02 on the class

# let's do comp01.
Class.__eq__ = mycomp01
# run comp01 tests
Class.__eq__ = mycomp02
# run comp02 tests