How to use Python collections for custom classes
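I'm still somewhat perplexed by Python and its magical functional programming, so I tend to find myself writing code that leans more toward the Java paradigm than toward idiomatic Python.
My question is somewhat related to: How do I make a custom class a collection in Python
The only difference is that I have nested objects (using composition). A VirtualPage object is composed of a list of PhysicalPage objects. I have a function that takes a list of PhysicalPage objects and coalesces all of the details into a single named tuple I call PageBoundary. Essentially it is a serialization function that spits out a tuple made up of an integer range representing the physical page and the line number within the page. From this I could easily sort and order VirtualPages relative to each other (that's the idea, at least):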
PageBoundary = collections.namedtuple('PageBoundary', 'begin end')
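I also have a function that takes a PageBoundary namedtuple and deserializes, or expands, the tuple into a list of PhysicalPages. Preferably these two data storage classes should not change, since that would break any downstream code.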
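For illustration, the packing and unpacking described above could be sketched as follows. The helper names encode and decode are hypothetical (the question does not show the real functions), and the 12-digit line-number width is taken from the _LINE_PAD constant in the snippet further down.

```python
from collections import namedtuple

PageBoundary = namedtuple('PageBoundary', 'begin end')

LINE_PAD = 12  # digits reserved for the line number (assumed, mirrors _LINE_PAD)


def encode(page_num, line_num):
    """Pack a page number and a line number into one sortable integer."""
    return page_num * 10 ** LINE_PAD + line_num


def decode(value):
    """Split a packed integer back into (page_num, line_num)."""
    return divmod(value, 10 ** LINE_PAD)


# A boundary running from page 3, line 17 to page 3, line 42.
boundary = PageBoundary(encode(3, 17), encode(3, 42))
```

Because the page number occupies the high-order digits, plain integer comparison orders the packed values by page first and by line second.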
Here is a snippet of my custom Python 2.7 classes. VirtualPage is composed of a lot of things, one of which is a list of PhysicalPage objects:
class VirtualPage(object):
    def __init__(self, _physical_pages=list()):
        self.physical_pages = _physical_pages

class PhysicalPage(object):
    # class variables: number of digits each attribute gets
    _PAGE_PAD, _LINE_PAD = 10, 12

    def __init__(self, _page_num=-1):
        self.page_num = _page_num
        self.begin_line_num = -1
        self.end_line_num = -1

    def get_canonical_begin(self):
        # assumption: the snippet's undefined tmp_line_num is the begin line number
        return int(''.join([str(self.page_num).zfill(PhysicalPage._PAGE_PAD),
                            str(self.begin_line_num).zfill(PhysicalPage._LINE_PAD)]))

    def get_canonical_end(self):
        pass  # see get_canonical_begin() implementation

    def get_canonical_page_boundaries(self):
        return PageBoundary(self.get_canonical_begin(), self.get_canonical_end())
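I would like to use some templated collection (from the Python collections module) to easily sort and compare lists or sets of VirtualPage objects. I would also like some advice on the layout of my data storage classes: VirtualPage and PhysicalPage.
Given either a sequence of VirtualPages or, as in the example below: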
vp_1 = VirtualPage(list_of_physical_pages)
vp_1_copy = VirtualPage(list_of_physical_pages)
vp_2 = VirtualPage(list_of_other_physical_pages)
I want to easily answer questions like these:
>>> vp_2 in vp_1
False
>>> vp_2 < vp_1
True
>>> vp_1 == vp_1_copy
True
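Right off the bat it seems obvious that the VirtualPage class needs to call get_canonical_page_boundaries, or even implement the function itself. At a minimum it should loop over its PhysicalPage list to implement the required functions (__lt__() and __eq__()) so that I can compare between VirtualPages.
1.) Currently I'm struggling to implement some of the comparison functions. One big obstacle is how to compare a tuple. Do I create my own __lt__() function by writing a custom class that extends some type of collection: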
import collections as col
import functools

@functools.total_ordering
class AbstractVirtualPageContainer(col.MutableSet):
    def __lt__(self, other):
        '''What type would other be?
        Make comparison by first normalizing to a comparable type: PageBoundary
        '''
        pass
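2.) Should the implementation of the comparison functions live in the VirtualPage class instead?
I was leaning toward some type of Set data structure, since the data I'm modeling has a notion of uniqueness: physical page values cannot overlap, and to some extent they act as a linked list. Would setter or getter functions, implemented via @ decorator functions, be of any use here as well?
I think you want something like the code below. It is untested, and certainly not tested against your application or your data, YMMV, etc.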
from collections import namedtuple

# PageBoundary is a subclass of named tuple with special relational
# operators. __le__ and __ge__ are not overridden because they don't
# make sense for this class (the versions inherited from tuple still apply).
class PageBoundary(namedtuple('PageBoundary', 'begin end')):
    # to prevent making an instance dict (See namedtuple docs)
    __slots__ = ()

    def __lt__(self, other):
        # strictly before: this range ends before the other one begins
        return self.end < other.begin

    def __eq__(self, other):
        # you can put in an assertion if you are concerned the
        # method might be called with the wrong type object
        assert isinstance(other, PageBoundary), "Wrong type for other"
        return self.begin == other.begin and self.end == other.end

    def __ne__(self, other):
        return not self == other

    def __gt__(self, other):
        return other < self
class PhysicalPage(object):
    # class variables: number of digits each attribute gets
    _PAGE_PAD, _LINE_PAD = 10, 12

    # assumption: the line numbers are passed in here; the original sketch
    # used an undefined tmp_line_num and left _end uncalculated
    def __init__(self, page_num, begin_line_num=-1, end_line_num=-1):
        self.page_num = page_num
        self.begin_line_num = begin_line_num
        self.end_line_num = end_line_num
        # single leading underscore is 'private' by convention
        # not enforced by the language
        self._begin = self.page_num * 10**PhysicalPage._LINE_PAD + begin_line_num
        # assumption: _end mirrors _begin, using the end line number
        self._end = self.page_num * 10**PhysicalPage._LINE_PAD + end_line_num

    # this serves the purpose of a `getter`, but looks just like
    # a normal class member access. used like x = page.begin
    @property
    def begin(self):
        return self._begin

    @property
    def end(self):
        return self._end

    def __lt__(self, other):
        assert isinstance(other, PhysicalPage)
        return self._end < other._begin

    def __eq__(self, other):
        assert isinstance(other, PhysicalPage)
        # parenthesized so that two tuples are compared; without the
        # parentheses the expression builds a 3-tuple, which is always truthy
        return (self._begin, self._end) == (other._begin, other._end)

    def __ne__(self, other):
        return not self == other

    def __gt__(self, other):
        return other < self
class VirtualPage(object):
    def __init__(self, physical_pages=None):
        # None as the default avoids sharing one list between instances
        self.physical_pages = sorted(physical_pages) if physical_pages else []

    def __lt__(self, other):
        if self.physical_pages and other.physical_pages:
            return self.physical_pages[-1].end < other.physical_pages[0].begin
        else:
            raise ValueError

    def __eq__(self, other):
        if self.physical_pages and other.physical_pages:
            return self.physical_pages == other.physical_pages
        else:
            raise ValueError

    def __ne__(self, other):
        # Python 2 does not derive != from __eq__
        return not self == other

    def __gt__(self, other):
        return other < self
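The question also asks for `vp_2 in vp_1`, which the code above does not cover. One possible sketch is to add __contains__ to VirtualPage. Note the assumptions: "in" is taken here to mean that the other page's range lies entirely inside this page's range, which the question never defines, and Page is a hypothetical stand-in for PhysicalPage that carries only begin and end.

```python
class Page(object):
    """Hypothetical stand-in for PhysicalPage: a begin/end pair of integers."""

    def __init__(self, begin, end):
        self.begin = begin
        self.end = end


class VirtualPage(object):
    def __init__(self, physical_pages=None):
        pages = physical_pages or []
        # Sort by begin so the first and last pages bound the whole range.
        self.physical_pages = sorted(pages, key=lambda p: p.begin)

    def __contains__(self, other):
        """True if other's range lies entirely within this page's range."""
        if not self.physical_pages or not other.physical_pages:
            return False
        return (self.physical_pages[0].begin <= other.physical_pages[0].begin
                and other.physical_pages[-1].end <= self.physical_pages[-1].end)


vp_1 = VirtualPage([Page(100, 199), Page(200, 299)])
vp_2 = VirtualPage([Page(10, 50)])    # lies entirely before vp_1
vp_3 = VirtualPage([Page(120, 180)])  # lies inside vp_1
```

With these values, `vp_2 in vp_1` is False and `vp_3 in vp_1` is True. If containment should mean something else (for example, sharing a PhysicalPage), only the body of __contains__ needs to change.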
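Some more observations:
Although there is no such thing as a "private" member in Python classes, by convention a variable name that begins with a single underscore, _, indicates that it is not part of the public interface of a class / module / etc. So naming the parameters of a public method with a leading '_' doesn't seem right, e.g. def __init__(self, _page_num=-1).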
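Python generally doesn't use setters/getters; just use the attribute directly. If an attribute value needs to be calculated, or some other processing is needed, use the @property decorator (as shown for PhysicalPage.begin above).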
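As a minimal sketch of a computed attribute, the version below recalculates begin on every access instead of caching it in the initializer. The begin_line_num constructor parameter is an assumption made for illustration.

```python
class PhysicalPage(object):
    _LINE_PAD = 12  # digits reserved for the line number

    def __init__(self, page_num, begin_line_num=0):  # begin_line_num is assumed
        self.page_num = page_num
        self.begin_line_num = begin_line_num

    @property
    def begin(self):
        # Recomputed on each access, so it always reflects the current attributes.
        return self.page_num * 10 ** PhysicalPage._LINE_PAD + self.begin_line_num


page = PhysicalPage(2, 5)
first = page.begin          # 2000000000005
page.begin_line_num = 9
second = page.begin         # 2000000000009
```

Since no setter is defined, an assignment such as `page.begin = 0` raises AttributeError, which keeps the derived value read-only.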
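It's generally not a good idea to initialize a default function argument with a mutable object. def __init__(self, physical_pages=list()) does not initialize physical_pages with a new empty list each time; rather, it uses the same list every time. If the list is modified, then on the next function call physical_pages will be initialized with the modified list. See the VirtualPage initializer for an alternative.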
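The effect is easy to reproduce with two small functions. The names append_shared and append_fresh are made up for this illustration.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # None is a sentinel; a new list is created on each call that omits bucket.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = append_shared(1)
shared_second = append_shared(2)   # same list object as shared_first
fresh_first = append_fresh(1)
fresh_second = append_fresh(2)     # independent list
```

After these calls, shared_first and shared_second are the same list, [1, 2], while fresh_first is [1] and fresh_second is [2].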