What are the best practices for __repr__ with a collection class in Python?

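I have a custom Python class that essentially wraps a list of some kind of object, and I'm wondering how I should implement its __repr__ method. I'm tempted to go with the following: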

class MyCollection:
   def __init__(self, objects = []):
      self._objects = []
      self._objects.extend(objects)

   def __repr__(self):
      return f"MyCollection({self._objects})"

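This has the advantage of producing valid Python output that fully describes the class instance. However, in my real-world case, the list of objects can be rather large, and each object may itself have a large repr (they are arrays themselves).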

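What is the best practice in this situation? Accept that the repr will often be a very long string? Are there potential problems with that (debugger UIs, etc.)? Should I implement some kind of shortening scheme using semicolons? If so, is there a good/standard way to do it? Or should I skip listing the collection's contents altogether?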

The official documentation outlines how you should handle __repr__:

Called by the repr() built-in function to compute the “official” string representation of an object. If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment). If this is not possible, a string of the form <...some useful description...> should be returned. The return value must be a string object. If a class defines __repr__() but not __str__(), then __repr__() is also used when an “informal” string representation of instances of that class is required.

This is typically used for debugging, so it is important that the representation is information-rich and unambiguous.

https://docs.python.org/3/reference/datamodel.html#object.__repr__

Lists, strings, sets, tuples, and dictionaries all print out their entire contents in their __repr__ methods.
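As a small illustration of the "valid Python expression" guideline, the sketch below round-trips a built-in container through its repr. The variable names are only for illustration, and ast.literal_eval is used as a safer stand-in for eval since it only accepts Python literals.

```python
import ast

values = [1, "two", (3.0, 4.5), {"five": 5}]
text = repr(values)
print(text)

# literal_eval only parses Python literals, so it is a safer stand-in for eval here.
rebuilt = ast.literal_eval(text)
print(rebuilt == values)  # True

# Built-in containers do not truncate: every element appears in the repr.
big = list(range(2000))
print(len(ast.literal_eval(repr(big))))  # 2000
```

A custom class whose repr embeds the repr of its list inherits both properties: the output can be pasted back into Python, and it grows with the contents.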

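Your current code looks like it follows the example the documentation suggests perfectly. That said, I would suggest changing your __init__ method so it looks more like this: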

class MyCollection:
    def __init__(self, objects=None):
        # Use None as the sentinel and build a fresh list on each call.
        if objects is None:
            objects = []
        self._objects = objects

    def __repr__(self):
        return f"MyCollection({self._objects})"

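You generally want to avoid using mutable objects as default arguments. Technically, because your method is implemented with extend (which copies the items into a new list), it will still work perfectly fine, but Python's documentation still recommends avoiding the pattern.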

It is good programming practice to not use mutable objects as default values. Instead, use None as the default value and inside the function, check if the parameter is None and create a new list/dictionary/whatever if it is.

https://docs.python.org/3/faq/programming.html#why-are-default-values-shared-between-objects
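To make the pitfall concrete, here is a minimal sketch with two hypothetical helper functions (append_shared and append_fresh are names made up for this example). The first uses a list as its default value, the second uses the None sentinel the FAQ recommends.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and then reused by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # A new list is created on each call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(second)           # [1, 2]
print(first is second)  # True: both names refer to the same default list

print(append_fresh(1), append_fresh(2))  # [1] [2]
```

The version in the question avoids this surprise only because it never mutates or stores the default list itself.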

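If you're interested in how another library handles this differently, the repr of a NumPy array shows only the first three and last three items when the array has more than 1,000 elements. It also formats the items so they all take up the same amount of space (in the example below, 1000 takes up four characters, so 0 has to be padded with three more spaces to match).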

>>> import numpy as np
>>> repr(np.array([i for i in range(1001)]))
'array([   0,    1,    2, ...,  998,  999, 1000])'

To mimic this NumPy array style, you could implement a __repr__ method like this in your class:

class MyCollection:
    def __init__(self, objects=None):
        if objects is None:
            objects = []
        self._objects = objects

    def __repr__(self):
        # Up to 1,000 items: return the full list.
        if len(self._objects) <= 1000:
            return f"MyCollection({self._objects})"
        # Otherwise keep only the first and last three items.
        items_to_display = self._objects[:3] + self._objects[-3:]
        # Width of the longest repr among the displayed items.
        padding = max(len(repr(item)) for item in items_to_display)
        # Right-justify each repr to that width.
        values = [repr(item).rjust(padding) for item in items_to_display]
        # Insert the '...' between the 3rd and 4th items.
        values.insert(3, '...')
        # Join the pieces with commas.
        array_as_string = ', '.join(values)
        return f"MyCollection([{array_as_string}])"

>>> repr(MyCollection([1,2,3,4]))
'MyCollection([1, 2, 3, 4])'

>>> repr(MyCollection([i for i in range(1001)]))
'MyCollection([   0,    1,    2, ...,  998,  999, 1000])'
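If you would rather not hand-roll the truncation, the standard library's reprlib module offers a size-limited repr. The sketch below is one possible variant, not the only way to do it. It assumes the default limits of reprlib.repr (six list items before the ellipsis) and copies the input with list(), which is a choice made for this example rather than something the code above does.

```python
import reprlib


class MyCollection:
    def __init__(self, objects=None):
        # Copy the input so later changes to the caller's list do not
        # affect this instance (an assumption made for this sketch).
        self._objects = [] if objects is None else list(objects)

    def __repr__(self):
        # reprlib.repr abbreviates long containers and ends them with '...'.
        return f"MyCollection({reprlib.repr(self._objects)})"


print(repr(MyCollection([1, 2, 3, 4])))  # MyCollection([1, 2, 3, 4])
print(repr(MyCollection(range(1001))))   # MyCollection([0, 1, 2, 3, 4, 5, ...])
```

The abbreviated form is no longer a valid Python expression, so this trades the "recreate the object" property for a repr that stays short in logs and debugger panes. The limits can be tuned by creating a reprlib.Repr instance and setting attributes such as maxlist.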