Unexpected behaviour when testing equality with naive and tz-aware datetime instances

Question

The following was produced in Python 3.9.7.

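I am well aware that comparing tz-aware and naive datetime instances is not allowed in Python and raises a TypeError. However, when testing for equality (with the == and != operators) this is actually not the case. In fact, the == comparison always returns False: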

import datetime
import pytz

t_tz_aware = datetime.datetime(2020, 5, 23, tzinfo=pytz.UTC)
t_naive = datetime.datetime(2020, 5, 23)

# Prints 'False'.
print(t_tz_aware == t_naive)

# Raises TypeError: can't compare offset-naive and offset-aware datetimes.
print(t_tz_aware < t_naive)

I checked the source code of the datetime library, and the function that compares datetime objects has a parameter named allow_mixed (which defaults to False):

def _cmp(self, other, allow_mixed=False)

When it is set to True, which is the case when comparing with the == operator, tz-aware and naive datetime instances can be compared. Otherwise, a TypeError is raised:

# When testing for equality, set allow_mixed to True.
# For all the other operators, it remains False.
def __eq__(self, other):
   if isinstance(other, datetime):
      return self._cmp(other, allow_mixed=True) == 0

if myoff is None or otoff is None:
   if allow_mixed:
      return 2 # arbitrary non-zero value
   else:
      raise TypeError("cannot compare naive and aware datetimes")

So this really does seem to be intended behaviour. In fact, Pandas' implementation of pandas.Timestamp and similar comparisons is consistent with it as well.

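My question is: what is the reasoning? I suppose that, as the name of the parameter suggests, this lets us filter a collection of datetime objects containing both naive and tz-aware instances (i.e. "mixed"). But doesn't this introduce a source of potential bugs and unexpected behaviour? What am I missing?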
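To make the "mixed collection" idea concrete, here is a minimal sketch. It uses the standard library's datetime.timezone.utc in place of pytz.UTC (an assumption made only to keep the example dependency-free), and the variable names are illustrative.

```python
import datetime

UTC = datetime.timezone.utc  # stdlib stand-in for pytz.UTC (assumption)

mixed = [
    datetime.datetime(2020, 5, 23),              # naive
    datetime.datetime(2020, 5, 23, tzinfo=UTC),  # aware
    datetime.datetime(2020, 5, 24, tzinfo=UTC),  # aware
]

target = datetime.datetime(2020, 5, 23, tzinfo=UTC)

# == never raises here, so filters and membership tests work on the mixed list.
matches = [t for t in mixed if t == target]
found = target in mixed

# The naive "twin" is silently treated as not equal: no error, no warning.
naive_twin_matches = mixed[0] == target

print(matches, found, naive_twin_matches)
```

Only the aware entry for 2020-05-23 matches; the naive entry with the same wall-clock value is skipped without any error, which is exactly the kind of silent mismatch the question is worried about.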

Edit after deceze's comment: this is actually still "semantically correct" (i.e. the dates are definitely not the same).

Answer 1

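Equality comparisons should virtually never raise an exception. You just want to know whether object A is equal to object B, and the answer is clear-cut. If they are equal, however those objects define equality, the answer is yes; in all other cases it is no. A naive datetime differs from an aware one at least in that one has a timezone and the other doesn't, so they are clearly not equal.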

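For greater-than/lower-than comparisons, though, the objects clearly need to be orderable, otherwise the question cannot be answered. A < comparison cannot return False merely because the objects can't be compared, since that would imply that the opposite >= comparison should return True, which it can't either. So raising an error is the correct third possible outcome in this case.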
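The difference between the two kinds of comparison can be checked directly. The sketch below records, for each operator, either the result or the name of the exception raised; try_compare is a hypothetical helper written for this example, and datetime.timezone.utc stands in for pytz.UTC.

```python
import datetime

aware = datetime.datetime(2020, 5, 23, tzinfo=datetime.timezone.utc)
naive = datetime.datetime(2020, 5, 23)

def try_compare(func):
    """Return the comparison result, or the name of the TypeError it raised."""
    try:
        return func()
    except TypeError as exc:
        return type(exc).__name__

results = {
    "==": try_compare(lambda: aware == naive),
    "!=": try_compare(lambda: aware != naive),
    "<": try_compare(lambda: aware < naive),
    ">=": try_compare(lambda: aware >= naive),
    # sorted() relies on <, so it fails on a mixed list as well.
    "sorted": try_compare(lambda: sorted([aware, naive])),
}

for op, outcome in results.items():
    print(op, outcome)
```

Equality and inequality give a definite answer, while every operation that needs an ordering ends in TypeError rather than an arbitrary True or False.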

Answer 2

As can be seen in the docs, up to and including Python 3.2 a TypeError was in fact raised in these cases:

Changed in version 3.3: Equality comparisons between aware and naive datetime instances don’t raise TypeError.

In 2012, the Python developers weighed the trade-off between these two concerns:

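Raising a TypeError makes it easier to catch bugs caused by mistakenly mixing naive and aware datetime objects.
In Python, you can use equality comparison on almost any combination of objects. Raising a TypeError only for datetime objects would break this consistency.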
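If you want the first behaviour back in your own code, you can opt in explicitly. The following is a minimal sketch: is_aware follows the definition of "aware" given in the datetime documentation, while strict_eq is a hypothetical helper, not part of the standard library.

```python
import datetime

def is_aware(dt):
    """Aware per the datetime docs: tzinfo is set and utcoffset() is not None."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None

def strict_eq(a, b):
    """Hypothetical helper: like ==, but refuses to mix naive and aware values."""
    if is_aware(a) != is_aware(b):
        raise TypeError("refusing to compare naive and aware datetimes")
    return a == b

aware = datetime.datetime(2020, 5, 23, tzinfo=datetime.timezone.utc)
naive = datetime.datetime(2020, 5, 23)

print(strict_eq(aware, aware))  # both aware: ordinary equality
print(aware == naive)           # built-in ==: False, no exception

try:
    strict_eq(aware, naive)
except TypeError as exc:
    print("strict_eq raised:", exc)
```

This keeps the built-in == consistent with the rest of the language while still letting a code base fail loudly where mixing the two kinds of datetime would be a bug.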

Here is the relevant discussion from the Python developer mailing list:

This is nice when your datetime objects are freshly created. It is not so nice when some of them already exist e.g. in a database (using an ORM layer). Mixing naive and aware datetimes is currently a catastrophe, since even basic operations such as equality comparison fail with a TypeError (it must be pretty much the only type in the stdlib with such poisonous behaviour).

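Comparing aware and naive datetime objects doesn't make much sense, but it's an easy mistake to make. I would say the TypeError is a sensible way to warn you, while simply returning False could lead to lots of confusion.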

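You could say the same about equally "confusing" results, yet equality never raises TypeError (except between datetime instances):
>>> () == []
False
Raising an exception has very serious implications, such as making it impossible to put these objects in the same dictionary.
Closer to home,
>>> date(2012,6,1) == datetime(2012,6,1)
False
I agree, equality comparison should not raise an exception.
Let's make it so.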

-- --Guido van Rossum (python.org/~guido)

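It looks like the argument for removing the exception was the stronger one in this exchange. Guido van Rossum is the creator of the Python language and had the final say on questions like this, which is why he used to be called the benevolent dictator for life. So, following his "Let's make it so", the behaviour was changed: naive and aware datetime objects now always compare unequal instead of raising a TypeError.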
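The dictionary argument from the thread can be tried out on a current Python. This is a small sketch using datetime.timezone.utc as the timezone (an assumption; any tzinfo would do), with illustrative variable names.

```python
import datetime

aware = datetime.datetime(2012, 6, 1, tzinfo=datetime.timezone.utc)
naive = datetime.datetime(2012, 6, 1)

# Both values can be keys of the same dict: if their hashes ever collide,
# the dict falls back to ==, which returns False instead of raising.
events = {aware: "aware", naive: "naive"}

# The "closer to home" case: a date and a datetime also compare unequal.
date_vs_datetime = datetime.date(2012, 6, 1) == datetime.datetime(2012, 6, 1)

print(len(events), events[aware], events[naive], date_vs_datetime)
```

With the pre-3.3 behaviour, a dict lookup that happened to compare a naive key with an aware one could have surfaced a TypeError from inside the container, which is the "serious implication" mentioned in the discussion.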

python

comparison

timezone

datetime

pandas