为什么使用 < 运算符的向量比较会对每个项目进行两次比较？

Question

在本例中，比较两个带有 < 运算符的向量会得到运算符 <，定义在整数 class 上，每个元素被调用两次。但是，使用 == 运算符比较两个向量时不会发生这种情况。

#include<iostream>
#include<vector>

class Integer {
    public:
        Integer(int value) : m_value(value) {}
        friend bool operator<(const Integer& lhs, const Integer& rhs);
        friend bool operator==(const Integer& lhs, const Integer& rhs);

    private:
        int m_value;

}; 
bool operator<(const Integer& lhs, const Integer& rhs) {
            std::cout << lhs.m_value << " < " << rhs.m_value << '\n';
            return lhs.m_value < rhs.m_value;
}
bool operator==(const Integer& lhs, const Integer& rhs) {
            std::cout << lhs.m_value << " == " << rhs.m_value << '\n';
            return lhs.m_value == rhs.m_value;
}


int main()
{
    std::vector<Integer> ivec1 {1,2,3};
    std::vector<Integer> ivec2 {1,2,3};
    std::cout << (ivec1 < ivec2) << '\n';
    std::cout << (ivec1 == ivec2) << std::endl;
    return 0;
}

此代码打印：

为什么会这样？

Answer 1

如果a < b returns false，它不会告诉你是否b < a，你必须测试它。这是因为 std::vector 的逐个元素排序对于一对元素 a, b:

可以有三个结果

a < b,向量比较returnstrue.
b < a,向量比较returnsfalse.
以上都不是，必须测试下一对元素。

所以它必须比较两个方向。通过向 class:

添加识别数据，您可以更清楚地看到这一点

#include<iostream>
#include<vector>

class Integer {
    public:
        Integer(int value, char tag) : m_value(value), m_tag(tag) {}
        friend bool operator<(const Integer& lhs, const Integer& rhs);
        friend bool operator==(const Integer& lhs, const Integer& rhs);

    private:
        int m_value;
        char m_tag;

}; 
bool operator<(const Integer& lhs, const Integer& rhs) {
            std::cout << lhs.m_value << ' ' << lhs.m_tag << " < " << rhs.m_value << ' ' << rhs.m_tag << '\n';
            return lhs.m_value < rhs.m_value;
}
bool operator==(const Integer& lhs, const Integer& rhs) {
            std::cout << lhs.m_value << ' ' << lhs.m_tag << " == " << rhs.m_value << ' ' << rhs.m_tag << '\n';
            return lhs.m_value == rhs.m_value;
}


int main()
{
    std::vector<Integer> ivec1 {{1, 'a'} ,{2, 'a'}, {3, 'a'}};
    std::vector<Integer> ivec2 {{1, 'b'} ,{2, 'b'}, {3, 'b'}};
    std::cout << (ivec1 < ivec2) << '\n';
    std::cout << (ivec1 == ivec2) << std::endl;
    return 0;
}

这会产生：

1 a < 1 b    
1 b < 1 a    
2 a < 2 b    
2 b < 2 a    
3 a < 3 b    
3 b < 3 a    
0    
1 a == 1 b    
2 a == 2 b    
3 a == 3 b    
1

[Live example]

Answer 2

为了查找 ivec1 和 ivec2 的字典顺序，该实现查找 ivec1[i] < ivec2[i] 或 ivec2[i] < ivec1[i] 的第一个索引 i ，因为这将决定顺序。

请注意，如果 ivec1[i] < ivec2[i] 为假，这需要进行两次比较。特别地，上述情况留下两种可能性，即“ivec1[i]和ivec2[i]比较等价”和“ivec2[i] < ivec1[i]”。这个决定是需要第二次比较的地方。

一旦找到这样的索引i，执行就可以停止比较；但是由于在您的示例中所有条目都比较相等，因此必须对每对条目执行两次比较。

Answer 3

这是由于 C++ 当前处理比较的设计存在缺陷。他们正在 c++2a 修复它；不知道能不能到vector，但是基本问题会解决。

首先是问题

std::vector 的 < 基于每个元素 <。但是 < 对于这项工作来说是一个糟糕的工具。

如果您有两个元素 a 和 b，要按字典顺序对元组 a,b 排序，您需要执行以下操作：

if (self.a < other.a)
  return true;
if (other.a < self.a)
  return false;
return self.b < other.b;

一般来说，如果您想按字典顺序排列 N 个元素的集合，这需要调用 2N-1 次 <。

这已经知道很长时间了，这就是为什么 strcmp returns 一个具有 3 种值的整数：-1 小于，0 等于和 +1 表示更大（或者，通常是小于、等于或大于零的值）。

你可以这样做：

auto val = compare( self.a, other.a );
if (val != 0) return val;
return compare( self.b, other.b );

这需要对集合中的每个元素最多 N 次调用 compare。

现在，修复。

c++2a 添加飞船比较运算符 <=>。它 returns 一种可以比较大于或小于零的类型，其确切类型通告了操作提供的保证。

这类似于 C 的 strcmp，但适用于任何支持它的类型。此外，还有 std 函数使用 <=>（如果可用），否则使用 < 和 == 等来模拟它。

假设 vector 的要求被重写为使用 <=>，具有 <=> 的类型将避免双重比较并且每个最多只被 <=> 一次以进行 lexographic 排序调用 < 时的 std::vector。

为什么使用 < 运算符的向量比较会对每个项目进行两次比较？

Why does vector comparison with < operator compare each item twice?

c++

operator-overloading

stdvector