有效地比较 Scala 集合中两个的所有组合

Efficiently compare all combinations of two in a Scala collection

使用 OpenFlow 我想计算有多少对流满足以下条件(给定任意两个流,例如 f1f2):

我认为这将是两个元素的组合 (NFlows Choose 2)。感谢 this question 我正在使用如下方法 combinations

val pairFlows = matchs.combinations(2).filter{ list =>
  val f1 = list.head
  val f2 = list.tail.head

  f1.dl_type == f2.dl_type &&
  f1.nw_dst == f2.nw_src &&
  f1.nw_src == f2.nw_dst
}

我的问题是,这是执行此计算的最有效方法吗?或者创建一个 HashMap 以跟踪每个 srcdstprotocol type 会更好吗?

这里是数据的一个例子:

{
    1: [
        {
            "actions": [
                "OUTPUT:2"
            ],
            "idle_timeout": 0,
            "cookie": 0,
            "packet_count": 18,
            "hard_timeout": 0,
            "byte_count": 1764,
            "duration_sec": 6884,
            "duration_nsec": 5000000,
            "priority": 1,
            "length": 120,
            "flags": 0,
            "table_id": 0,
            "match": {
                "dl_type": 2048,
                "dl_dst": "00:00:00:00:00:02",
                "nw_src": "10.0.0.1",
                "in_port": 1,
                "nw_dst": "10.0.0.2"
            }
        },
        {
            "actions": [
                "OUTPUT:2"
            ],
            "idle_timeout": 0,
            "cookie": 0,
            "packet_count": 18,
            "hard_timeout": 0,
            "byte_count": 1764,
            "duration_sec": 1123,
            "duration_nsec": 181000000,
            "priority": 1,
            "length": 120,
            "flags": 0,
            "table_id": 0,
            "match": {
                "dl_type": 2048,
                "dl_dst": "00:00:00:00:00:02",
                "nw_src": "10.0.0.3",
                "in_port": 3,
                "nw_dst": "10.0.0.2"
            }
        },
        {
            "actions": [
                "OUTPUT:1"
            ],
            "idle_timeout": 0,
            "cookie": 0,
            "packet_count": 10,
            "hard_timeout": 0,
            "byte_count": 980,
            "duration_sec": 1132,
            "duration_nsec": 253000000,
            "priority": 1,
            "length": 120,
            "flags": 0,
            "table_id": 0,
            "match": {
                "dl_type": 2048,
                "dl_dst": "00:00:00:00:00:01",
                "nw_src": "10.0.0.3",
                "in_port": 3,
                "nw_dst": "10.0.0.1"
            }
        },
        {
            "actions": [
                "OUTPUT:1"
            ],
            "idle_timeout": 0,
            "cookie": 0,
            "packet_count": 16,
            "hard_timeout": 0,
            "byte_count": 1568,
            "duration_sec": 1141,
            "duration_nsec": 652000000,
            "priority": 1,
            "length": 120,
            "flags": 0,
            "table_id": 0,
            "match": {
                "dl_type": 2048,
                "dl_dst": "00:00:00:00:00:01",
                "nw_src": "10.0.0.2",
                "in_port": 2,
                "nw_dst": "10.0.0.1"
            }
        },
        {
            "actions": [
                "OUTPUT:3"
            ],
            "idle_timeout": 0,
            "cookie": 0,
            "packet_count": 20,
            "hard_timeout": 0,
            "byte_count": 1960,
            "duration_sec": 6883,
            "duration_nsec": 961000000,
            "priority": 1,
            "length": 120,
            "flags": 0,
            "table_id": 0,
            "match": {
                "dl_type": 2048,
                "dl_dst": "00:00:00:00:00:03",
                "nw_src": "10.0.0.2",
                "in_port": 2,
                "nw_dst": "10.0.0.3"
            }
        },
        {
            "actions": [
                "OUTPUT:3"
            ],
            "idle_timeout": 0,
            "cookie": 0,
            "packet_count": 12,
            "hard_timeout": 0,
            "byte_count": 1176,
            "duration_sec": 6883,
            "duration_nsec": 984000000,
            "priority": 1,
            "length": 120,
            "flags": 0,
            "table_id": 0,
            "match": {
                "dl_type": 2048,
                "dl_dst": "00:00:00:00:00:03",
                "nw_src": "10.0.0.1",
                "in_port": 1,
                "nw_dst": "10.0.0.3"
            }
        },
        {
            "actions": [
                "OUTPUT:CONTROLLER"
            ],
            "idle_timeout": 0,
            "cookie": 0,
            "packet_count": 0,
            "hard_timeout": 0,
            "byte_count": 0,
            "duration_sec": 9156,
            "duration_nsec": 252000000,
            "priority": 65535,
            "length": 96,
            "flags": 0,
            "table_id": 0,
            "match": {
                "dl_type": 35020,
                "dl_dst": "01:80:c2:00:00:0e"
            }
        },
        {
            "actions": [
                "OUTPUT:CONTROLLER"
            ],
            "idle_timeout": 0,
            "cookie": 0,
            "packet_count": 22,
            "hard_timeout": 0,
            "byte_count": 1260,
            "duration_sec": 9156,
            "duration_nsec": 268000000,
            "priority": 0,
            "length": 80,
            "flags": 0,
            "table_id": 0,
            "match": {}
        }
    ],
}

这里有 8 个流,其中 3 个是成对的。

我终于把代码写成了O(n),我保留了一个Mapf.nw_src + f.nw_dst + f.dl_type和一个值Int表示那个键是否有对应的pair flow (这将是带有键 f.nw_dst + f.nw_src + f.dl_type 的那个。这是最终代码:

val table =  flows./:(Map.empty[String,Int]){ case (m,f) =>
  val key = f.nw_src + f.nw_dst + f.dl_type
  val inverseKey = f.nw_dst + f.nw_src + f.dl_type
  val haspair = m get inverseKey match {
    case Some(v) => v + 1
    case None => 0
  }
  m + (key -> haspair)
}

val pairs = table.filter(_._2 > 0)

希望这对寻找类似问题的人有所帮助。