在 ArangoDB 中，是否会在 O(n) 内完成来自邻居的带有过滤器的查询？

Question

我一直在阅读 Aql Graph Operation and Graphs, and have found no concrete example and performance explanation for the use case of SQL-Traverse。

例如：

如果我有一个集合 Users，它与集合 Company 有 company 关系

集合公司与集合位置有关系位置；

集合位置是城市、国家或地区，并且具有关系城市、国家, region 到它自己。

现在，我想查询属于德国或欧盟公司的所有用户。

SELECT from Users where Users.company.location.city.country.name="Germany";
SELECT from Users where Users.company.location.city.parent.name="Germany";

或

SELECT from Users where Users.company.location.city.country.region.name="europe";
SELECT from Users where Users.company.location.city.parent.parent.name="europe";

假设 Location.name 被索引，我可以用 O(n) 执行上面的两个查询吗？ n 是 Location 中的文档数（O(1) 对于图遍历，O(n) 用于索引扫描)?

当然，我可以直接在company中保存regionName或countryName，因为这些城市和国家都在欧盟，不像……其他地方，可能不会改变，但是如果……你知道我的意思怎么办（开玩笑，如果我有其他需要不断更新的用例怎么办）

Answer 1

我要解释一下 using the ArangoDB 2.8 Traversals。

我们创建这些集合以匹配您的 shema 使用 arangosh:

db._create("countries")
db.countries.save({_key:"Germany", name: "Germany"})
db.countries.save({_key:"France", name: "France"})
db.countries.ensureHashIndex("name")

db._create("cities")
db.cities.save({_key: "Munich"})
db.cities.save({_key: "Toulouse")

db._create("company")
db.company.save({_key: "Siemens"})
db.company.save({_key: "Airbus"})

db._create("employees")
db.employees.save({lname: "Kraxlhuber", cname: "Xaver", _key: "user1"})
db.employees.save({lname: "Heilmann",  cname: "Vroni",  _key: "user2"})
db.employees.save({lname: "Leroy",  cname: "Marcel",  _key: "user3"})

db._createEdgeCollection("CityInCountry")
db._createEdgeCollection("CompanyIsInCity")
db._createEdgeCollection("WorksAtCompany")


db.CityInCountry.save("cities/Munich", "countries/Germany", {label: "beautiful South near the mountains"})
db.CityInCountry.save("cities/Toulouse", "countries/France", {label: "crowded city at the mediteranian Sea"})

db.CompanyIsInCity.save("company/Siemens", "cities/Munich", {label: "darfs ebbes gscheits sein? Oder..."})
db.CompanyIsInCity.save("company/Airbus", "cities/Toulouse", {label: "Big planes Ltd."})


db.WorksAtCompany.save("employees/user1", "company/Siemens", {employeeOfMonth: true})
db.WorksAtCompany.save("employees/user2", "company/Siemens", {veryDiligent: true})
db.WorksAtCompany.save("employees/user3", "company/Eurocopter", {veryDiligent: true})

在 AQL 中，我们会以相反的方式编写此查询。我们从索引属性 name 上的常量时间 FILTER 开始，然后从那里开始我们的遍历。因此我们过滤国家“德国”：

db._explain("FOR country IN countries FILTER country.name == 'Germany' RETURN country ")
Query string:
 FOR country IN countries FILTER country.name == 'Germany' RETURN country 

Execution plan:
 Id   NodeType        Est.   Comment
  1   SingletonNode      1   * ROOT
  6   IndexNode          1     - FOR country IN countries   /* hash index scan */
  5   ReturnNode         1       - RETURN country

Indexes used:
 By   Type   Collection   Unique   Sparse   Selectivity   Fields       Ranges
  6   hash   countries    false    false        66.67 %   [ `name` ]   country.`name` == "Germany"

Optimization rules applied:
 Id   RuleName
  1   use-indexes
  2   remove-filter-covered-by-index

现在我们有了经过良好过滤的起始节点，我们可以反向进行图遍历。因为我们知道 Employees 距离起始顶点正好 3 步，而且我们对路径不感兴趣，所以我们只 return 第 3 层：

db._query("FOR country IN countries FILTER country.name == 'Germany' FOR v IN 3 INBOUND country CityInCountry, CompanyIsInCity, WorksAtCompany  RETURN v")

[ 
  { 
    "cname" : "Xaver", 
    "lname" : "Kraxlhuber", 
    "_id" : "employees/user1", 
    "_rev" : "1286703864570", 
    "_key" : "user1" 
  }, 
  { 
    "cname" : "Vroni", 
    "lname" : "Heilmann", 
    "_id" : "employees/user2", 
    "_rev" : "1286729095930", 
    "_key" : "user2" 
  } 
]

关于此查询性能的一些话：

我们定位德国使用的是hash索引是常数时间-> O(1)
基于此我们要遍历m条路径其中m是[=中的员工数36=]德国；它们中的每一个都可以在常数时间内遍历。 -> O(m) 在这一步。
Return 恒定时间的结果 -> O(1)

所有组合我们需要 O(m) 我们期望 m 小于 n（员工人数）在您的 SQL-Traversal 中使用。

在 ArangoDB 中，是否会在 O(n) 内完成来自邻居的带有过滤器的查询？

In ArangoDB, will querying, with filters, from the neighbor(s) be done in O(n)?

graph-databases

arangodb