Using the Graph API in Cosmos DB is VERY slow compared to the (documentdb)SQL API

Question

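Given a CosmosDB account set up with the Graph API and a graph of ~4k vertices and ~10k edges, similar queries against the same database through the Graph API and the DocumentDB API show significantly different run times. I've been testing the difference between the APIs with the following Node application: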

var Gremlin = require('gremlin');
var config = require("./config");
var documentdb = require('documentdb');

// Client construction arguments are elided here.
const docClient = new documentdb.DocumentClient(....);
const graphClient = Gremlin.createClient(....);

// Time a single-vertex lookup through the Graph (Gremlin) API.
const start = new Date();
graphClient.execute('g.V("12345")', {}, (err, results) => {
    const end = new Date();
    if (err) {
        return console.error(err);
    }

    // Elapsed time in seconds.
    console.log(`GraphDB API Results in: ${(end.getTime() - start.getTime()) / 1000}`);
});

// Time the equivalent lookup by id through the DocumentDB (SQL) API.
var querySpec = {
    query: 'SELECT * FROM c ' +
           'WHERE c.id = "12345"'
};
const docStart = new Date();
docClient.queryDocuments("dbs/graphdb/colls/sn", querySpec).toArray((err, results) => {
    const docEnd = new Date();
    if (err) {
        console.error(JSON.stringify(err, null, 2));
        return;
    }

    // Elapsed time in seconds.
    console.log(`DocumentDB API Results in: ${(docEnd.getTime() - docStart.getTime()) / 1000}`)
});

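The output of this code shows that the single document being queried is returned by the Graph API in ~1.8 seconds, while the DocumentDB API returns the same document in ~0.3 seconds.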
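A single timed call like the ones above also includes whatever one-time setup the client performs on its first request, so it can help to time several runs and look at the first one separately. The following is a minimal sketch, not part of the original test; `timeRuns` is a hypothetical helper name, and it assumes the operation is wrapped so that it returns a Promise.

```javascript
// Hypothetical helper (not from the original post): runs an async operation
// several times in sequence and reports the first run separately from the rest.
async function timeRuns(label, operation, runs = 5) {
    const timingsMs = [];
    for (let i = 0; i < runs; i++) {
        const start = process.hrtime.bigint();   // monotonic clock, nanoseconds
        await operation();
        const end = process.hrtime.bigint();
        timingsMs.push(Number(end - start) / 1e6);
    }
    const [first, ...rest] = timingsMs;
    // Average of every run after the first; falls back to the first if runs === 1.
    const warmAverage = rest.length > 0
        ? rest.reduce((sum, t) => sum + t, 0) / rest.length
        : first;
    console.log(`${label}: first=${first.toFixed(1)} ms, warm avg=${warmAverage.toFixed(1)} ms`);
    return { first, warmAverage, timingsMs };
}

// Possible usage with the clients from the question (left commented out
// because the client construction arguments are elided there):
// const runGraph = () => new Promise((resolve, reject) =>
//     graphClient.execute('g.V("12345")', {}, (err, results) =>
//         err ? reject(err) : resolve(results)));
// timeRuns('Graph API', runGraph);
```

If the first run is much slower than the later ones, connection setup may account for part of the gap; if every run is slow, the delay more likely sits on the service side.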

DocumentDB API results:

[
  {
    "label": "company",
    "id": "12345",
    "parent": [
      {
        "_value": "54321",
        "id": "de7c87f7-83db-43c2-8ddd-c5487dd5682e"
      }
    ],
    "name": [
      {
        "_value": "Acme Co",
        "id": "b4316415-d5c3-4dcc-ac5f-64b1d8c8bd62"
      }
    ],
    "_rid": "KPk3APUeEgFcAAAAAAAAAA==",
    "_self": "dbs/KPk3AA==/colls/KPk3APUeEgE=/docs/KPk3APUeEgFcAAAAAAAAAA==/",
    "_etag": "\"0000df07-0000-0000-0000-5a2b23bd0000\"",
    "_attachments": "attachments/",
    "_ts": 1512776637
  }
]

GraphDB API results:

[
  {
    "id": "12345",
    "label": "company",
    "type": "vertex",
    "properties": {
      "parent": [
        {
          "id": "de7c87f7-83db-43c2-8ddd-c5487dd5682e",
          "value": "54321"
        }
      ],
      "name": [
        {
          "id": "b4316415-d5c3-4dcc-ac5f-64b1d8c8bd62",
          "value": "Acme Co"
        }
      ]
    }
  }
]
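The two results describe the same stored item in different shapes. As a rough illustration, the sketch below maps the DocumentDB-style document onto the vertex shape returned by the Graph API. The mapping is inferred only from the two sample outputs above, not from any documented storage format, and `toVertexShape` is a hypothetical name; edges and non-array fields are not handled.

```javascript
// Hypothetical mapping, inferred from the two sample results only.
function toVertexShape(doc) {
    const properties = {};
    for (const [key, value] of Object.entries(doc)) {
        // Skip the identity fields and the system fields (_rid, _self, _etag, ...).
        if (key === 'id' || key === 'label' || key.startsWith('_')) continue;
        // Assumption: every vertex property is stored as an array of { _value, id }.
        if (!Array.isArray(value)) continue;
        properties[key] = value.map(entry => ({ id: entry.id, value: entry._value }));
    }
    return { id: doc.id, label: doc.label, type: 'vertex', properties };
}

// Example with the document from the question (system fields shortened):
const sampleDoc = {
    label: 'company',
    id: '12345',
    parent: [{ _value: '54321', id: 'de7c87f7-83db-43c2-8ddd-c5487dd5682e' }],
    name: [{ _value: 'Acme Co', id: 'b4316415-d5c3-4dcc-ac5f-64b1d8c8bd62' }],
    _ts: 1512776637
};
console.log(JSON.stringify(toVertexShape(sampleDoc), null, 2));
```

Since both APIs appear to read the same underlying document, the difference in shape alone seems unlikely to explain a 1.5 second gap.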

All of these examples run against a fixed-size collection with the RUs turned all the way up to 10,000.

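Am I doing something wrong? Do I need to create better/more/fewer indexes? It seems crazy that a cloud-scale database like Cosmos can't return a single document in under a second, regardless of how the query is structured.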

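I have examples of simple traversals (g.V().hasLabel('x').out('y').hasLabel('z')) that take 5 seconds to return when the hasLabel('x') count is ~40. When the hasLabel('x') count is ~1000, the traversal takes over 15 seconds to return. That seems very slow to me.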

I've looked around for performance data but haven't found any examples. Ultimately, am I expecting too much from this technology?

Answer 1

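Thanks to MS for resolving the issue. There were some problems with their rollout of the Gremlin API endpoint. My instance was calling a Gremlin endpoint in a different region from the database instance, which caused the problem (if I understood the message from MS correctly).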

I was given a feature flag to set in the portal that forces new databases to be deployed on their new infrastructure.

I now see response times under 500 ms for all queries and traversals.

graph-databases

azure-cosmosdb