Titan/DynamoDB doesn't release all acquired locks on commit (via Gremlin)

OK, I know this sounds unlikely and I'm fully prepared to be shot down, but here goes...

I have a Gremlin Server running against Titan with DynamoDB (local). I'm running some unit tests that keep failing with:
tx 0x705eafda280e already locked key-column (  8-  0-  0-  0-  0-  0-  0-128, 80-160) when tx 0x70629e1d56bf tried to lock

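I ran the following commands in the Gremlin client console against a clean, completely empty database (recreated between test runs from a Docker image). The purpose of this work is to support database upgrade scripts. The real steps are more complete than the ones below, but this is the minimum needed to reproduce the problem.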

(Connect to local 'remote')
:remote connect tinkerpop.server conf/remote.yaml

(Add a unique constraint on a 'databaseMetadata' label which has a single 'version' property)
:> mgmt = graph.openManagement();if (!mgmt.getGraphIndex("bydatabaseMetadataversion")) {graph.tx().rollback();int size = graph.getOpenTransactions().size();for (i = 0; i < size; i++) { try { graph.getOpenTransactions().getAt(0).rollback();} catch(Throwable ex) { }; }; mgmt = graph.openManagement();propertyKey = (!mgmt.containsPropertyKey("version")) ? mgmt.makePropertyKey("version").dataType(String.class).cardinality(Cardinality.SINGLE).make():mgmt.getPropertyKey("version");labelObj = (!mgmt.containsVertexLabel("databaseMetadata")) ? mgmt.makeVertexLabel("databaseMetadata").make():mgmt.getVertexLabel("databaseMetadata");index = mgmt.buildIndex("bydatabaseMetadataversion", Vertex.class).addKey(propertyKey).unique().indexOnly(labelObj).buildCompositeIndex();mgmt.setConsistency(propertyKey, ConsistencyModifier.LOCK);mgmt.setConsistency(index, ConsistencyModifier.LOCK);mgmt.commit();mgmt = graph.openManagement();index = mgmt.getGraphIndex("bydatabaseMetadataversion");propertyKey = mgmt.getPropertyKey("version");if (index.getIndexStatus(propertyKey) == SchemaStatus.INSTALLED) {mgmt.awaitGraphIndexStatus(graph, "bydatabaseMetadataversion").status(SchemaStatus.REGISTERED).timeout(10, java.time.temporal.ChronoUnit.MINUTES).call();}; mgmt.commit();mgmt = graph.openManagement();index = mgmt.getGraphIndex("bydatabaseMetadataversion");propertyKey = mgmt.getPropertyKey("version");if (index.getIndexStatus(propertyKey) != SchemaStatus.ENABLED) {mgmt.commit();mgmt = graph.openManagement();mgmt.updateIndex(mgmt.getGraphIndex("bydatabaseMetadataversion"), SchemaAction.ENABLE_INDEX).get();mgmt.commit();mgmt = graph.openManagement();mgmt.awaitGraphIndexStatus(graph, "bydatabaseMetadataversion").status(SchemaStatus.ENABLED).timeout(10, java.time.temporal.ChronoUnit.MINUTES).call();}; mgmt.commit();} else {index = mgmt.getGraphIndex("bydatabaseMetadataversion");propertyKey = mgmt.getPropertyKey("version");if (index.getIndexStatus(propertyKey) != SchemaStatus.ENABLED) 
{mgmt.awaitGraphIndexStatus(graph, "bydatabaseMetadataversion").status(SchemaStatus.ENABLED).timeout(10, java.time.temporal.ChronoUnit.MINUTES).call();}; mgmt.commit();};

(Add the metadata vertex with initial version '0.0.1')
:> graph.addVertex(label, "databaseMetadata").property("version", "0.0.1");graph.tx().commit();

(Update the metadata vertex with the next version - 0.0.2)
:> g.V().hasLabel("databaseMetadata").has("version", "0.0.1").property("version", "0.0.2").next();g.tx().commit();

(THIS FAILS - Update the metadata vertex with the next version - 0.0.3)
:> g.V().hasLabel("databaseMetadata").has("version", "0.0.2").property("version", "0.0.3").next();g.tx().commit();
tx 0x705eafda280e already locked key-column (  8-  0-  0-  0-  0-  0-  0-128, 80-160) when tx 0x70629e1d56bf tried to lock

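I had previously looked at the titan-dynamodb source code and seen that transaction commits, rollbacks, and so on are logged, so I changed the log level to get more information (the full log file is available).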

When the 0.0.1 -> 0.0.2 update executes, the following locks are acquired:

titan_server_1  | 120479 [gremlin-server-exec-3] TRACE com.amazon.titan.diskstorage.dynamodb.AbstractDynamoDBStore  - acquiring lock on (  8-  0-  0-  0-  0-  0-  0-128, 80-160) at 123552624951495
titan_server_1  | 120489 [gremlin-server-exec-3] TRACE com.amazon.titan.diskstorage.dynamodb.AbstractDynamoDBStore  - acquiring lock on (  6-137-160- 48- 46- 48- 46-177,  0) at 123552635424334
titan_server_1  | 120489 [gremlin-server-exec-3] TRACE com.amazon.titan.diskstorage.dynamodb.AbstractDynamoDBStore  - acquiring lock on (  6-137-160- 48- 46- 48- 46-178,  0) at 123552635704705

When that transaction commits, only two of the locks are released:

titan_server_1  | 120722 [gremlin-server-exec-3] DEBUG com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreTransaction  - commit id:0x705eafda280e
titan_server_1  | 120722 [gremlin-server-exec-3] TRACE com.amazon.titan.diskstorage.dynamodb.AbstractDynamoDBStore  - Expiring (  6-137-160- 48- 46- 48- 46-177,  0) in tx 0x705eafda280e because of EXPLICIT
titan_server_1  | 120722 [gremlin-server-exec-3] TRACE com.amazon.titan.diskstorage.dynamodb.AbstractDynamoDBStore  - Expiring (  6-137-160- 48- 46- 48- 46-178,  0) in tx 0x705eafda280e because of EXPLICIT
titan_server_1  | 120722 [gremlin-server-exec-3] DEBUG org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor  - Preparing to iterate results from - RequestMessage{, requestId=09f27811-dcc3-4e53-a749-22828d34997f, op='eval', processor='', args={gremlin=g.V().hasLabel("databaseMetadata").has("version", "0.0.1").property("version", "0.0.2").next();g.tx().commit();, batchSize=64}} - in thread [gremlin-server-exec-3]

The remaining lock expires after a few minutes, but in the meantime all other updates fail as reported.
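The comparison of acquired and released locks in the log excerpts above can also be done mechanically. The following is only a rough sketch: the class name UnreleasedLockFinder is made up for illustration, and the two regular expressions assume the TRACE message formats shown in the excerpts ("acquiring lock on (...)" and "Expiring (...) in tx"). It reports every key-column that was acquired but never expired.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Titan or titan-dynamodb.
class UnreleasedLockFinder {
    // Assumed message formats, copied from the TRACE output in this post.
    private static final Pattern ACQUIRE = Pattern.compile("acquiring lock on (\\([^)]*\\))");
    private static final Pattern EXPIRE = Pattern.compile("Expiring (\\([^)]*\\)) in tx");

    /** Returns the key-columns that were acquired but never expired, in log order. */
    static Set<String> findUnreleased(List<String> logLines) {
        Set<String> held = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher acquired = ACQUIRE.matcher(line);
            if (acquired.find()) {
                held.add(acquired.group(1));
                continue;
            }
            Matcher expired = EXPIRE.matcher(line);
            if (expired.find()) {
                held.remove(expired.group(1));
            }
        }
        return held;
    }

    public static void main(String[] args) {
        // Message parts of the five lock-related log lines from the post.
        List<String> lines = Arrays.asList(
            "acquiring lock on (  8-  0-  0-  0-  0-  0-  0-128, 80-160) at 123552624951495",
            "acquiring lock on (  6-137-160- 48- 46- 48- 46-177,  0) at 123552635424334",
            "acquiring lock on (  6-137-160- 48- 46- 48- 46-178,  0) at 123552635704705",
            "Expiring (  6-137-160- 48- 46- 48- 46-177,  0) in tx 0x705eafda280e because of EXPLICIT",
            "Expiring (  6-137-160- 48- 46- 48- 46-178,  0) in tx 0x705eafda280e because of EXPLICIT");
        // Prints: [(  8-  0-  0-  0-  0-  0-  0-128, 80-160)]
        System.out.println(findUnreleased(lines));
    }
}
```

The one key-column left over is the same one named in the "already locked key-column" error.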

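So why isn't that lock being removed? I suspect it has something to do with the unique index I created, so either I've set the index up wrong (quite likely) or this is a bug.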

For ease of reading, the (slightly shortened) index setup is:

mgmt = graph.openManagement()
propertyKey = (!mgmt.containsPropertyKey("version")) ? mgmt.makePropertyKey("version").dataType(String.class).cardinality(Cardinality.SINGLE).make():mgmt.getPropertyKey("version")
labelObj = (!mgmt.containsVertexLabel("databaseMetadata")) ? mgmt.makeVertexLabel("databaseMetadata").make():mgmt.getVertexLabel("databaseMetadata")
index = mgmt.buildIndex("bydatabaseMetadataversion", Vertex.class).addKey(propertyKey).unique().indexOnly(labelObj).buildCompositeIndex()
mgmt.setConsistency(propertyKey, ConsistencyModifier.LOCK)
mgmt.setConsistency(index, ConsistencyModifier.LOCK)
mgmt.commit()
mgmt = graph.openManagement()
index = mgmt.getGraphIndex("bydatabaseMetadataversion")
propertyKey = mgmt.getPropertyKey("version")
if (index.getIndexStatus(propertyKey) == SchemaStatus.INSTALLED) {
  mgmt.awaitGraphIndexStatus(graph, "bydatabaseMetadataversion").status(SchemaStatus.REGISTERED).timeout(10, java.time.temporal.ChronoUnit.MINUTES).call()
}
mgmt.commit()
mgmt = graph.openManagement()
index = mgmt.getGraphIndex("bydatabaseMetadataversion")
propertyKey = mgmt.getPropertyKey("version")
if (index.getIndexStatus(propertyKey) != SchemaStatus.ENABLED) {
  mgmt.commit()
  mgmt = graph.openManagement()
  mgmt.updateIndex(mgmt.getGraphIndex("bydatabaseMetadataversion"), SchemaAction.ENABLE_INDEX).get()
  mgmt.commit()
  mgmt = graph.openManagement()
  mgmt.awaitGraphIndexStatus(graph, "bydatabaseMetadataversion").status(SchemaStatus.ENABLED).timeout(10, java.time.temporal.ChronoUnit.MINUTES).call()
}
mgmt.commit()

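I know this is a lengthy problem description, but any help would be much appreciated!

(I should also say that I tried this against both a local and a cloud-based DynamoDB instance and hit the same problem on both, so I went back to local and turned on logging.)

I'm using Titan 1.0.0 and TinkerPop 3 as set up in dynamo-titan on GitHub.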

FYI, I ran all of your code above in Java using the Berkeley storage backend.

TitanGraph graph = ...;
TitanManagement mgmt = graph.openManagement();
PropertyKey propertyKey = (!mgmt.containsPropertyKey("version"))
        ? mgmt.makePropertyKey("version").dataType(String.class).cardinality(Cardinality.SINGLE).make()
        : mgmt.getPropertyKey("version");
VertexLabel labelObj = (!mgmt.containsVertexLabel("databaseMetadata"))
        ? mgmt.makeVertexLabel("databaseMetadata").make() 
        : mgmt.getVertexLabel("databaseMetadata");
TitanGraphIndex index = mgmt.buildIndex("bydatabaseMetadataversion", Vertex.class).addKey(propertyKey).unique()
        .indexOnly(labelObj).buildCompositeIndex();
mgmt.setConsistency(propertyKey, ConsistencyModifier.LOCK);
mgmt.setConsistency(index, ConsistencyModifier.LOCK);
mgmt.commit();
mgmt = graph.openManagement();
index = mgmt.getGraphIndex("bydatabaseMetadataversion");
propertyKey = mgmt.getPropertyKey("version");
if (index.getIndexStatus(propertyKey) == SchemaStatus.INSTALLED) {
    try {
        ManagementSystem.awaitGraphIndexStatus(graph,"bydatabaseMetadataversion").status(SchemaStatus.REGISTERED).timeout(10, java.time.temporal.ChronoUnit.MINUTES).call();
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
mgmt.commit();
mgmt = graph.openManagement();
index = mgmt.getGraphIndex("bydatabaseMetadataversion");
propertyKey = mgmt.getPropertyKey("version");
if (index.getIndexStatus(propertyKey) != SchemaStatus.ENABLED) {
    mgmt.commit();
    mgmt = graph.openManagement();
    try {
        mgmt.updateIndex(mgmt.getGraphIndex("bydatabaseMetadataversion"), SchemaAction.ENABLE_INDEX).get();
    } catch (InterruptedException | ExecutionException e) {
        e.printStackTrace();
    }
    mgmt.commit();
    mgmt = graph.openManagement();
    try {
        ManagementSystem.awaitGraphIndexStatus(graph, "bydatabaseMetadataversion").status(SchemaStatus.ENABLED)
                        .timeout(10, java.time.temporal.ChronoUnit.MINUTES).call();
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
mgmt.commit();

Then the operations on the graph:

GraphTraversalSource g = graph.traversal();

graph.addVertex("databaseMetadata").property("version", "0.0.1");
graph.tx().commit();

g.V().hasLabel("databaseMetadata").has("version", "0.0.1").property("version", "0.0.2").iterate();
g.tx().commit();

g.V().hasLabel("databaseMetadata").has("version", "0.0.1").property("version", "0.0.2").iterate();
g.tx().commit();

g.V().hasLabel("databaseMetadata").has("version", "0.0.2").property("version", "0.0.3").iterate();
g.tx().commit();

g.V().hasLabel("databaseMetadata").has("version").properties("version").forEachRemaining(prop -> {
    System.out.println("Version: " + prop.value());
});

The result is:

Version: 0.0.3

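Unfortunately, the change of the queries to iterate() applies only to Java. Your script should work as is. Given the results of my experiment, I strongly suspect the DynamoDB backend is causing the problem.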

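I got a repro and found your issue. Basically, the LRU cache takes its expiry time from the storage.lock.expiry-time configuration. The default is 5 minutes, so if you try to make a change before those 5 minutes are up, then yes, the AbstractDynamoDBStore.keyColumnLocalLocks LRU cache will not let you make the second change. By reducing the expiry time and calling Thread.sleep() before the second change, you allow the second change to claim the lock again and succeed.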

//default lock expiry time is 300*1000 ms = 5 minutes. Set to 100ms.
config.setProperty("storage.lock.expiry-time", 100);
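To make the effect of the expiry time concrete, here is a toy model of a local lock table whose entries live for a fixed time. It is a sketch under stated assumptions, not the titan-dynamodb implementation: ExpiringLockTable, LockExpiryDemo, and the injected clock are all hypothetical names, and the model assumes only what is described above, namely that a lock left behind by a committed transaction blocks other transactions until the expiry time has passed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical toy model; it does not reproduce AbstractDynamoDBStore.
class ExpiringLockTable {
    private static final class Entry {
        final String owner;
        final long expiresAtMs;

        Entry(String owner, long expiresAtMs) {
            this.owner = owner;
            this.expiresAtMs = expiresAtMs;
        }
    }

    private final Map<String, Entry> locks = new HashMap<>();
    private final long expiryMs;
    private final LongSupplier clock; // injected so the demo needs no real sleep

    ExpiringLockTable(long expiryMs, LongSupplier clock) {
        this.expiryMs = expiryMs;
        this.clock = clock;
    }

    /** Returns true if txId holds the lock on key after this call. */
    synchronized boolean tryLock(String key, String txId) {
        long now = clock.getAsLong();
        Entry held = locks.get(key);
        if (held != null && held.expiresAtMs > now && !held.owner.equals(txId)) {
            return false; // another transaction's lock has not expired yet
        }
        locks.put(key, new Entry(txId, now + expiryMs));
        return true;
    }
}

class LockExpiryDemo {
    public static void main(String[] args) {
        long[] nowMs = {0};
        // 300 * 1000 ms mirrors the default expiry mentioned above.
        ExpiringLockTable table = new ExpiringLockTable(300_000, () -> nowMs[0]);

        System.out.println(table.tryLock("key-column", "tx1")); // true: tx1 takes the lock

        nowMs[0] = 1_000; // tx1 has committed, but its lock was never released
        System.out.println(table.tryLock("key-column", "tx2")); // false: still locked

        nowMs[0] = 300_001; // just past the expiry time
        System.out.println(table.tryLock("key-column", "tx2")); // true: lock has expired
    }
}
```

With a 100 ms expiry instead of 300,000 ms, the second attempt would succeed after a short Thread.sleep(), which is the workaround described above.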