运行 gremlin 与 JAVA 异步查询时出现 OOM 错误

OOM error when running gremlin queries asynchronously with JAVA

我们创建了一个 rest API,它在 Janus 图上执行 gremlin 查询,returns 结果采用 JSON 格式。 API 用于小型结果集的工作文件。但是对于大型结果集,当我们异步命中 API 时,会出现以下错误,(最大堆大小 -Xmx4g

java.lang.OutOfMemoryError: GC overhead limit exceeded

我正在使用带有 & 的 curl 来异步命中 API,

curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query &
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query &
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query &
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query &

连接到 janus 图的代码,

cluster = Cluster.open(config);
connect = cluster.connect();

submit = connect.submit(gremlin);
Iterator<Result> resultIterator = submit.iterator();
int count=0;
while (resultIterator.hasNext()){
    //add to list, commented to check OOM error
}

配置,

config.setProperty("connectionPool.maxContentLength", "50000000");
config.setProperty("connectionPool.maxInProcessPerConnection", "30");
config.setProperty("connectionPool.maxInProcessPerConnection", "30");
config.setProperty("connectionPool.maxSize", "30");
config.setProperty("connectionPool.minSize", "1");
config.setProperty("connectionPool.resultIterationBatchSize", "200");

Gremlin 驱动程序,

org.apache.tinkerpop.gremlin-driver:3.4.6

如何像游标一样处理大结果集,以便不是所有数据都加载到内存中?
我缺少任何配置吗?非常感谢任何帮助。

Gremlin 查询:

g.withSack(0).V().hasLabel(%27material%27).has(%27dim_batchid%27,within(5028245,5080395,5366265,5159380,4872924,5093856,5216023,5068771,5093820,5154387,4703406,4872835,5214752,4893085,4866319,4556751,5342365,5075448,5074467,4835525,4987972,5347712,4986643,5204689,4755232,5076490,5028246,4922387,4659627,4597456,4743346,5080956,5370167,5260125,5134845,4613324,4720631,4937766,5356972,5148510,5210986,4930135,4984021,4720172,5028031,4836893,5068621,5333830,5020806,5081693,4988567,4869467,4709219,4958246,5021639,4607913,4923487,4614485,5066054,4869093,5339365,5204715,4980349,5215913,5342616,4959705,4959549,4929369,5022805,4920163,5204563,5027627,5208788,4712451,4862298,5019103,4982159,4727160,5395618,4924536,5390450,4943986,5071744,5208844,4898192,5347546,5204875,4710474,4794222,4962808,5269053,4836267,4602886,5359126,5393203,4780380,5148475,5092749,5351705,5339311,4601782,4869039,5366475,4959070,4963475,5346888,4923494,5279816,5297980,5154181,5030501,5142954,5392329,4839306,4890656,5134911,4893104,4989444,5069672,4961009,5027559,5029007,5285813,4820025,5287707,4959634,5148474,5362926,5362211,4557278,5353486,4933573,4785560,4890658,4930937,4553089,5030503,5341503,4783801,5068529,4821152,5208845,4766406,5043752,4770709,4733416,5204713,4815450,4981053,4963427,4980830,5340154,4771353,5204561,4920161,4794149,5275867,5021788,5364102,5205411,5356459,4794233,4923438,4610509,5392350,4746342,5022804,4936411,5361555,4890888,4980829,4959869,4869092,4891157,4815449,5267434,4836975,4684010,5281322,5071746,4711290,5289333,5021638,5299283,5210803,5348731,5068491,4776862,5196532,4766677,4930133,5210984,4608878,5261295,4826630,4786051,4779996,4930134,5020804,4766678,4869064,5286802,4545299,4693065,4930844,4816538,4888415,4711706,4923002,4780402,5044968,5148437,4753993,5074466,4890805,5074558,5076491,4547035,5092021,5262308,5205445,5213382,5159381,5263280,5351407,4890706,4659738,5344469,5075928,4613336,5065866,4863764,5217111,4792255,5210914,5204691,4890806,5148438,4986897,4817686,4712337,5196528,5280266,4929327,5134843,5393007,5019151,4923482,4763007,4929395)).emit().repeat(sack(sum).by(constant(1)).inE().outV()).project(%27level%27,%27properties%27).by(sack()).by(tree().by(valueMap().by(fold().unfold())).by(valueMap().by(fold())))

从分析中可以看出,问题很明显是 gremlin 驱动程序引起的,但我不确定如何修复它并释放内存。

此外,线程进入冻结状态超过 5 分钟,

我认为您可能 运行 陷入这个问题 TINKERPOP-2424. Basically the queue that holds the incoming results is filling faster than you can consume the results and you blow the heap. You can see that there is a patch there that in the issue that seems to solve the problem but I'm not convinced that it's the best solution just yet so it hasn't be implemented. If you have suggestions for how to resolve the problem, please feel free to comment on the ticket. If that is not the issue you are facing I think you'd have to provide a way to replicate your problem or do some profiling to isolate your issue further. Perhaps it would be good to do some profiling anyway as you should be able to prove that TINKERPOP-2424 is your problem that way. If you have a look at the mailing list link 因为 post 您应该看到验证问题的方法。