Hibernate Search: How to update an index from two Spring (JVM) processes?

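My application has two components running in two different JVM processes:

1) A Spring Boot REST API

2) A Spring Boot batch application (processes the jobs submitted through the API)

I am using Hibernate Search with Spring, and both components need to update the search index.

However, it seems that whichever JVM process starts first acquires the lock, and when the other component tries to update the index, it throws the exception below.

How can I let both JVM processes update the index without running into this locking problem?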

2017-05-22 02:33:56.795 ERROR 14701 --- [del.FeatureMeta] o.h.s.exception.impl.LogErrorHandler     : HSEARCH000058: Exception occurred org.apache.lucene.store.LockObtainFailedException: Lock held by another program: /home/bisuser/cdna-meta-index/default/com.company.dsd.cdna.repository.model.FeatureMeta/write.lock
Primary Failure:
    Entity com.company.dsd.cdna.repository.model.FeatureMeta  Id 169  Work Type  org.hibernate.search.backend.UpdateLuceneWork


org.apache.lucene.store.LockObtainFailedException: Lock held by another program: /home/bisuser/cdna-meta-index/default/com.company.dsd.cdna.repository.model.FeatureMeta/write.lock
    at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:118) ~[lucene-core-5.5.4.jar!/:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
    at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) ~[lucene-core-5.5.4.jar!/:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
    at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) ~[lucene-core-5.5.4.jar!/:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:776) ~[lucene-core-5.5.4.jar!/:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
    at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:126) ~[hibernate-search-engine-5.6.1.Final.jar!/:5.6.1.Final]
    at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:92) ~[hibernate-search-engine-5.6.1.Final.jar!/:5.6.1.Final]
    at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriter(AbstractWorkspaceImpl.java:117) ~[hibernate-search-engine-5.6.1.Final.jar!/:5.6.1.Final]
    at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriterDelegate(AbstractWorkspaceImpl.java:203) ~[hibernate-search-engine-5.6.1.Final.jar!/:5.6.1.Final]
    at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:81) [hibernate-search-engine-5.6.1.Final.jar!/:5.6.1.Final]
    at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:46) [hibernate-search-engine-5.6.1.Final.jar!/:5.6.1.Final]
    at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.applyChangesets(SyncWorkProcessor.java:165) [hibernate-search-engine-5.6.1.Final.jar!/:5.6.1.Final]
    at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.run(SyncWorkProcessor.java:151) [hibernate-search-engine-5.6.1.Final.jar!/:5.6.1.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_112]

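The exclusive lock is necessary for Lucene to provide good performance. You can disable it, but it comes at a cost.

The reason is that there will still be a locking mechanism, although each lock will only be held for a short period of time. This means there will be no parallel writes to the index, and the other JVM process may have to wait while trying to acquire the lock (performance may be bad). What's worse, those attempts to acquire the lock may fail: they may time out. Given that the lock queue is not fair (the last JVM process to request the lock may be the first to get it) and that timeout handling is rather crude (try once, try again 2 seconds later, then time out), there is a high risk of failure here.

So disabling exclusive locking may be an option if you are sure there won't be any heavy contention on the lock (for instance, if heavy indexing jobs only run at night or during the weekend); otherwise, you may experience significant lag and/or failures due to lock timeouts. See hibernate.search.[default|<indexname>].exclusive_index_use in the documentation.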
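As a minimal sketch of what disabling the exclusive lock could look like, assuming both applications are configured through a Spring Boot application.properties file and rely on Spring Boot passing spring.jpa.properties.* entries through to Hibernate (neither is confirmed by the question), the setting would need to be present in both JVMs:

```properties
# Assumption: placed in application.properties of BOTH the REST API and the batch application.
# Spring Boot forwards everything under spring.jpa.properties.* to the JPA provider,
# so Hibernate Search sees hibernate.search.default.exclusive_index_use=false.
# "default" applies the setting to all indexes; replace it with an index name to target one index.
spring.jpa.properties.hibernate.search.default.exclusive_index_use=false
```

With this in place each process releases the index lock after applying its changes instead of holding it for its whole lifetime, so the contention and timeout caveats described above still apply.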

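Alternatively:

  • You could consider whether you really need to do these two things (REST API and batch processing) in different JVMs (but I guess you do).
  • You could have a look at the JGroups/JMS alternative architectures for Hibernate Search. However, those are admittedly harder to configure, and you should be aware that dynamic sharding won't work well.
  • If you need the JVMs to run on separate servers, and if horizontal scaling (adding more application servers) matters in your particular case, you could have a look at the experimental Elasticsearch integration.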