Endeca - 基线更新失败 - 从记录存储读取错误

Endeca - Baseline update failed - Error reading from Record Store

ATG - Endeca Baseline update is getting failed with the following error in ART apps. But partial indexing is successful.

附加相应错误的 CAS 日志。

Aug 15, 2017 12:23:37 PM com.endeca.soleng.eac.toolkit.script.Script runBeanShellScript
SEVERE: Crawl 'ART-last-mile-crawl' failed with error: Problem running full acquisition on data source for ART-last-mile-crawl: Error reading from Record Store ART-data: malformed input around byte 10.
Occurred while executing line 11 of valid BeanShell script:
[[

 8|      Dgidx.cleanDirs();
 9|
10|      // run crawl and archive any changes in dvalId mappings
11|      CAS.runBaselineCasCrawl("ART-last-mile-crawl");
12|      CAS.archiveDvalIdMappingsForCrawlIfChanged("ART-last-mile-crawl");
13|
14|      // archive logs and run the indexer

]]

Aug 15, 2017 12:23:37 PM com.endeca.soleng.eac.toolkit.Controller execute
SEVERE: Caught an exception while invoking method 'run' on object 'BaselineUpdate'. Releasing locks.
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

In CAS the error logs are:

2017-08-15 12:23:36,485 ERROR [ART-data] [cas-ART-last-mile-crawl-worker-1] com.endeca.itl.recordstore.impl.RecordStoreImpl: Error executing method RecordStoreImpl.readRecords()
com.endeca.itl.recordstore.RecordStoreException: malformed input around byte 10
        at com.endeca.itl.recordstore.impl.ReadCursor.read(ReadCursor.java:81)
        at com.endeca.itl.recordstore.impl.RecordStoreImpl.readRecords(RecordStoreImpl.java:480)
        at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at com.endeca.itl.service.ServicePublisher.invoke(ServicePublisher.java:121)
        at com.sun.proxy.$Proxy57.readRecords(Unknown Source)
        at com.endeca.itl.recordstore.RecordStoreReader.fetchNextChunk(RecordStoreReader.java:267)
        at com.endeca.itl.recordstore.RecordStoreReader.hasNext(RecordStoreReader.java:244)
        at com.endeca.itl.extension.source.merger.RecordStoreMergerDataSourceRuntime$RecordStoreReadSession.runFull(RecordStoreMergerDataSourceRuntime.java:252)
        at com.endeca.itl.extension.source.merger.RecordStoreMergerDataSourceRuntime.runFullAcquisition(RecordStoreMergerDataSourceRuntime.java:148)
        at com.endeca.itl.util.CasExtensionRegistry$ContextClassLoaderDataSourceExtensionRuntime.doWork(CasExtensionRegistry.java:220)
        at com.endeca.itl.util.CasExtensionRegistry$ContextClassLoaderDataSourceExtensionRuntime.doWork(CasExtensionRegistry.java:218)
        at com.endeca.itl.plugin.ThreadContextRunner.run(ThreadContextRunner.java:136)
        at com.endeca.itl.plugin.ThreadContextRunner.run(ThreadContextRunner.java:89)
        at com.endeca.itl.util.CasExtensionRegistry$ContextClassLoaderDataSourceExtensionRuntime.runFullAcquisition(CasExtensionRegistry.java:218)
        at com.endeca.itl.executor.extension.ExtensionDataSourceProcessor.processRecord(ExtensionDataSourceProcessor.java:104)
        at com.endeca.itl.executor.extension.IncrementalDataSourceProcessor.processRecord(IncrementalDataSourceProcessor.java:106)
        at com.endeca.itl.executor.TaskManager.work(TaskManager.java:166)
        at com.endeca.itl.executor.WorkExecutor$WorkRunnable.run(WorkExecutor.java:194)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
        at com.endeca.itl.util.LoggingContextAwareThread.run(LoggingContextAwareThread.java:71)
Caused by: java.io.UTFDataFormatException: malformed input around byte 10
        at java.io.DataInputStream.readUTF(DataInputStream.java:656)
        at java.io.DataInputStream.readUTF(DataInputStream.java:564)
        at com.endeca.itl.recordstore.impl.storage.RecordStorageEntry.load(RecordStorageEntry.java:114)

关于如何进一步分类和调试的任何输入都会有所帮助

完成以下步骤后,上述错误已解决。

  1. 步骤 1 - 添加以下更改后导出和导入 CAS 配置。 <ignoreInvalidRecords>true</ignoreInvalidRecords>

    recordstore-cmd.sh get-configuration -a ART-data -f dataConfig.xml
    recordstore-cmd.sh set-configuration -a ART-data -f dataConfig.xml
    
  2. 第 2 步 - 在 <CAS_WS>/workspace/state/ART-data

    下的记录存储配置文件中进行了相同的更改
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <recordStoreConfiguration xmlns="http://recordstore.itl.endeca.com/">
        <changePropertyNames/>
        <idPropertyName>record.id</idPropertyName>
        <ignoreInvalidRecords>true</ignoreInvalidRecords><!-- newly added -->
        <jdbmSettings/>
    </recordStoreConfiguration>
    
  3. 第 3 步 - cas_output 文件夹 <Endeca_apps>/ART/data/cas_output 已替换为可用的备份(2 天前)。

经过上述步骤,索引是直接从后端启动的(通过调用脚本)。一旦这个索引成功,就从发电机调用索引并且同样成功。