Solr delta-import erases index

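I'm having a problem with Solr delta-import from a MySQL database. Full imports work without any issues. When I run a delta-import, it imports the changed records (as expected) but wipes the rest of the index, so only the updated records remain. There are no errors in the logs. Is something missing from my configuration? I'm running Solr 5.4 on an Ubuntu server and using the admin UI.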

<dataConfig>
    <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/ibnet" user="xxxx" password="xxxxx" />
    <document>
    <entity name="profile" pk="profile.id" query="
        SELECT 
            profile.id AS id,
            profile.profile_status AS profile_status,
            //
            // Other fields
            //
            linkedProfile.org_name AS linked_org_name,
            linkedProfile.org_city AS linked_org_city,
            linkedProfile.org_st_prov_reg AS linked_org_st_prov_reg,
            linkedProfile.org_country AS linked_org_country
        FROM profile AS profile
        LEFT JOIN profile AS linkedProfile ON linkedProfile.id = profile.linked_id" 
        deltaImportQuery="
            SELECT 
                profile.id AS id,
                profile.profile_status AS profile_status,
                //
                // Other fields
                //
                linkedProfile.org_name AS linked_org_name,
                linkedProfile.org_city AS linked_org_city,
                linkedProfile.org_st_prov_reg AS linked_org_st_prov_reg,
                linkedProfile.org_country AS linked_org_country
            FROM profile AS profile
            LEFT JOIN profile AS linkedProfile ON linkedProfile.id = profile.linked_id
            WHERE profile.id = '${dih.delta.id}'"
        deltaQuery="SELECT profile.id FROM profile WHERE last_modified > '${dih.last_index_time}'"
        onError="skip" >
    </entity>
    </document>
</dataConfig>

Edit: I changed dih.delta.id to dataimporter.delta.id, and did the same for last_index_time, but that did not change the result.

Here is the response:

{
  "responseHeader": {
    "status": 0,
    "QTime": 0
  },
  "initArgs": [
    "defaults",
    [
      "config",
      "data-config.xml"
    ]
  ],
  "command": "status",
  "status": "idle",
  "importResponse": "",
  "statusMessages": {
    "Total Requests made to DataSource": "4",
    "Total Rows Fetched": "6",
    "Total Documents Processed": "3",
    "Total Documents Skipped": "0",
    "Delta Dump started": "2016-05-01 02:38:03",
    "Identifying Delta": "2016-05-01 02:38:03",
    "Deltas Obtained": "2016-05-01 02:38:03",
    "Building documents": "2016-05-01 02:38:03",
    "Total Changed Documents": "3",
    "": "Indexing completed. Added/Updated: 3 documents. Deleted 0 documents.",
    "Committed": "2016-05-01 02:38:03",
    "Time taken": "0:0:0.317"
  }
}

In Solr Admin -> your core -> Dataimport, there is a Clean option. If it is checked, the existing index data is cleared before the import runs (for both full-import and delta-import).
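If you trigger the import over HTTP rather than through the admin UI, the same setting is the `clean` request parameter. Below is a minimal sketch of building such a request URL with `clean=false` set explicitly. The host, the core name `mycore`, and the handler path `/dataimport` are assumptions for illustration; use whatever your solrconfig.xml actually registers.

```python
from urllib.parse import urlencode


def delta_import_url(base_url, core, clean=False, commit=True):
    """Build a DIH delta-import URL with the clean flag spelled out.

    base_url and core are placeholders; the "/dataimport" path assumes the
    handler is registered under that name in solrconfig.xml.
    """
    params = {
        "command": "delta-import",
        "clean": str(clean).lower(),    # "false" keeps the existing documents
        "commit": str(commit).lower(),  # commit once the import finishes
    }
    return f"{base_url}/{core}/dataimport?{urlencode(params)}"


# Hypothetical host and core name, for illustration only.
url = delta_import_url("http://localhost:8983/solr", "mycore")
print(url)
```

Requesting that URL (for example with curl or a browser) should run the delta-import without wiping the documents that did not change.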

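Another tip: Solr DIH always uses UTC for the import timestamp, so what is your time zone? Convert the datetime column in your database to UTC first, and only then compare it with dih.last_index_time.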