How to make Logstash replace old data?

I have an Oracle database. Logstash retrieves data from Oracle and puts it into Elasticsearch.

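But when Logstash runs its scheduled export every 5 minutes, Elasticsearch fills up with duplicates because the old data is still there. That is to be expected: the state of Oracle hardly changes within those 5 minutes. Say 2-3 rows are added and 4-5 are deleted.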

How can I replace the old data with the new data, without duplicates?

For example:

Delete the whole old index;

Create a new index with the same name and the same configuration (nGram configuration and mapping);

Add all new data;

Wait for 5 minutes and repeat.

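This is simple: create a new index for each import and apply the mapping, then switch your alias to the newest index. Delete the old indices if needed. Your current data stays searchable the whole time the newest data is being indexed.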

Here are some resources you may want to read:

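Use an alias (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html) that points to the latest data when searching in Elasticsearch (by the way, using aliases is always a good idea).
Use the rollover API (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-rollover-index.html) to create a new index for each import run; pay attention to the alias handling here.
Use index templates (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html) to automatically apply mappings/settings to your newly created indices.
Shrink, close and/or delete old indices so that your cluster only handles the data you really need. Take a look at Curator (https://github.com/elastic/curator) as a standalone tool.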
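
As a rough illustration of the alias switch, here is a minimal Python sketch that only builds the request body for Elasticsearch's POST /_aliases endpoint, which applies all listed actions atomically. The index prefix "oracle-data", the alias "oracle-data-current" and both helper functions are hypothetical names chosen for this example; sending the request and creating the index are left out.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List


def new_index_name(prefix: str, now: datetime) -> str:
    """Return a lowercase, per-import index name built from a timestamp."""
    # Elasticsearch index names must be lowercase, hence the literal 't'.
    return f"{prefix}-{now.strftime('%Y%m%dt%H%M%S')}"


def alias_swap_body(alias: str, new_index: str, old_indices: List[str]) -> Dict:
    """Build the body for POST /_aliases that repoints `alias` in one call."""
    # Remove the alias from every old index and add it to the new one.
    actions = [{"remove": {"index": old, "alias": alias}} for old in old_indices]
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


if __name__ == "__main__":
    # Hypothetical names; replace them with your own prefix and alias.
    fresh = new_index_name("oracle-data", datetime.now(timezone.utc))
    body = alias_swap_body("oracle-data-current", fresh, ["oracle-data-20240101t120000"])
    print(json.dumps(body, indent=2))
```

Searches always go through the alias, so they keep hitting the old index until the swap and never see a half-filled one. Once the swap has succeeded, the old index can be deleted.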

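You only need to use a fingerprint/hash of each document, or a hash of the unique field in each document, as the document ID. That way every run overwrites an existing document with its updated version, while new documents are still added.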

However, this approach does not handle data deleted from Oracle.
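
To make the fingerprint-as-ID idea concrete, here is a small Python sketch, not a Logstash configuration. It assumes each row has a unique key column (hypothetically called "ID" here) and uses a plain dict as a stand-in for the Elasticsearch index; document_id and upsert are illustrative helpers, not part of any library. In Logstash itself the same effect is usually achieved with the fingerprint filter together with the elasticsearch output's document_id option.

```python
import hashlib
import json
from typing import Any, Dict, List


def document_id(row: Dict[str, Any], key_fields: List[str]) -> str:
    """Derive a stable document ID from the fields that identify a row."""
    key = {name: row[name] for name in key_fields}
    # Canonical JSON so the same key always hashes to the same ID.
    canonical = json.dumps(key, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def upsert(index: Dict[str, Dict[str, Any]], row: Dict[str, Any], key_fields: List[str]) -> None:
    """Store the row under its ID; an existing document with that ID is overwritten."""
    index[document_id(row, key_fields)] = row


if __name__ == "__main__":
    index: Dict[str, Dict[str, Any]] = {}  # stand-in for the real index
    upsert(index, {"ID": 1, "NAME": "old name"}, ["ID"])
    upsert(index, {"ID": 1, "NAME": "new name"}, ["ID"])  # same ID, overwrites
    upsert(index, {"ID": 2, "NAME": "another row"}, ["ID"])
    print(len(index))  # two documents, not three
```

Re-importing a row only ever overwrites its own document. A row that has disappeared from Oracle is never re-imported, so its document stays in the index, which is exactly the limitation mentioned above.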

n-gram

elasticsearch

logstash