How can I separate txt and pdf files coming out of the UnpackContent processor in Apache NiFi and keep them in separate queues?

Question

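I am trying to unzip a file that contains two file formats, PDF and txt. I am able to unzip it, and now I need to split the results and keep them in 2 separate queues: one holding the txt files and the other holding the PDF files. Can anyone help me with how to go about this?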

Answer 1

You need the RouteOnAttribute processor.

Add 2 properties to it:

match_txt = ${filename:toLower():endsWith('.txt')}
match_pdf = ${filename:toLower():endsWith('.pdf')}

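The processor will then have 2 output relationships, match_txt and match_pdf, named after the properties. Connect each one to its own downstream connection so that the txt and PDF files end up in separate queues. With the default routing strategy (Route to Property name), files that match neither expression go to the unmatched relationship.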
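
To make the routing logic concrete, here is a rough Python sketch of what the two expressions evaluate to for a given filename. It is only an illustration and not NiFi code: the `route` function and the `rules` dictionary are hypothetical names, and the "unmatched" fallback assumes the processor's default routing strategy.

```python
from typing import List


def route(filename: str) -> List[str]:
    """Return the relationship names a FlowFile with this filename would be routed to.

    Hypothetical helper that mimics the two Expression Language rules above;
    it is not part of NiFi.
    """
    rules = {
        # Mirrors ${filename:toLower():endsWith('.txt')}
        "match_txt": lambda name: name.lower().endswith(".txt"),
        # Mirrors ${filename:toLower():endsWith('.pdf')}
        "match_pdf": lambda name: name.lower().endswith(".pdf"),
    }
    matched = [relationship for relationship, test in rules.items() if test(filename)]
    # Assumption: nothing matched means the file goes to "unmatched".
    return matched or ["unmatched"]


if __name__ == "__main__":
    for name in ["notes.txt", "Report.PDF", "image.png"]:
        print(name, "->", route(name))
```

Because of toLower(), the match is case-insensitive, so a file named Report.PDF would still go to match_pdf.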
