In NiFi with the QueryRecord processor, can we add a new column that is a regex of another column?

Question

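In NiFi, using the QueryRecord processor, can we add a new column whose value is a regex match on another column in the query?

For example: SELECT info, SUBSTRING(info, "([^\s]+)") as f_name FROM FLOWFILE

I don't want to split the FlowFile and then chain ExtractText, UpdateAttribute, AttributesToJSON, and MergeContent. That amounts to looping over every record, and with FlowFiles of about 400 MB, each holding 100k+ rows, it would take a long time.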

Input:

{"info":"Rachel: %Robert-100-400-4444-Mrs"}
{"info":": %Martin-200-500-5555-Mr"}
{"info":"%Holand-300-600-6666-Mr"}

Desired output:

{"info":"Rachel: %Robert-100-400-4444-Mrs", "f_name":"Rachel","l_name":"Robert","area_code":100,"last_four_digit":4444,"title":"Mrs"}
{"info":": %Martin-200-500-5555-Mr", "f_name":"","l_name":"Martin","area_code":200,"last_four_digit":5555,"title":"Mr"}
{"info":"%Holand-300-600-6666-Mr", "f_name":"","l_name":"Holand","area_code":300,"last_four_digit":6666,"title":"Mr"}

Answer 1

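QueryRecord lets you filter the records in a FlowFile using regular expressions via LIKE (one of the examples on the Additional Details page of the documentation shows this), but to update records you need to use UpdateRecord.

UpdateRecord uses the RecordPath DSL syntax, which also has regex functions such as replaceRegex and matchesRegex.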
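To make the regex side of this concrete, here is a minimal Python sketch of a capture-group pattern that splits the sample `info` values from the question into the requested columns. The pattern, the `enrich` helper, and the constant names are my own assumptions for illustration, not something taken from NiFi. The script only checks the regex logic outside NiFi and is not a NiFi configuration.

```python
import json
import re

# Hypothetical pattern (an assumption, not from the NiFi docs): one capture
# group per desired column. Group 1 is the optional first name before ":",
# group 2 the last name after "%", group 3 the area code, group 4 the last
# four digits, group 5 the title. The middle number block is not captured.
INFO_PATTERN = re.compile(r"^([^:%\s]*):?\s*%([^-]+)-(\d+)-\d+-(\d+)-(\w+)$")
FIELDS = ("f_name", "l_name", "area_code", "last_four_digit", "title")
NUMERIC_FIELDS = {"area_code", "last_four_digit"}


def enrich(record):
    """Return a copy of record with the columns parsed out of record['info']."""
    out = dict(record)
    match = INFO_PATTERN.match(record["info"])
    if match is None:
        return out  # leave records that do not fit the pattern untouched
    for name, value in zip(FIELDS, match.groups()):
        out[name] = int(value) if name in NUMERIC_FIELDS else value
    return out


# The three sample records from the question, one JSON object per line.
lines = [
    '{"info":"Rachel: %Robert-100-400-4444-Mrs"}',
    '{"info":": %Martin-200-500-5555-Mr"}',
    '{"info":"%Holand-300-600-6666-Mr"}',
]
enriched = [enrich(json.loads(line)) for line in lines]
for rec in enriched:
    print(json.dumps(rec))
```

In UpdateRecord, the same idea would presumably become one user-defined property per new field, for example a property named `/l_name` with a value along the lines of `replaceRegex(/info, '<pattern>', '$2')` and Replacement Value Strategy set to Record Path Value. I have not tested this: the writer's schema would need to include the new fields, and backslash escaping inside a RecordPath string literal may differ from the Python raw string used here.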

json

processor

apache-nifi

apache-calcite