Include base64 code of image in csv file using Nifi

Question

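I have a JSON array response from InvokeHTTP. I am using the following flow to convert some of the JSON information to CSV. One of the JSON fields is an id, which is used to fetch an image that is then converted to base64. I need to add this base64 code to my CSV. I don't understand how to save it in an attribute so that it can be picked up by AttributeToCsv.

Also, I read here https://community.cloudera.com/t5/Support-Questions/Nifi-attribute-containing-large-text-value/td-p/190513 that storing large values in attributes is not recommended because of memory issues. What is the best approach in this case?

JSON response from the first call: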

[ {
  "fileNumber" : "1",
  "uuid" : "abc",
  "attachedFiles" : [ {
    "id" : "bkjdbkjdsf",
    "name" : "image1.png"
  }, {
    "id" : "xzcv",
    "name" : "image2.png"
  } ],
  "date" : null
}, {
  "fileNumber" : "2",
  "uuid" : "def",
  "attachedFiles" : [ ],
  "date" : null
} ]

Final CSV (after merging, i.e. the expected output):

Id,File Name, File Data(base64 code)
bkjdbkjdsf,image1.png, iVBORw0KGgo...ji
xzcv,image2.png,ZEStWRGau..74

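My approach (open to change based on suggestions): after splitting the JSON response, I use EvaluateJsonPath to get "attachedFiles". I find the length of the "attachedFiles" array and then decide whether a further split is needed when there are 2 or more files. If the length is 0, I do nothing. In the second EvaluateJsonPath, I add the attributes Id and File Name and set their values from the JSON using $.id and so on. I then use the Id to call another URL, and I encode its response to Base64.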
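
For reference, the flattening step described above can be sketched outside NiFi in plain Groovy. This is only a minimal sketch, not the flow itself: it assumes the sample response above written as strict JSON (no trailing commas), and the variable names response and rows are made up for illustration.

```groovy
import groovy.json.JsonSlurper

// Sample response from the question, written as strict JSON (assumption: no trailing commas).
def response = '''
[ { "fileNumber": "1", "uuid": "abc",
    "attachedFiles": [ { "id": "bkjdbkjdsf", "name": "image1.png" },
                       { "id": "xzcv", "name": "image2.png" } ],
    "date": null },
  { "fileNumber": "2", "uuid": "def", "attachedFiles": [], "date": null } ]
'''

// collectMany flattens the per-record lists into one list of rows;
// a record with an empty attachedFiles array contributes nothing.
def rows = new JsonSlurper().parseText(response).collectMany { record ->
    record.attachedFiles.collect { f -> [id: f.id, name: f.name] }
}

// One "Id,File Name" line per attached file
rows.each { println "${it.id},${it.name}" }
```

Each entry in rows corresponds to one image that still has to be fetched by its id and encoded.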

Current output: a CSV file that still needs to be updated with the third column, File Data (base64 code), and its values:

Id,File Name
bkjdbkjdsf,image1.png
xzcv,image2.png

Answer 1

As a variant, use ExecuteGroovyScript:

def ff=session.get()
if(!ff)return

ff.write{sin, sout->
    sout.withWriter('UTF-8'){w->
        // write the values of attributes 'Id' and 'filename', delimited with a comma
        w << ff.attributes.with{a->[a.'Id', a.'filename']}.join(',')
        w << ',' // write the comma that precedes the file data

        // sin.withReader('UTF-8'){r-> w << r} // alternative: copy the current content of the file after the last comma
        w << sin.bytes.encodeBase64()
        w << '\n'
    }
}
REL_SUCCESS << ff

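UPD: I put sin.bytes.encodeBase64() in place of copying the flow file content. It creates a single-line base64 string for the input file. If you use this option, you should remove Base64EncodeContent to prevent double base64 encoding.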
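
To see what the single-line output and the double-encoding problem look like, here is a small standalone sketch that runs without NiFi. The helper csvRow and the sample bytes are hypothetical, introduced only for illustration; in the script above the same row is written into the flow file content, not into an attribute.

```groovy
// Hypothetical helper (not part of NiFi): builds one CSV row from an id, a file name and raw bytes.
String csvRow(String id, String fileName, byte[] content) {
    // encodeBase64() without arguments does not chunk its output, so the row stays on a single line.
    // The base64 alphabet contains no commas, so the third column needs no quoting.
    [id, fileName, content.encodeBase64().toString()].join(',') + '\n'
}

byte[] raw = 'not really a PNG'.getBytes('UTF-8')   // stand-in for the image bytes
def row = csvRow('bkjdbkjdsf', 'image1.png', raw)
print row

// What happens if the content was already base64-encoded upstream (e.g. by Base64EncodeContent):
def once = raw.encodeBase64().toString()
def twice = once.getBytes('UTF-8').encodeBase64().toString()
println "encoded once:  ${once}"
println "encoded twice: ${twice}"   // decoding this yields base64 text again, not the image bytes
```

Decoding the third column once should give back the original bytes; with the doubly encoded value, one decode only gets back to the base64 text.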

base64

apache-nifi