Flatten Invoice Header with Invoice Lines in Kettle

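If you have an invoice header with several values (invoice number, date, location) and an unknown number of invoice lines, each with several values (product, price, tax), is there a way to flatten this data into a single row that grows with the number of invoice lines, given that the count varies from invoice to invoice?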


Input example:

{"InvoiceRecords": [{
    "InvoiceDate": "8/9/2017 12:00:00 AM",
    "InvoiceLocation": "002",
    "InvoiceNumber": "2004085",
    "InvoiceRecordHeaderDetails": [{
        "InvNum": "2004085",
        "Location": "002",
        "InvDate": "8/9/2017 12:00:00 AM"
    }],
    "InvoiceRecordLineItemDetails": [{
        "UniqueID": "3939934",
        "InvNum": "2004085",
        "LINEITEM": "1",
        "CUSTID": "PREAA",
        "DEPTID": "320306",
        "PRODID": "088856",
        "ProdDesc": "STATE UST",
        "Unitprice": "0.003",
        "QuantShare": "237.5",
        "TaxRate": "7.25",
        "taxamount": "0.05"
    }],
    "InvoiceTaxCodeDetails": [{
        "InvNum": "2004085",
        "LineItem": "1",
        "UniqueID": "34",
        "taxCode": "SALES TAX",
        "taxrate": "7.25",
        "maxtax": "0"
    }]
}]}

I need all the items in the same row (allowing for multiple line items and/or multiple tax code items on a given invoice record).


Output example (note: "_n" below stands for an undetermined number of invoice lines and possible tax lines):

{"InvoiceRecords": [{
    "InvoiceDate": "8/9/2017 12:00:00 AM",
    "InvoiceLocation": "002",
    "InvoiceNumber": "2004085",
    "InvoiceRecordHeaderDetailsInvNum": "2004085",
    "InvoiceRecordHeaderDetailsInvNumLocation": "002",
    "InvoiceRecordHeaderDetailsInvNumInvDate": "8/9/2017 12:00:00 AM",
    "InvoiceRecordLineItemDetailsUniqueID_1": "3939934",
    "InvoiceRecordLineItemDetailsInvNum_1": "2004085",
    "InvoiceRecordLineItemDetailsLINEITEM_1": "1",
    "InvoiceRecordLineItemDetailsCUSTID_1": "PREAA",
    "InvoiceRecordLineItemDetailsDEPTID_1": "320306",
    "InvoiceRecordLineItemDetailsPRODID_1": "088856",
    "InvoiceRecordLineItemDetailsProdDesc_1": "STATE UST",
    "InvoiceRecordLineItemDetailsUnitprice_1": "0.003",
    "InvoiceRecordLineItemDetailsQuantShare_1": "237.5",
    "InvoiceRecordLineItemDetailsTaxRate_1": "7.25",
    "InvoiceRecordLineItemDetailstaxamount_1": "0.05",
    "InvoiceTaxCodeDetailsInvNum_1": "2004085",
    "InvoiceTaxCodeDetailsLineItem_1": "1",
    "InvoiceTaxCodeDetailsUniqueID_1": "34",
    "InvoiceTaxCodeDetailstaxCode_1": "SALES TAX",
    "InvoiceTaxCodeDetailstaxrate_1": "7.25",
    "InvoiceTaxCodeDetailsmaxtax_1": "0",
    "InvoiceRecordLineItemDetailsUniqueID_n": "3939934",
    "InvoiceRecordLineItemDetailsInvNum_n": "2004085",
    "InvoiceRecordLineItemDetailsLINEITEM_n": "1",
    "InvoiceRecordLineItemDetailsCUSTID_n": "PREAA",
    "InvoiceRecordLineItemDetailsDEPTID_n": "320306",
    "InvoiceRecordLineItemDetailsPRODID_n": "088856",
    "InvoiceRecordLineItemDetailsProdDesc_n": "STATE UST",
    "InvoiceRecordLineItemDetailsUnitprice_n": "0.003",
    "InvoiceRecordLineItemDetailsQuantShare_n": "237.5",
    "InvoiceRecordLineItemDetailsTaxRate_n": "7.25",
    "InvoiceRecordLineItemDetailstaxamount_n": "0.05",
    "InvoiceTaxCodeDetailsInvNum_n": "2004085",
    "InvoiceTaxCodeDetailsLineItem_n": "1",
    "InvoiceTaxCodeDetailsUniqueID_n": "34",
    "InvoiceTaxCodeDetailstaxCode_n": "SALES TAX",
    "InvoiceTaxCodeDetailstaxrate_n": "7.25",
    "InvoiceTaxCodeDetailsmaxtax_n": "0"
}]}

Thanks!

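There is an example of a similar problem in the samples directory that ships next to your spoon.bat. Look at samples/transformation/XML Add and do not be put off by the first shock: the sample does something far more complicated than you need, just to show everything that is possible.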

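In your case, split the input flow with a Switch/Case into header, item, and footer flows, and manage to keep the InvoiceNumber on each item (more on that later). Convert the three flows to JSON (with a JSON Output step, or perhaps more easily with Javascript). Then Group by the items on InvoiceNumber. Join the three flows on InvoiceNumber; for that I suggest a Stream lookup from the header flow, then another Stream lookup for the footer flow. With another Javascript step that treats the data as strings, you can build JSON rows of the form {header, [item], footer}, which you can then Group by to concatenate into a single row.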
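To make the target shape concrete, here is a minimal sketch in plain JavaScript, outside Kettle, of the flattening the question asks for. The function name `flattenInvoice`, its `unsuffixed` parameter, and the second line item in the sample are made up for illustration; the key naming (array name + field name + `_i`) is an assumption based on the output example in the question, not something Kettle produces by itself.

```javascript
// Hypothetical helper, not a Kettle API: flattens one invoice record.
// Array-valued fields become "<arrayName><field>_<i>" keys, with i starting at 1.
// Arrays named in `unsuffixed` are expanded without the "_<i>" suffix.
function flattenInvoice(record, unsuffixed) {
  var out = {};
  Object.keys(record).forEach(function (key) {
    var value = record[key];
    if (!Array.isArray(value)) {
      out[key] = value; // scalar header field, copied as-is
      return;
    }
    value.forEach(function (item, index) {
      var suffix = unsuffixed.indexOf(key) >= 0 ? "" : "_" + (index + 1);
      Object.keys(item).forEach(function (field) {
        out[key + field + suffix] = item[field];
      });
    });
  });
  return out;
}

// Trimmed-down sample; the second line item is invented for illustration.
var sample = {
  InvoiceNumber: "2004085",
  InvoiceRecordHeaderDetails: [{ InvNum: "2004085", Location: "002" }],
  InvoiceRecordLineItemDetails: [
    { LINEITEM: "1", PRODID: "088856" },
    { LINEITEM: "2", PRODID: "088857" }
  ]
};
var flat = flattenInvoice(sample, ["InvoiceRecordHeaderDetails"]);
```

In this sketch `flat` holds one key per header field plus one suffixed key per field of each line item, so the row simply grows wider as the number of line items grows.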

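It is some work, but fairly standard, except for the tricky part of getting the InvoiceNumber onto the item and footer rows, because it has disappeared from the flow by then. For that, you can use the fact that Javascript keeps a variable's value from row to row unless it is redefined. Add a new start script (right-click Script1 at the top of the tab, add a copy, right-click the newly created Script1_0, and set it as Start script).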

In this start script:

var PrevInvoiceNumber = -1;

In the main script:

// Remember the last non-empty InvoiceNumber seen in the stream.
if (InvoiceNumber && PrevInvoiceNumber != InvoiceNumber)
    PrevInvoiceNumber = InvoiceNumber;

You should then see PrevInvoiceNumber on every row, equal to the InvoiceNumber of the invoice that row belongs to.
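The carry-forward behaviour can be checked outside Kettle with a small stand-alone simulation. This is only a sketch: `processRow`, the `rows` array, and the second invoice number are invented for illustration, and the function wrapper stands in for the main script being run once per row.

```javascript
// Stand-alone simulation of the two scripts; the row values are illustrative.
var PrevInvoiceNumber = -1; // start script: runs once before the first row

// Main script: runs for every row and returns the carried-forward number.
function processRow(row) {
  var InvoiceNumber = row.InvoiceNumber;
  if (InvoiceNumber && PrevInvoiceNumber != InvoiceNumber) {
    PrevInvoiceNumber = InvoiceNumber;
  }
  return PrevInvoiceNumber;
}

var rows = [
  { InvoiceNumber: "2004085" }, // header row carries the number
  { InvoiceNumber: null },      // line item row: number missing
  { InvoiceNumber: null },      // tax code row: number missing
  { InvoiceNumber: "2004086" }, // next invoice header (made-up number)
  { InvoiceNumber: null }       // its line item row
];
var carried = rows.map(processRow);
```

The sketch assumes the rows reach the script in their original order, with each header row ahead of its item and tax rows; if rows are reordered or processed in parallel copies, the carried value may no longer match the right invoice.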