Jolt Transform Nested Grouping
I have JSON with the following flat structure:
[{
"PK": "1111",
"SOURCE_DB": "Oracle",
"CONTACT_TYPE": "Phone",
"CONTACT_SUBTYPE": "Work",
"EMAIL": null,
"PHONE_COUNTRY_CODE": "44",
"PHONE_NUMBER": "12345678",
"PHONE_EXT": "907643",
"STATUS": "Active"
}, {
"PK": "1111",
"SOURCE_DB": "Oracle",
"CONTACT_TYPE": "Phone",
"CONTACT_SUBTYPE": "Home",
"EMAIL": null,
"PHONE_COUNTRY_CODE": "353",
"PHONE_NUMBER": "87654321",
"PHONE_EXT": null,
"STATUS": "Active"
}, {
"PK": "1111",
"SOURCE_DB": "",
"CONTACT_TYPE": "Email",
"CONTACT_SUBTYPE": "Personal",
"EMAIL": "me@mail.com",
"PHONE_COUNTRY_CODE": null,
"PHONE_NUMBER": null,
"PHONE_EXT": null,
"STATUS": "Active"
},
{
"PK": "2222",
"SOURCE_DB": "DB2",
"CONTACT_TYPE": "Phone",
"CONTACT_SUBTYPE": "Home",
"EMAIL": null,
"PHONE_COUNTRY_CODE": "44",
"PHONE_NUMBER": "98761234",
"PHONE_EXT": null,
"STATUS": "Inactive"
}, {
"PK": "2222",
"SOURCE_DB": "DB2",
"CONTACT_TYPE": "Email",
"CONTACT_SUBTYPE": "Work",
"EMAIL": "you@mail.co.uk",
"PHONE_COUNTRY_CODE": null,
"PHONE_NUMBER": null,
"PHONE_EXT": null,
"STATUS": "Active"
}
]
I then want to group the entries, first by key (PK), and then, within each account, group the contact methods by type. This is the expected output:
{
"Accounts": [{
"Reference": {
"Key": "1111",
"System": "Oracle"
},
"ContactMethods": {
"Phone": [{
"Subtype": "Work",
"CountryCode": "44",
"Number": "12345678",
"Extension": "907643",
"Active": true
}, {
"Subtype": "Home",
"CountryCode": "353",
"Number": "87654321",
"Extension": null,
"Active": true
}
],
"Email": [{
"Subtype": "Personal",
"EmailAddress": "me@mail.com",
"Active": true
}
]
}
}, {
"Reference": {
"Key": "2222",
"System": "DB2"
},
"ContactMethods": {
"Phone": [{
"Subtype": "Home",
"CountryCode": "44",
"Number": "98761234",
"Extension": null,
"Active": false
}
],
"Email": [{
"Subtype": "Work",
"EmailAddress": "you@mail.co.uk",
"Active": true
}
]
}
}
]
}
I can group by PK, but I'm stuck on the second part: how to group within the nested structure. Could you show a sample spec with some explanation?
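It's possible, but the result is convoluted and verbose. This is pushing the boundaries of what Jolt is meant to do.
A single pivot plus some remapping is maintainable, but this is complex enough that it will be hard to debug if something goes wrong or your data is unusual.
It takes five steps: two to convert STATUS from a string to a boolean, two for the pivot and the sub-pivot, and a final one to put everything in its final place.
I recommend running each step on its own, in a separate tab/copy of the Jolt demo site, to see and understand what each step is doing.
Spec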
[
{
// sneak in a true and a false value on every entry so that
// STATUS "Active" / "Inactive" can be "mapped" to booleans
"operation": "default",
"spec": {
"*": {
"FALSE": false,
"TRUE": true
}
}
},
{
// fix STATUS
"operation": "shift",
"spec": {
"*": {
//
"STATUS": {
// Match "Active" and set STATUS to true
"Active": {
"@(2,TRUE)": "[&3].STATUS"
},
// Everything else set to false
"*": {
"@(2,FALSE)": "[&3].STATUS"
}
},
// match and discard TRUE and FALSE
"TRUE|FALSE": null,
// pass everything else thru
"*": "[&1].&"
}
}
},
{
// now, group by PK value
"operation": "shift",
"spec": {
// top level array
"*": {
"PK": {
"*": { // match any value of PK
// go back up, grab the whole block, and write it
// to the output under a key that is the value of PK
"@2": "&1[]"
}
}
}
}
},
{
// sub group by CONTACT_TYPE, with the complication of
// pulling one entry off to serve as the "Reference"
"operation": "shift",
"spec": {
"*": { // pk value
"0": { // special-case the zeroth item so that
// we can pull off one copy to serve as the
// Reference
"@": "&2.Reference",
// sub group by CONTACT_TYPE
"CONTACT_TYPE": {
"*": {
"@2": "&4.ContactMethods.&1[]"
}
}
},
"*": { // all the rest of the array indices
// sub group by CONTACT_TYPE
"CONTACT_TYPE": {
"*": {
"@2": "&4.ContactMethods.&1[]"
}
}
}
}
}
},
{
// Data fixing and Grouping done, now put everything
// in its final place
"operation": "shift",
"spec": {
"*": { // top level pk
"Reference": {
"PK": "Accounts[#3].Reference.Key",
"SOURCE_DB": "Accounts[#3].Reference.System"
},
"ContactMethods": {
"Phone": {
"*": {
"CONTACT_SUBTYPE": "Accounts[#5].ContactMethods.Phone[&1].Subtype",
"PHONE_COUNTRY_CODE": "Accounts[#5].ContactMethods.Phone[&1].CountryCode",
"PHONE_NUMBER": "Accounts[#5].ContactMethods.Phone[&1].Number",
"PHONE_EXT": "Accounts[#5].ContactMethods.Phone[&1].Extension",
"STATUS": "Accounts[#5].ContactMethods.Phone[&1].Active"
}
},
"Email": {
"*": {
"CONTACT_SUBTYPE": "Accounts[#5].ContactMethods.Email[&1].Subtype",
"EMAIL": "Accounts[#5].ContactMethods.Email[&1].EmailAddress",
"STATUS": "Accounts[#5].ContactMethods.Email[&1].Active"
}
}
}
}
}
}
]
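For cross-checking what the spec is meant to produce, here is a minimal sketch of the same two-level grouping in plain Python. It is not Jolt and not part of the spec above; the function name `group_contacts` is made up for illustration, and it assumes CONTACT_TYPE is only ever "Phone" or "Email". Like the spec, it takes the Reference from the first entry seen for each PK and treats any STATUS other than "Active" as false.

```python
import json


def group_contacts(rows):
    """Group flat contact rows by PK, then by CONTACT_TYPE (hypothetical helper)."""
    accounts = {}  # PK -> account; dicts keep insertion order
    for row in rows:
        # The first row seen for a PK supplies the Reference block,
        # mirroring the "0" special case in the fourth Jolt step.
        account = accounts.setdefault(row["PK"], {
            "Reference": {"Key": row["PK"], "System": row["SOURCE_DB"]},
            "ContactMethods": {},
        })
        # Same rule as the STATUS fix: "Active" -> True, anything else -> False.
        active = row["STATUS"] == "Active"
        if row["CONTACT_TYPE"] == "Phone":
            entry = {
                "Subtype": row["CONTACT_SUBTYPE"],
                "CountryCode": row["PHONE_COUNTRY_CODE"],
                "Number": row["PHONE_NUMBER"],
                "Extension": row["PHONE_EXT"],
                "Active": active,
            }
        else:  # assumed to be "Email"
            entry = {
                "Subtype": row["CONTACT_SUBTYPE"],
                "EmailAddress": row["EMAIL"],
                "Active": active,
            }
        # Sub-group by contact type inside the account.
        account["ContactMethods"].setdefault(row["CONTACT_TYPE"], []).append(entry)
    return {"Accounts": list(accounts.values())}


if __name__ == "__main__":
    sample = [
        {"PK": "2222", "SOURCE_DB": "DB2", "CONTACT_TYPE": "Phone",
         "CONTACT_SUBTYPE": "Home", "EMAIL": None, "PHONE_COUNTRY_CODE": "44",
         "PHONE_NUMBER": "98761234", "PHONE_EXT": None, "STATUS": "Inactive"},
        {"PK": "2222", "SOURCE_DB": "DB2", "CONTACT_TYPE": "Email",
         "CONTACT_SUBTYPE": "Work", "EMAIL": "you@mail.co.uk",
         "PHONE_COUNTRY_CODE": None, "PHONE_NUMBER": None, "PHONE_EXT": None,
         "STATUS": "Active"},
    ]
    print(json.dumps(group_contacts(sample), indent=2))
```

Comparing this output with what the Jolt demo site gives for the same input is one way to spot where a step of the spec goes wrong.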