Copying a document with varying config through a JOLT transform
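My goal is to take an input document that has an array in a subtree, copy the entire document into an array of document copies, and in each successive copy set an individual value from that array.
For example:
Starting document: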
{
  "config": {
    "activeConfig": {
      "sourceDatabase": "test",
      "targetSites": [
        {
          "siteName": "location1",
          "targetDatabase": "devl",
          "siteShortName": "123"
        },
        {
          "siteName": "location2",
          "targetDatabase": "123",
          "siteShortName": "123"
        }
      ]
    }
  },
  "secondData": {
    "queries": [
      {
        "Tablename": "abc",
        "Query": "123"
      }
    ]
  }
}
Expected output:
[ {
"config" : {
"activeConfig" : {
"sourceDatabase" : "test",
"targetSites" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ],
"currentSite" : {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}
}
},
"secondData" : {
"queries" : [ {
"Tablename" : "abc",
"Query" : "123"
} ]
}
},
{
"config" : {
"activeConfig" : {
"sourceDatabase" : "test",
"targetSites" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ],
"currentSite" : {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
}
}
},
"secondData" : {
"queries" : [ {
"Tablename" : "abc",
"Query" : "123"
} ]
}
} ]
My current JOLT spec is as follows:
[
  {
    "operation": "shift",
    "spec": {
      "config": {
        "activeConfig": {
          "targetSites": {
            "*": {
              "@4": "[]",
              "@": "[].config.activeConfig.currentSite"
            }
          }
        }
      }
    }
  }
]
This gets me close, but not quite there:
[ {
"config" : {
"activeConfig" : {
"sourceDatabase" : "test",
"targetSites" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ]
}
},
"secondData" : {
"queries" : [ {
"Tablename" : "abc",
"Query" : "123"
} ]
}
}, {
"config" : {
"activeConfig" : {
"currentSite" : {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}
}
}
}, {
"config" : {
"activeConfig" : {
"sourceDatabase" : "test",
"targetSites" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ]
}
},
"secondData" : {
"queries" : [ {
"Tablename" : "abc",
"Query" : "123"
} ]
}
}, {
"config" : {
"activeConfig" : {
"currentSite" : {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
}
}
}
} ]
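This spec creates the structure I am looking for but does not merge the pieces. My final array ends up with 4 items: 2 copies of the original document, plus the two items from the config array. My goal is to merge those two items into the document copies, so that I have two copies of the original document, each configured with one value.
The only other spec I have come up with is:
[
  {
    "operation": "shift",
    "spec": {
      "config": {
        "activeConfig": {
          "targetSites": {
            "*": {
              "@4": "[&]",
              "@": "[&].config.activeConfig.currentSite"
            }
          }
        }
      }
    }
  }
]
This results in two copies of the document in the final array, but the currentSite section ends up with all of the values from the config array in every copy, instead of one per copy: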
[ {
"config" : {
"activeConfig" : {
"sourceDatabase" : "test",
"targetSites" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ],
"currentSite" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ]
}
},
"secondData" : {
"queries" : [ {
"Tablename" : "abc",
"Query" : "123"
} ]
}
}, {
"config" : {
"activeConfig" : {
"sourceDatabase" : "test",
"targetSites" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ],
"currentSite" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ]
}
},
"secondData" : {
"queries" : [ {
"Tablename" : "abc",
"Query" : "123"
} ]
}
} ]
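(As for why: the next step for this document in the NiFi flow is to split it into two flow files, which will allow each file to be configured individually.)
Thanks for any input or help you can provide.
Update:
I found another interesting behavior that I have trouble understanding.
When I use the following spec, I get output that does not make sense to me.
Spec:
[
  {
    "operation": "shift",
    "spec": {
      "config": {
        "activeConfig": {
          "targetSites": {
            "*": {
              "@4": "[&]",
              "@": "[&].config.activeConfig.currentSite&"
            }
          }
        }
      }
    }
  }
]
Output: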
[ {
"config" : {
"activeConfig" : {
"sourceDatabase" : "test",
"targetSites" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ],
"currentSite0" : {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
},
"currentSite1" : {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
}
}
},
"secondData" : {
"queries" : [ {
"Tablename" : "abc",
"Query" : "123"
} ]
}
}, {
"config" : {
"activeConfig" : {
"sourceDatabase" : "test",
"targetSites" : [ {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
}, {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
} ],
"currentSite0" : {
"siteName" : "location1",
"targetDatabase" : "devl",
"siteShortName" : "123"
},
"currentSite1" : {
"siteName" : "location2",
"targetDatabase" : "123",
"siteShortName" : "123"
}
}
},
"secondData" : {
"queries" : [ {
"Tablename" : "abc",
"Query" : "123"
} ]
}
} ]
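I tried changing the output path to "@": "[&].config.activeConfig.currentSite&" so that & is used in two places. This behaves like my second example above, where both values end up in both copies, but here one ends up in currentSite0 and the other in currentSite1, at both array indices 0 and 1. That suggests that when & is evaluated in the expression "[&].config.activeConfig.currentSite&", it behaves as if it had the values 0 and 1 at the same time. I am clearly missing some nuance of the behavior.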
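You have to use two shifts. In general, when doing "stuff" with arrays, you need a separate shift operation for each "thing" you are trying to do.
In your case, you want to 1) copy the content into an output array, and 2) copy across the specific target site.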
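One way to picture why a single shift mixes the values is sketched below in Python. It rests on an assumption that I have not checked against the Jolt source: that "@4" places a reference to the same input tree at each output index rather than a deep copy, and that a second write to an existing key turns the value into a list. The function `single_pass_shared` and the trimmed-down `DOC` are hypothetical names used only for illustration.

```python
import copy

# Hypothetical, trimmed-down version of the input document.
DOC = {
    "config": {
        "activeConfig": {
            "targetSites": [{"siteName": "location1"}, {"siteName": "location2"}]
        }
    }
}


def single_pass_shared(document):
    """Emulate one pass that stores the SAME tree at every output index."""
    # Snapshot the site list first so the loop is unaffected by later writes.
    sites = list(document["config"]["activeConfig"]["targetSites"])
    output = []
    for site in sites:
        output.append(document)  # a reference to the tree, not a copy
        active = output[-1]["config"]["activeConfig"]
        if "currentSite" not in active:
            active["currentSite"] = site
        elif isinstance(active["currentSite"], list):
            active["currentSite"].append(site)
        else:
            # Assumed rule: a second write to an existing key makes a list.
            active["currentSite"] = [active["currentSite"], site]
    return output


shared = single_pass_shared(copy.deepcopy(DOC))
print(shared[0] is shared[1])  # both slots hold one and the same object
print(shared[0]["config"]["activeConfig"]["currentSite"])
```

Under that assumption both output slots show every site, which matches the "[&]" outputs in the question; & itself still evaluates to a single index on each iteration.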
Spec:
[
  // Step 1: Make the copies of the input data, based on the number
  // of items in the targetSites array.
  {
    "operation": "shift",
    "spec": {
      "config": {
        "activeConfig": {
          "targetSites": {
            "*": { // targetSites array index
              // go back up 4 levels and grab the whole tree "@4"
              // and write it to the output as a top level array
              // indexed by the "targetSites array index"
              "@4": "[&1]"
            }
          }
        }
      }
    }
  },
  {
    // Step 2: Annoyingly copy everything across, but use the
    // value of the top level array index to copy the "right"
    // data out of the targetSites array.
    "operation": "shift",
    "spec": {
      "*": { // top level array index
        "config": {
          "activeConfig": {
            // sourceDatabase sits under activeConfig in the sample input,
            // so it is matched here and "&3" is the top level array index
            "sourceDatabase": "[&3].config.activeConfig.sourceDatabase", // straight copy across
            "targetSites": {
              "@": "[&4].config.activeConfig.targetSites", // straight copy across
              //
              // Nifty but very rarely used feature.
              // Use "&3" to look up the "current" value of the top level array index
              // and then use that as an index into the targetSites array, and copy
              // that across as "currentSite"
              "&3": "[&4].config.activeConfig.currentSite"
            }
          }
        },
        "secondData": "[&1].secondData" // straight copy across
      }
    }
  }
]
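For comparison, here is a minimal Python sketch of the result the two shifts are aiming for. It is not Jolt and says nothing about how Jolt works internally; `fan_out_by_site` and `sample` are hypothetical names, and the sample is a shortened version of the starting document.

```python
import copy


def fan_out_by_site(document):
    """Return one independent copy of the document per target site."""
    sites = document["config"]["activeConfig"]["targetSites"]
    # Step 1: one deep copy of the whole document per targetSites entry.
    copies = [copy.deepcopy(document) for _ in sites]
    # Step 2: in copy i, expose targetSites[i] as currentSite.
    for index, doc_copy in enumerate(copies):
        active = doc_copy["config"]["activeConfig"]
        active["currentSite"] = copy.deepcopy(active["targetSites"][index])
    return copies


# Shortened stand-in for the starting document in the question.
sample = {
    "config": {
        "activeConfig": {
            "sourceDatabase": "test",
            "targetSites": [
                {"siteName": "location1", "targetDatabase": "devl"},
                {"siteName": "location2", "targetDatabase": "123"},
            ],
        }
    },
    "secondData": {"queries": [{"Tablename": "abc", "Query": "123"}]},
}

result = fan_out_by_site(sample)
for item in result:
    print(item["config"]["activeConfig"]["currentSite"]["siteName"])
```

Each element of `result` keeps the full targetSites list and gains exactly one currentSite, which is the shape of the expected output; the input document itself is left unmodified.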