Split string array evenly into sub arrays whilst using a filter with jq
Given that I have the following JSON:
[
"/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
"/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx",
"/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx",
"/home/test-spa/src/components/modals/change-user/tests/index.test.tsx",
"/home/test-spa/src/other-directory/modals/tests/index.test.ts",
"/home/test-spa/src/directory/modals/tests/index.test.ts"
]
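- I want to exclude anything containing the string directory or other-directory
- I then want to split the array into 4 arrays, but I want anything with integration in the string to be split evenly, i.e. I don't want all the integration tests to end up in one array. Any other strings can then be spread across the 4 arrays.
I want to use jq to perform this filter. The following code lets me split the JSON into 4, but it does not do the filtering described above.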
jq -cM '[_nwise(length / 4 | floor)]'
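So I am looking for something like the output below (as long as the integration tests are split as evenly as possible, the other strings can fill in evenly and the order does not matter):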
[
[
"/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx"
],
[
"/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx"
],
[
"/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx"
],
[
"/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
"/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
]
]
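If the number of buckets is predetermined
Here is a generic "round-robin" function, written so that the strings "with" and "without" the substring can be distributed efficiently (i.e. without having to concatenate any arrays):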
# s is a stream, $n a predetermined number of buckets
def roundrobin(s; $n):
reduce s as $s ({n: 0, a: []}; .a[.n % $n] += [$s] | .n+=1) | .a;
# First exclude the unwanted elements:
map(select(test("(other-)?directory")|not))
# Perform the required round-robin:
| roundrobin( (.[] | select(index("integration"))),
(.[] | select(index("integration")|not)); 4)
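To try the filter above from a shell, here is a minimal sketch. It assumes jq 1.5 or later is on the PATH, and it uses a shortened, hypothetical input (short names standing in for the real paths) with 2 buckets instead of 4 so the result is easy to check by eye.

```shell
#!/bin/sh
# Hypothetical, shortened input: three "integration" entries, three others,
# and one entry that the exclusion step should drop.
input='["a/integration/1","b/2","c/integration/3","d/4","e/integration/5","f/6","x/other-directory/7"]'

result=$(printf '%s' "$input" | jq -c '
  # s is a stream, $n a predetermined number of buckets
  def roundrobin(s; $n):
    reduce s as $s ({n: 0, a: []}; .a[.n % $n] += [$s] | .n += 1) | .a;

  # Drop unwanted entries, then deal out the "integration" strings first
  # and the remaining strings afterwards.
  map(select(test("(other-)?directory") | not))
  | roundrobin( (.[] | select(index("integration"))),
                (.[] | select(index("integration") | not)); 2)
')

echo "$result"
# [["a/integration/1","e/integration/5","d/4"],["c/integration/3","b/2","f/6"]]
```

Because the "integration" strings are dealt out before the others, they end up spread across the buckets rather than clustered in one.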
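If the number of buckets is data-driven
If the number of buckets should depend on the number of occurrences of the specified string, then, using the roundrobin filter defined above, a reasonably efficient solution can be written as follows: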
# First exclude the unwanted elements:
map(select(test("(other-)?directory")|not))
# Form an array of the strings with the specified substring
| map(select(index("integration"))) as $has
# Perform the required round-robin:
| roundrobin( $has[], ((.-$has)[]); $has|length)
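A runnable sketch of the data-driven variant, again assuming jq 1.5 or later and a hypothetical shortened input. One addition that is not part of the answer: if no string contains the substring, the bucket count would be 0 and the modulo would fail, so this sketch falls back to a single bucket in that case.

```shell
#!/bin/sh
# Hypothetical, shortened input with three "integration" entries,
# so the data-driven bucket count is 3.
input='["a/integration/1","b/2","c/integration/3","d/4","e/integration/5","f/6","x/directory/7"]'

result=$(printf '%s' "$input" | jq -c '
  def roundrobin(s; $n):
    reduce s as $s ({n: 0, a: []}; .a[.n % $n] += [$s] | .n += 1) | .a;

  map(select(test("(other-)?directory") | not))
  # Array of the strings containing the substring
  | map(select(index("integration"))) as $has
  # Assumption of this sketch: use at least one bucket to avoid "% 0"
  | roundrobin($has[], ((. - $has)[]); ([($has | length), 1] | max))
')

echo "$result"
# [["a/integration/1","b/2"],["c/integration/3","d/4"],["e/integration/5","f/6"]]
```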
Here is what I came up with, to split into N buckets:
def bucket_shift($n):
# loop through all input, shift each elem into bucket
reduce .[] as $elem ( { count: 0, rv: [] };
(.rv[(.count % $n)] += [$elem] | .count += 1))
| .rv ;
# get rid of everything with directory or other-directory
[ .[] | select(test("directory|other-directory") | not) ]
# grab all lines with "integration" in an array
| [ ([ .[] | select(test("integration")) ]),
# grab all lines without "integration" into a second array
([ .[] | select(test("integration") | not) ]) ]
# flatten and divide into buckets (arg passed in)
| flatten | bucket_shift($num_buckets|tonumber)
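I labeled each line in your input so that I could track them more easily, then added a couple of extra lines so that the total would not be evenly divisible by the number of buckets you wanted, to make sure it still balances well. The I and J lines (labeled IX and JX below) should be filtered out.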
<~> $ jq . /tmp/so.json
[
"A/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
"B/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
"C/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
"D/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx",
"E/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
"F/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"G/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx",
"H/home/test-spa/src/components/modals/change-user/tests/index.test.tsx",
"IX/home/test-spa/src/other-directory/modals/tests/index.test.ts",
"JX/home/test-spa/src/directory/modals/tests/index.test.ts",
"K/home/test-spa/src/components/modals/change-user/tests/index.test.tsx",
"L/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx"
]
The script, as above:
<~> $ cat /tmp/so.jq
def bucket_shift($n):
# loop through all input, shift each elem into bucket
reduce .[] as $elem ( { count: 0, rv: [] };
(.rv[(.count % $n)] += [$elem] | .count += 1))
| .rv ;
# get rid of everything with directory or other-directory
[ .[] | select(test("directory|other-directory") | not) ]
# grab all lines with "integration" in an array
| [ ([ .[] | select(test("integration")) ]),
# grab all lines without "integration" into a second array
([ .[] | select(test("integration") | not) ]) ]
# flatten and divide into buckets (arg passed in)
| flatten | bucket_shift($num_buckets|tonumber)
Split into 4 buckets:
<~> $ jq --arg num_buckets 4 -f /tmp/so.jq /tmp/so.json
[
[
"A/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
"L/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"H/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
],
[
"C/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
"B/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
"K/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
],
[
"E/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
"D/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx"
],
[
"F/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"G/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx"
]
]
Split into 3 buckets instead:
<~> $ jq --arg num_buckets 3 -f /tmp/so.jq /tmp/so.json
[
[
"A/home/test-spa/src/components/modals/super-admin/tests/integration/index.test.tsx",
"F/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"D/home/test-spa/src/components/modals/delete-admin/tests/index.test.tsx",
"K/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
],
[
"C/home/test-spa/src/components/modals/edit-admin/tests/integration/index.test.tsx",
"L/home/test-spa/src/components/modals/add-admin/tests/integration/index.test.tsx",
"G/home/test-spa/src/components/modals/edit-user/tests/index.test.tsx"
],
[
"E/home/test-spa/src/components/modals/add-user/tests/integration/index.test.tsx",
"B/home/test-spa/src/components/modals/delete-user/tests/index.test.tsx",
"H/home/test-spa/src/components/modals/change-user/tests/index.test.tsx"
]
]
To have a default number of buckets, you could do this:
bucket_shift($ARGS.named["num_buckets"] // 4|tonumber)
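The default works because `//` binds more tightly than `|`, so the expression reads as "the named argument, or 4 if it is absent, then converted to a number". A small sketch of both cases, assuming jq 1.6 or later (where `$ARGS` is available) and a hypothetical numeric input:

```shell
#!/bin/sh
# Same bucket_shift as in the script above, with a default of 4 buckets.
filter='
  def bucket_shift($n):
    reduce .[] as $elem ({count: 0, rv: []};
      .rv[.count % $n] += [$elem] | .count += 1)
    | .rv;
  bucket_shift($ARGS.named["num_buckets"] // 4 | tonumber)
'

# No --arg given: falls back to 4 buckets.
with_default=$(echo '[1,2,3,4,5,6]' | jq -c "$filter")
# --arg passes the string "2", which tonumber converts to 2.
with_arg=$(echo '[1,2,3,4,5,6]' | jq -c --arg num_buckets 2 "$filter")

echo "$with_default"   # [[1,5],[2,6],[3],[4]]
echo "$with_arg"       # [[1,3,5],[2,4,6]]
```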