How to parse YAML data into a custom Bash data array/hash structure?
I have the following YAML file:
site:
  title: My blog
  domain: example.com
  author1:
    name: bob
    url: /author/bob
  author2:
    name: jane
    url: /author/jane
  header_links:
    about:
      title: About
      url: about.html
    contact:
      title: Contact Us
      url: contactus.html
  js_deps:
    - cashjs
    - jets
products:
  product1:
    name: Prod One
    price: 10
  product2:
    name: Prod Two
    price: 20
I want a Bash, Python or AWK function or script that can take the YAML file above as input, and generate then execute the following code (or something exactly equivalent):
unset site_title
unset site_domain
unset site_author1
unset site_author2
unset site_header_links
unset site_header_links_about
unset site_header_links_contact
unset js_deps
site_title="My blog"
site_domain="example.com"
declare -A site_author1
declare -A site_author2
site_author1=(
[name]="bob"
[url]="/author/bob"
)
site_author2=(
[name]="jane"
[url]="/author/jane"
)
declare -A site_header_links_about
declare -A site_header_links_contact
site_header_links_about=(
[name]="About"
[url]="about.html"
)
site_header_links_contact=(
[name]="Contact Us"
[url]="contact.html"
)
site_header_links=(site_header_links_about site_header_links_contact)
js_deps=(cashjs jets)
unset products
unset product1
unset product2
declare -A product1
declare -A product2
product1=(
[name]="Prod One"
[price]=10
)
product2=(
[name]="Prod Two"
[price]=20
)
products=(product1 product2)
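So, the logic is:
Move through the YAML and create underscore-concatenated variable names with string values, except at the last (bottom) level, where the data should be created as associative arrays or indexed arrays wherever possible. Also, any associative arrays created should be listed by name in an indexed array.
So, in other words:
- wherever the last level of data can be turned into an associative array, it should be (foo.bar.hash => ${foo_bar_hash[@]})
- wherever the last level of data can be turned into an indexed array, it should be (foo.bar.list => ${foo_bar_list[@]})
- every associative array should be listed by name in an indexed array, which is named after its parent in the YAML data (see products in the example)
- else, just make an underscore-concatenated var name and save the value as a string (foo.bar.string => ${foo_bar_string})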
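The naming rules above can be sketched in Python. This is only an illustration under stated assumptions: the helper names `bash_name`, `classify` and `walk` are hypothetical, the YAML is assumed to be already loaded into plain dicts and lists, and the extra rule about listing associative arrays in a parent indexed array is left out.

```python
def bash_name(path):
    """Join the YAML key path with underscores: ('foo', 'bar') -> 'foo_bar'."""
    return "_".join(path)


def classify(value):
    """Return which Bash construct a YAML node would map to under the rules above."""
    if isinstance(value, dict):
        # A mapping whose values are all scalars can become an associative array.
        if all(not isinstance(v, (dict, list)) for v in value.values()):
            return "assoc"
        return "nested"  # go one level deeper and keep joining names
    if isinstance(value, list):
        return "indexed"
    return "string"


def walk(node, path=()):
    """Yield (bash_name, kind) for every node that ends up as a Bash variable."""
    for key, value in node.items():
        here = path + (key,)
        kind = classify(value)
        if kind == "nested":
            yield from walk(value, here)
        else:
            yield bash_name(here), kind


data = {"foo": {"bar": {"hash": {"a": "1"}, "list": ["x", "y"], "string": "s"}}}
result = dict(walk(data))
print(result)
```

For the sample data this maps `foo_bar_hash` to an associative array, `foo_bar_list` to an indexed array and `foo_bar_string` to a plain string.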
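...the reason I need this specific Bash data structure is that I'm using a Bash-based templating system which requires it.
Once I have the function I need, I'll be able to use the YAML data easily in my templates, like so: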
{{site_title}}
...
{{#foreach link in site_header_links}}
<a href="{{link.url}}">{{link.name}}</a>
{{/foreach}}
...
{{#js_deps}}
{{.}}
{{/js_deps}}
...
{{#foreach item in products}}
{{item.name}}
{{item.price}}
{{/foreach}}
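What I tried:
This is entirely related to a previous question I asked:
That was so close, but I also need an associative array generated for site_header_links, and well.. it fails, because site_header_links is nested too deep.
I'd still love to use https://github.com/azohra/yaml.sh in the solution, as it would also provide an easy handlebars-style lookup rip-off for the templating system :)
EDIT:
To be very clear: the solution cannot use pip, virtualenv, or any other external dependency that needs installing separately. It must be a self-contained script/func (just like https://github.com/azohra/yaml.sh is) which can live inside the CMS project directory... or I wouldn't need to be here..
...
Hopefully, a well-commented answer can help me avoid coming back here ;)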
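It is hard to work out the rules of a card game just by watching people play a single round. In a similar way, it is hard to see exactly what the "rules" for your YAML file are.
In the following I have made assumptions about the root-level nodes, as well as about the first-, second- and third-level nodes and the output they generate. It is equally valid to make assumptions about a node based on the level of its parents; that is more flexible (you could then add e.g. a sequence at the root level), but it would be somewhat harder to implement.
Keeping the declares and compound array assignments interspersed with the other code and grouped with "similar" items is a bit cumbersome. For that you would need to keep track of transitions between node types (str, dict, nested dict) and group on that. So per root-level key I dump all the unsets first, then all the declares, then all the assignments, then all the compound assignments. I think that falls under "something exactly equivalent".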
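The grouping described above (unsets, then declares, then assignments, then compound assignments) can be sketched as one small helper. The name `dump_groups` and its parameters are assumptions made for this illustration; it is not part of any library.

```python
import io


def dump_groups(unsets, declares, assignments, compounds, fp):
    """Write the four groups in a fixed order, with a blank line after each non-empty group."""
    groups = [
        [f"unset {name}" for name in unsets],
        [f"declare -A {name}" for name in declares],
        list(assignments),
        [f"{name}=({' '.join(items)})" for name, items in compounds.items()],
    ]
    for lines in groups:
        if lines:
            for line in lines:
                print(line, file=fp)
            print(file=fp)


buf = io.StringIO()
dump_groups(
    unsets=["products", "product1"],
    declares=["product1"],
    assignments=["product1=(", '  [name]="Prod One"', ")"],
    compounds={"products": ["product1"]},
    fp=buf,
)
print(buf.getvalue(), end="")
```

Collecting everything first and writing it at the end keeps the per-node logic free of ordering concerns, at the cost of holding the whole output for one root-level key in memory.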
Since products -> product1/product2 is handled completely differently from site -> author1/author2, which has the same node structure, I made a separate function to handle each root-level key.
To run this you should set up a virtual environment for Python (3.7/3.6) and install the YAML library in it:
$ python -m venv /opt/util/yaml2bash
$ /opt/util/yaml2bash/bin/pip install ruamel.yaml
Then store the following program, e.g. in /opt/util/yaml2bash/bin/yaml2bash, and make it executable (chmod +x /opt/util/yaml2bash/bin/yaml2bash):
#! /opt/util/yaml2bash/bin/python

import sys
from pathlib import Path

import ruamel.yaml

# sys.argv[0] is the program name, so a file argument means len(sys.argv) > 1
if len(sys.argv) > 1:
    input = Path(sys.argv[1])
else:
    input = sys.stdin


def bash_site(k0, v0, fp):
    """this function takes a root-level key and its value (v0 a dict), constructs the
    list of unsets and outputs based on the keys, values and type of values of v0,
    then dumps these to fp
    """
    unsets = []
    declares = []
    assignments = []
    compounds = {}
    for k1, v1 in v0.items():
        if isinstance(v1, str):
            k = k0 + '_' + k1
            unsets.append(k)
            assignments.append(f'{k}="{v1}"')
        elif isinstance(v1, dict):
            first_val = list(v1.values())[0]
            if isinstance(first_val, str):
                k = k0 + '_' + k1
                unsets.append(k)
                declares.append(k)
                assignments.append(f'{k}=(')
                for k2, v2 in v1.items():
                    q = '"' if isinstance(v2, str) else ''
                    assignments.append(f'  [{k2}]={q}{v2}{q}')
                assignments.append(')')
            elif isinstance(first_val, dict):
                for k2, v2 in v1.items():  # assume all the same type
                    k = k0 + '_' + k1 + '_' + k2
                    unsets.append(k)
                    declares.append(k)
                    assignments.append(f'{k}=(')
                    for k3, v3 in v2.items():
                        q = '"' if isinstance(v3, str) else ''
                        # use the inner key k3 (title, url), not the parent key k2
                        assignments.append(f'  [{k3}]={q}{v3}{q}')
                    assignments.append(')')
                    compounds.setdefault(k0 + '_' + k1, []).append(k)
            else:
                raise NotImplementedError("unknown val: " + repr(first_val))
        elif isinstance(v1, list):
            unsets.append(k1)
            compounds[k1] = v1
        else:
            raise NotImplementedError("unknown val: " + repr(v1))

    if unsets:
        for item in unsets:
            print('unset', item, file=fp)
        print(file=fp)
    if declares:
        for item in declares:
            print('declare -A', item, file=fp)
        print(file=fp)
    if assignments:
        for item in assignments:
            print(item, file=fp)
        print(file=fp)
    if compounds:
        for k in compounds:
            v = ' '.join(compounds[k])
            print(f'{k}=({v})', file=fp)
        print(file=fp)


def bash_products(k0, v0, fp):
    """this function takes a root-level key and its value (v0 a dict), constructs the
    list of unsets and outputs based on the keys, values and type of values of v0,
    then dumps these to fp
    """
    unsets = [k0]
    declares = []
    assignments = []
    compounds = {}
    for k1, v1 in v0.items():
        if isinstance(v1, dict):
            first_val = list(v1.values())[0]
            if isinstance(first_val, str):
                unsets.append(k1)
                declares.append(k1)
                assignments.append(f'{k1}=(')
                for k2, v2 in v1.items():
                    q = '"' if isinstance(v2, str) else ''
                    assignments.append(f'  [{k2}]={q}{v2}{q}')
                assignments.append(')')
                compounds.setdefault(k0, []).append(k1)
            else:
                raise NotImplementedError("unknown val: " + repr(first_val))
        else:
            raise NotImplementedError("unknown val: " + repr(v1))

    if unsets:
        for item in unsets:
            print('unset', item, file=fp)
        print(file=fp)
    if declares:
        for item in declares:
            print('declare -A', item, file=fp)
        print(file=fp)
    if assignments:
        for item in assignments:
            print(item, file=fp)
        print(file=fp)
    if compounds:
        for k in compounds:
            v = ' '.join(compounds[k])
            print(f'{k}=({v})', file=fp)
        print(file=fp)


yaml = ruamel.yaml.YAML()
data = yaml.load(input)
output = sys.stdout  # make it easier to redirect to file if necessary at some point in the future
bash_site('site', data['site'], output)
bash_products('products', data['products'], output)
If you run this program and provide your YAML input file as argument (/opt/util/yaml2bash/bin/yaml2bash input.yaml), that gives:
unset site_title
unset site_domain
unset site_author1
unset site_author2
unset site_header_links_about
unset site_header_links_contact
unset js_deps

declare -A site_author1
declare -A site_author2
declare -A site_header_links_about
declare -A site_header_links_contact

site_title="My blog"
site_domain="example.com"
site_author1=(
  [name]="bob"
  [url]="/author/bob"
)
site_author2=(
  [name]="jane"
  [url]="/author/jane"
)
site_header_links_about=(
  [title]="About"
  [url]="about.html"
)
site_header_links_contact=(
  [title]="Contact Us"
  [url]="contactus.html"
)

site_header_links=(site_header_links_about site_header_links_contact)
js_deps=(cashjs jets)

unset products
unset product1
unset product2

declare -A product1
declare -A product2

product1=(
  [name]="Prod One"
  [price]=10
)
product2=(
  [name]="Prod Two"
  [price]=20
)

products=(product1 product2)
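You can use source <(/opt/util/yaml2bash/bin/yaml2bash input.yaml) to get all these values in bash. Process substitution is needed here because source expects a file name, not the generated text itself.
Please note that all the double quotes in your YAML file are superfluous.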
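The program above wraps string values in plain double quotes, so a value containing `"`, `$` or a backtick would be reinterpreted by Bash when the output is sourced. One possible way around that, sketched here with the standard library's `shlex.quote`, is to quote every key and value on output. The helper names `scalar_assignment` and `assoc_assignment` are hypothetical.

```python
import shlex


def scalar_assignment(name, value):
    """Return a Bash assignment whose value is quoted so the shell reads it literally."""
    return f"{name}={shlex.quote(str(value))}"


def assoc_assignment(name, mapping):
    """Return a compound assignment for an associative array with quoted keys and values."""
    items = " ".join(
        f"[{shlex.quote(str(k))}]={shlex.quote(str(v))}" for k, v in mapping.items()
    )
    return f"{name}=({items})"


print(scalar_assignment("site_title", 'My "cool" $blog'))
# site_title='My "cool" $blog'
print(assoc_assignment("product1", {"name": "Prod One", "price": 10}))
# product1=([name]='Prod One' [price]=10)
```

`shlex.quote` leaves simple words unquoted and single-quotes everything else, which is why `name` and `10` appear bare in the second line.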
Using Python and ruamel.yaml (disclaimer: I am the author of that package) gives you a full YAML parser, e.g. allowing you to use comments and flow-style collections:
jsdeps: [cashjs, jets] # more compact
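If you are stuck with the almost end-of-life Python 2.7 and cannot fully control your machine (if you can, you should install/compile Python 3.7 for this), you can still use ruamel.yaml.
1. Decide where your program is going to live, e.g. ~/bin
2. Create ~/bin/ruamel (adjust according to 1.)
3. cd ~/bin/ruamel
4. touch __init__.py
5. Download the latest tar file from PyPI
6. Untar the tar file and rename the resulting directory from ruamel.yaml-X.Y.Z to just yaml
ruamel.yaml should work without dependencies. On 2.7, ruamel.ordereddict and ruamel.yaml.clib provide C versions of the basic routines for speed-up.
The program above would need a bit of rewriting (f-strings -> "".format() and pathlib.Path -> old-fashioned with open(...) as fp:).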
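As a rough idea of what that rewrite involves, here is a minimal sketch of the two substitutions, written so that it runs on both Python 2.7 and Python 3. The helper names `assignment_line` and `open_input` are made up for this example and do not appear in the program above.

```python
from __future__ import print_function

import io
import sys


def assignment_line(key, value):
    # str.format() replaces the f-string f'{key}="{value}"'
    return '{}="{}"'.format(key, value)


def open_input(argv):
    # Plain io.open() replaces pathlib.Path; without an argument, read stdin.
    if len(argv) > 1:
        return io.open(argv[1], encoding="utf-8")
    return sys.stdin


print(assignment_line("site_title", "My blog"))
```

When a file is opened this way, the caller would wrap it in `with open_input(sys.argv) as fp:` and pass `fp` to the YAML loader.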
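I decided to use a combination of the following:
A hacked version of Yay:
- added support for simple lists
- fixed multiple indentation levels
A hacked version of this yaml parser:
- uses the prefix stuff borrowed from Yay, for consistency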
function yaml_to_vars {
   # find input file
   for f in "$1" "$1.yay" "$1.yml"
   do
     [[ -f "$f" ]] && input="$f" && break
   done
   [[ -z "$input" ]] && exit 1

   # use given dataset prefix or imply from file name
   [[ -n "$2" ]] && local prefix="$2" || {
     local prefix=$(basename "$input"); prefix=${prefix%.*}; prefix="${prefix//-/_}_";
   }

   # sed 1: expand flow-style lists [a, b] into "- a" / "- b" lines
   # sed 2: expand flow-style maps {a: 1, b: 2} inside list items
   # sed 3: split every line into indent, key and value, separated by $fs
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|,$s\]$s\$|]|" \
        -e ":1;s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s,$s\(.*\)$s\]|\1\2: [\3]\n\1  - \4|;t1" \
        -e "s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s\]|\1\2:\n\1  - \3|;p" "$input" | \
   sed -ne "s|,$s}$s\$|}|" \
        -e ":1;s|^\($s\)-$s{$s\(.*\)$s,$s\($w\)$s:$s\(.*\)$s}|\1- {\2}\n\1  \3: \4|;t1" \
        -e "s|^\($s\)-$s{$s\(.*\)$s}|\1-\n\1  \2|;p" | \
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)-$s[\"']\(.*\)[\"']$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)-$s\(.*\)$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" | \
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]; idx[i]=0}}
      if(length($2)== 0){ vname[indent]= ++idx[indent] };
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) { vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, vname[indent], $3);
      }
   }'
}

yay_parse() {
   # find input file
   for f in "$1" "$1.yay" "$1.yml"
   do
     [[ -f "$f" ]] && input="$f" && break
   done
   [[ -z "$input" ]] && exit 1

   # use given dataset prefix or imply from file name
   [[ -n "$2" ]] && local prefix="$2" || {
     local prefix=$(basename "$input"); prefix=${prefix%.*}; prefix=${prefix//-/_};
   }

   echo "unset $prefix; declare -g -a $prefix;"

   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   #sed -n -e "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
   #       -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" "$input" |
   sed -ne "s|,$s\]$s\$|]|" \
        -e ":1;s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s,$s\(.*\)$s\]|\1\2: [\3]\n\1  - \4|;t1" \
        -e "s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s\]|\1\2:\n\1  - \3|;p" "$input" | \
   sed -ne "s|,$s}$s\$|}|" \
        -e ":1;s|^\($s\)-$s{$s\(.*\)$s,$s\($w\)$s:$s\(.*\)$s}|\1- {\2}\n\1  \3: \4|;t1" \
        -e "s|^\($s\)-$s{$s\(.*\)$s}|\1-\n\1  \2|;p" | \
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)-$s[\"']\(.*\)[\"']$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)-$s\(.*\)$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" | \
   awk -F$fs '{
      indent = length($1)/2;
      key = $2;
      value = $3;

      # No prefix or parent for the top level (indent zero)
      root_prefix = "'$prefix'_";
      if (indent == 0) {
        prefix = ""; parent_key = "'$prefix'";
      } else {
        prefix = root_prefix; parent_key = keys[indent-1];
      }

      keys[indent] = key;

      # remove keys left behind if prior row was indented more than this row
      for (i in keys) {if (i > indent) {delete keys[i]}}

      # if we have a value
      if (length(value) > 0) {
        # set values here

        # if the "key" is missing, make array indexed, not assoc..
        if (length(key) == 0) {
          # array item has no key, only a value..
          # so, if we didnt already unset the assoc array
          if (unsetArray == 0) {
            # unset the assoc array here
            printf("unset %s%s; ", prefix, parent_key);
            # switch the flag, so we only unset once, before adding values
            unsetArray = 1;
          }
          # array was unset, has no key, so add item using indexed array syntax
          printf("%s%s+=(\"%s\");\n", prefix, parent_key, value);
        } else {
          # array item has key and value, add item using assoc array syntax
          printf("%s%s[%s]=\"%s\";\n", prefix, parent_key, key, value);
        }
      } else {
        # declare arrays here

        # reset this flag for each new array we work on...
        unsetArray = 0;

        # if item has no key, declare indexed array
        if (length(key) == 0) {
          # indexed
          printf("unset %s%s; declare -g -a %s%s;\n", root_prefix, key, root_prefix, key);
        # if item has numeric key, declare indexed array
        } else if (key ~ /^[[:digit:]]/) {
          printf("unset %s%s; declare -g -a %s%s;\n", root_prefix, key, root_prefix, key);
        # else (item has a string for a key), declare associative array
        } else {
          printf("unset %s%s; declare -g -A %s%s;\n", root_prefix, key, root_prefix, key);
        }

        # set root level values here
        if (indent > 0) {
          # add to associative array
          printf("%s%s[%s]+=\"%s%s\";\n", prefix, parent_key , key, root_prefix, key);
        } else {
          # add to indexed array
          printf("%s%s+=( \"%s%s\");\n", prefix, parent_key , root_prefix, key);
        }
      }
   }'
}

# helper to load yay data file
yay() {
  # yaml_to_vars "$@" ## uncomment to debug (prints data to stdout)
  eval $(yaml_to_vars "$@")
  # yay_parse "$@" ## uncomment to debug (prints data to stdout)
  eval $(yay_parse "$@")
}
With the code above, when products.yml contains:
product1:
  name: Foo
  price: 100
product2:
  name: Bar
  price: 200
the parser can be called like so:
source path/to/yml-parser.sh
yay products.yml
which generates and evaluates this code:
products_product1_name="Foo"
products_product1_price="100"
products_product2_name="Bar"
products_product2_price="200"
unset products;
declare -g -a products;
unset products_product1;
declare -g -A products_product1;
products+=( "products_product1");
products_product1[name]="Foo";
products_product1[price]="100";
unset products_product2;
declare -g -A products_product2;
products+=( "products_product2");
products_product2[name]="Bar";
products_product2[price]="200";
So, I get the following Bash arrays and vars:
declare -a products=([0]="products_product1" [1]="products_product2")
declare -A products_product1=([price]="100" [name]="Foo" )
declare -A products_product2=([price]="200" [name]="Bar" )
In my templating system, I can now access this yml data like so:
{{#foreach product in products}}
Name: {{product.name}}
Price: {{product.price}}
{{/foreach}}
:)
Another example:
File site.yml
meta_info:
  title: My cool blog
  domain: foo.github.io
author1:
  name: bob
  url: /author/bob
author2:
  name: jane
  url: /author/jane
header_links:
  link1:
    title: About
    url: about.html
  link2:
    title: Contact Us
    url: contactus.html
js_deps:
  cashjs: cashjs
  jets: jets
Foo:
  - one
  - two
  - three
produces:
declare -a site=([0]="site_meta_info" [1]="site_author1" [2]="site_author2" [3]="site_header_links" [4]="site_js_deps" [5]="site_Foo")
declare -A site_meta_info=([title]="My cool blog" [domain]="foo.github.io" )
declare -A site_author1=([url]="/author/bob" [name]="bob" )
declare -A site_author2=([url]="/author/jane" [name]="jane" )
declare -A site_header_links=([link1]="site_link1" [link2]="site_link2" )
declare -A site_link1=([url]="about.html" [title]="About" )
declare -A site_link2=([url]="contactus.html" [title]="Contact Us" )
declare -A site_js_deps=([cashjs]="cashjs" [jets]="jets" )
declare -a site_Foo=([0]="one" [1]="two" [2]="three")
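To make the layout above easier to follow, here is a small Python model of it. This is a sketch under stated assumptions, not the parser itself: the function name `flatten` is hypothetical, the YAML is assumed to be already loaded into dicts and lists, a Python list stands in for an indexed array and a dict for an associative array, and lists are assumed to hold only scalars.

```python
def flatten(prefix, data):
    """Model the arrays as Python objects: list = indexed array, dict = associative array."""
    arrays = {prefix: []}

    def visit(key, value):
        # Every array is named prefix_key, regardless of how deep the key is nested.
        name = f"{prefix}_{key}"
        if isinstance(value, list):
            arrays[name] = list(value)
        elif isinstance(value, dict):
            arrays[name] = {}
            for child_key, child in value.items():
                if isinstance(child, (dict, list)):
                    # Nested containers are stored by name, as prefix_childkey.
                    arrays[name][child_key] = visit(child_key, child)
                else:
                    arrays[name][child_key] = str(child)
        return name

    # The root indexed array lists the name of every top-level entry.
    for key, value in data.items():
        arrays[prefix].append(visit(key, value))
    return arrays


arrays = flatten(
    "site",
    {
        "meta_info": {"title": "My cool blog"},
        "header_links": {"link1": {"title": "About", "url": "about.html"}},
        "Foo": ["one", "two", "three"],
    },
)
for name, content in arrays.items():
    print(name, content)
```

The model shows why `site_header_links` holds names such as `site_link1` rather than values: nested mappings become their own arrays, and the parent only records where to find them.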
In my templates, I can access site_header_links like so:
{{#foreach link in site_header_links}}
* {{link.title}} - {{link.url}}
{{/foreach}}
and site_Foo (dash-notation, or simple list) like so:
{{#site_Foo}}
* {{.}}
{{/site_Foo}}