Convert a JavaScript RegEx into JSON format
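I'm currently developing a Safari extension that uses the new webkit-content-blocker feature available in Safari 9. The rules for such blockers must be written in JSON.
The background script of my extension-to-be generates these JSON rules. My problem is that I can't correctly format the regular expression that filters URLs so that it is compatible with JSON.
Say I need to block all images whose URL contains "banana", "orange" or "apple". My regular expression is: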
var urlFilter = /banana|orange|apple/g;
Here is the blocker rule in JSON, with the url-filter part missing:
"action": {
    "type": "block"
},
"trigger": {
    "url-filter": <JSON regex here>,
    "resource-type": ["image"],
    "load-type": ["third-party"]
}
[Update]
How can I rewrite my regular expression to be JSON compatible/ready, knowing that alternation is not supported?
The Regular expression format
Triggers support filtering the URLs of each resource based on regular expression.
The following features are supported:
- Matching any character with “.”.
- Matching ranges with the range syntax [a-b].
- Quantifying expressions with “?”, “+” and “*”.
- Groups with parenthesis.
It is possible to use the beginning of line (“^”) and end of line (“$”) marker but they are restricted to be the first and last character of the expression. For example, a pattern like “^bar$” is perfectly valid, while “(foo)?^bar$” causes a syntax error.
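[Update BIS]
Given the strict CSP policy enforced by Safari and the lack of support for alternation, I ended up converting my original regular expression into an array and then generating the JSON rules dynamically in a loop.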
// Split the alternation into one filter per term.
var regex = 'banana|orange|apple',
    filters = regex.split('|'),
    json_rules = [];

var Blocker = {
    // Build one rule per filter, then hand the serialized list to Safari.
    build: function() {
        filters.forEach(function(filter) {
            var rule = {
                action: {
                    'type': 'block'
                },
                trigger: {
                    'url-filter': filter,
                    'resource-type': ['image'],
                    'load-type': ['third-party']
                }
            };
            json_rules.push(rule);
        });
        Blocker.set(JSON.stringify(json_rules));
    },
    init: function() {
        Blocker.build();
    },
    // rule is the JSON string containing the whole rule list.
    set: function(rule) {
        safari.extension.setContentBlocker(rule);
    }
};
According to the documentation you linked, the filter values are treated as regular expressions (for instance, they show "url-filter": "evil-tracker\.js" and "url-filter": ".*").
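The docs also say that url-filter is case-insensitive, so you don't have to worry about the i flag you might otherwise want. If you want a case-sensitive filter, add "url-filter-is-case-sensitive": true.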
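As a minimal sketch of where that flag would go, assuming the same rule shape as in the question (the filter value "Banana" is hypothetical and only there to make case matter):

```javascript
// Same rule shape as in the question; "Banana" is a hypothetical filter value.
var rule = {
  action: { type: 'block' },
  trigger: {
    'url-filter': 'Banana',
    // Opt in to case-sensitive matching for this rule's url-filter.
    'url-filter-is-case-sensitive': true,
    'resource-type': ['image'],
    'load-type': ['third-party']
  }
};

// Rules are serialized as a JSON array, even when there is only one.
var ruleJson = JSON.stringify([rule]);
```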
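So in that case, you just put the regular expression in quotes, making sure to escape any characters that need escaping in a string literal (for instance, note that the backslash itself has to be escaped in the JSON string, "evil-tracker\\.js", so that the regular expression is evil-tracker\.js).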
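If the rules are generated from script, the escaping does not have to be done by hand. The sketch below is only an illustration (the variable names are made up for it): it reads the pattern text from a RegExp via its source property and lets JSON.stringify add the extra backslash.

```javascript
// Illustrative pattern; only the escaping behavior matters here.
var pattern = /evil-tracker\.js/;

// RegExp.prototype.source is the pattern text without delimiters or flags:
// evil-tracker\.js (a single backslash).
var source = pattern.source;

// JSON.stringify escapes the backslash, so the JSON text contains
// "evil-tracker\\.js" (two backslashes).
var json = JSON.stringify({ 'url-filter': source });

// Parsing the JSON text gives back the original pattern text.
var roundTripped = JSON.parse(json)['url-filter'];
```

Building the object first and serializing it at the end avoids having to reason about escaping inside hand-written JSON strings.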
However, the problem with your expression is that alternation is not supported. Again, from the documentation you linked:
The format is a strict subset of JavaScript regular expressions. Syntactically, everything supported by JavaScript is reserved but only a subset will be accepted by the parser. An unsupported expression results in a parse error.
The following features are supported:
- Matching any character with “.”.
- Matching ranges with the range syntax [a-b].
- Quantifying expressions with “?”, “+” and “*”.
- Groups with parenthesis.
It is possible to use the beginning of line (“^”) and end of line (“$”) marker but they are restricted to be the first and last character of the expression. For example, a pattern like “^bar$” is perfectly valid, while “(foo)?^bar$” causes a syntax error.
Note that they do not accept | (alternation).
That tells me you need three rules: one for banana, one for orange, and one for apple.
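One way to get from the original regular expression to those three rules is sketched below. It is a rough sketch, not a general solution: the split is naive and assumes that | appears only as top-level alternation (no groups, character classes, or escaped pipes), and buildRules is a hypothetical helper name.

```javascript
// The pattern from the question. The g flag has no counterpart in a rule.
var urlFilter = /banana|orange|apple/g;

// Naive split on "|": only safe when every "|" is top-level alternation.
var filters = urlFilter.source.split('|');

// Hypothetical helper: one block rule per filter, same shape as in the question.
function buildRules(filterList) {
  return filterList.map(function (filter) {
    return {
      action: { type: 'block' },
      trigger: {
        'url-filter': filter,
        'resource-type': ['image'],
        'load-type': ['third-party']
      }
    };
  });
}

// JSON text holding an array of three rules.
var rulesJson = JSON.stringify(buildRules(filters), null, 2);
```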