Custom info type and hotword rule

I'm trying to use a customInfoType together with a hotwordRule. The configuration looks like this (taken from a Node.js implementation):

Custom info type:

const customConfig = [{
    infoType: {
      name: 'CZECH_ID'
    },
    regex: {
      pattern: /[0-9]{2,6}-?[0-9]{2,10}\/[0-9]{4}/
    },
    likelihood: 'POSSIBLE'
  }];

Custom rule set:

const customRuleSet = [{
    infoTypes: [{ name: 'CZECH_ID' }],
    rules: [
      {
        hotwordRule: {
          hotwordRegex: {
            pattern: /^CZID$/
          }
        },
        proximity: {
          windowBefore: 10,
          windowAfter: 0
        }
      }
    ]
  }]

Here is the inspectConfig:

const request = {
    parent: `projects/${projectId}/locations/global`,
    inspectConfig: {
      infoTypes: infoTypes,
      customInfoTypes: customConfig,
      ruleSet: customRuleSet,
      minLikelihood: 'POSSIBLE',
      limits: {
        maxFindingsPerRequest: maxFindings,
      },
      includeQuote: true,
    },
    item: item,
  };

When I run it, I get:

Error: 3 INVALID_ARGUMENT: `window_before` and `window_after` cannot both be 0.

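When I remove customRuleSet from the configuration, the request goes through, but the string is not recognized. So the problem must be related to the proximity part, but I'm not sure what is wrong.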

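Your JSON doesn't look right: you didn't include proximity inside the hotword rule. In your rule set, proximity is a sibling of hotwordRule instead of a field within it, so the hotword rule itself has no window set, which is what the `window_before` and `window_after` error is complaining about.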

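import google.cloud.dlp_v2

# proximity belongs inside hotword_rule, next to hotword_regex.
hotword_rule = {
    # The pattern is a plain regex string, not a JS-style /.../ literal.
    "hotword_regex": {"pattern": "^CZID$"},
    "likelihood_adjustment": {
        "fixed_likelihood": google.cloud.dlp_v2.Likelihood.VERY_LIKELY
    },
    # Look for the hotword in the 10 characters before the finding.
    "proximity": {"window_before": 10},
}

rule_set = [
    {"info_types": [{"name": "CZECH_ID"}], "rules": [{"hotword_rule": hotword_rule}]}
]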

Here is a Python example:

https://cloud.google.com/dlp/docs/creating-custom-infotypes-likelihood#dlp_inspect_hotword_rule-python
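
Applied to the Node.js configuration from the question, the same fix might look like the sketch below. It assumes the Node.js client takes the same fields in camelCase and accepts enum values as strings (as the question's minLikelihood: 'POSSIBLE' already does). The regex patterns are written as plain strings rather than RegExp literals, on the assumption that the API's pattern field is a string, and the ^ and $ anchors on the hotword are dropped on the assumption that the hotword appears inside surrounding text. buildInspectConfig is a hypothetical helper used only for illustration; it is not part of the DLP library.

```javascript
// Hypothetical helper: builds the inspectConfig with proximity moved
// inside hotwordRule. Field names mirror the question's configuration.
function buildInspectConfig() {
  const customInfoTypes = [{
    infoType: { name: 'CZECH_ID' },
    // Assumption: pattern is passed as a string, not a RegExp literal.
    regex: { pattern: '[0-9]{2,6}-?[0-9]{2,10}/[0-9]{4}' },
    likelihood: 'POSSIBLE',
  }];

  const hotwordRule = {
    // Assumption: anchors removed so the hotword can sit inside other text.
    hotwordRegex: { pattern: 'CZID' },
    // Raise the likelihood of a finding when the hotword is nearby.
    likelihoodAdjustment: { fixedLikelihood: 'VERY_LIKELY' },
    // proximity is a field of hotwordRule, not a sibling of it.
    proximity: { windowBefore: 10 },
  };

  const ruleSet = [{
    infoTypes: [{ name: 'CZECH_ID' }],
    rules: [{ hotwordRule: hotwordRule }],
  }];

  return {
    customInfoTypes: customInfoTypes,
    ruleSet: ruleSet,
    minLikelihood: 'POSSIBLE',
    includeQuote: true,
  };
}

const inspectConfig = buildInspectConfig();
console.log(JSON.stringify(inspectConfig.ruleSet, null, 2));
```

The returned object can be used as the inspectConfig of the request in the question, alongside the existing parent and item fields.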