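Google DLP user-defined output for sensitive data
I send the request body below to Google DLP, with the input passed as a text value. Is there any way to configure a user-defined RedactConfig to modify the output? Is there any way to achieve that?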
{
  "item": {
    "value": "My name is Alicia Abernathy, and my email address is aabernathy@example.com."
  },
  "deidentifyConfig": {
    "infoTypeTransformations": {
      "transformations": [
        {
          "infoTypes": [
            {
              "name": "EMAIL_ADDRESS"
            }
          ],
          "primitiveTransformation": {
            "replaceWithInfoTypeConfig": {}
          }
        }
      ]
    }
  },
  "inspectConfig": {
    "infoTypes": [
      {
        "name": "EMAIL_ADDRESS"
      }
    ]
  }
}
Can a user-defined RedactConfig be configured to produce the output below?
This is the output I need from Google DLP:
{
  "item": {
    "value": "My name is Alicia Abernathy, and my email address is {{__aabernathy@example.com__[EMAIL_ADDRESS]__}}."
  },
  "overview": {
    "transformedBytes": "22",
    "transformationSummaries": [
      {
        "infoType": {
          "name": "EMAIL_ADDRESS"
        },
        "transformation": {
          "replaceWithInfoTypeConfig": {}
        },
        "results": [
          {
            "count": "1",
            "code": "SUCCESS"
          }
        ],
        "transformedBytes": "22"
      }
    ]
  }
}
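So you don't actually want to anonymize the text, you just want to add information to it? The API isn't a good fit for that... your best bet is to call only inspectContent and do your own transformation based on the byte offsets in the results.
Something along these lines...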
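import com.google.privacy.dlp.v2.Finding;
import com.google.privacy.dlp.v2.InspectContentResponse;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

private static String labelStringWithFindings(
    String stringToLabel,
    InspectContentResponse dlpResponse) {
  ByteArrayOutputStream output = new ByteArrayOutputStream();
  final byte[] messageBytes = stringToLabel.getBytes(StandardCharsets.UTF_8);
  // Process findings in order of their starting byte offset.
  List<Finding> sortedFindings = new ArrayList<>(dlpResponse.getResult().getFindingsList());
  sortedFindings.sort(
      Comparator.comparingLong((Finding f) -> f.getLocation().getByteRange().getStart()));
  int lastEnd = 0;
  for (Finding finding : sortedFindings) {
    int startIndex = (int) finding.getLocation().getByteRange().getStart();
    int endIndex = (int) finding.getLocation().getByteRange().getEnd();
    if (startIndex < lastEnd) {
      continue; // Overlaps a finding that was already labeled.
    }
    // getQuote() is only populated when includeQuote is set in the InspectConfig.
    String quote = finding.getQuote();
    String infoType = finding.getInfoType().getName();
    String surrogate = String.format("{{__%s__[%s]__}}", quote, infoType);
    final byte[] surrogateBytes = surrogate.getBytes(StandardCharsets.UTF_8);
    // Copy the untouched text before the finding, then the labeled finding.
    output.write(messageBytes, lastEnd, startIndex - lastEnd);
    output.write(surrogateBytes, 0, surrogateBytes.length);
    lastEnd = endIndex;
  }
  // Copy whatever follows the last finding.
  if (messageBytes.length > lastEnd) {
    output.write(messageBytes, lastEnd, messageBytes.length - lastEnd);
  }
  return new String(output.toByteArray(), StandardCharsets.UTF_8);
}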
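If you want to try the byte-offset splicing without calling DLP at all, here is a minimal self-contained sketch of the same idea. The `ByteOffsetLabeler` class and its `Span` type are hypothetical names I made up for illustration; `Span` just stands in for the byte range and infoType name you would read off each finding. It takes the labeled text from the original bytes rather than from the quote, so it does not depend on quotes being returned.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ByteOffsetLabeler {

  /** Hypothetical stand-in for a DLP finding: a UTF-8 byte range plus an infoType name. */
  static final class Span {
    final int start; // inclusive byte offset
    final int end;   // exclusive byte offset
    final String infoType;

    Span(int start, int end, String infoType) {
      this.start = start;
      this.end = end;
      this.infoType = infoType;
    }
  }

  /** Wraps each non-overlapping span as {{__text__[INFO_TYPE]__}} and copies the rest unchanged. */
  static String label(String text, List<Span> spans) {
    byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
    List<Span> sorted = new ArrayList<>(spans);
    sorted.sort(Comparator.comparingInt((Span s) -> s.start));

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int lastEnd = 0;
    for (Span s : sorted) {
      if (s.start < lastEnd) {
        continue; // overlaps a span that was already labeled, so skip it
      }
      // Unlabeled text between the previous span and this one.
      out.write(bytes, lastEnd, s.start - lastEnd);
      // The labeled span, rebuilt from the original bytes.
      String original = new String(bytes, s.start, s.end - s.start, StandardCharsets.UTF_8);
      byte[] labeled = String.format("{{__%s__[%s]__}}", original, s.infoType)
          .getBytes(StandardCharsets.UTF_8);
      out.write(labeled, 0, labeled.length);
      lastEnd = s.end;
    }
    // Text after the last labeled span.
    out.write(bytes, lastEnd, bytes.length - lastEnd);
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String text = "My name is Alicia Abernathy, and my email address is aabernathy@example.com.";
    String email = "aabernathy@example.com";
    // The sample is pure ASCII, so the character index equals the byte offset here.
    int start = text.indexOf(email);
    System.out.println(
        label(text, List.of(new Span(start, start + email.length(), "EMAIL_ADDRESS"))));
  }
}
```

Working on bytes rather than on `String` indices matters as soon as the text contains non-ASCII characters, since the offsets are byte positions in the UTF-8 encoding and not character positions.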