Decoding a string that has been encoded multiple times

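I have written Java code to decode a string that was URL-encoded with "UTF-8". The string was encoded three times. I use this code in an ETL, so I could chain the same ETL step three times in a row, but that would be somewhat inefficient. I searched the internet but found nothing promising. Is there a way in Java to decode a string that has been encoded multiple times?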

Here is my input string "uri":

file:///C:/Users/nikhil.karkare/dev/pentaho/data/ba-repo-content-original/public/Development+Activity/Defects+Unresolved+%252528by+Non-Developer%252529.xanalyzer

Here is my code to decode this string:

import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.io.*;

String decodedValue;
public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException {
// First, get a row from the default input hop
//
Object[] r = getRow();
// If the row object is null, we are done processing.
//
if (r == null) {
    setOutputDone();
    return false;
}

// It is always safest to call createOutputRow() to ensure that your output row's Object[] is large
// enough to handle any new fields you are creating in this step.
//
Object[] outputRow = createOutputRow(r, data.outputRowMeta.size());

String newFileName = get(Fields.In, "uri").getString(r);

try {
    decodedValue = URLDecoder.decode(newFileName, "UTF-8");
} catch (UnsupportedEncodingException e) {
    throw new AssertionError("UTF-8 is unknown");
}
// Set the value in the output field
//
get(Fields.Out, "decodedValue").setValue(outputRow, decodedValue);

// putRow will send the row on to the default output hop.
//
putRow(data.outputRowMeta, outputRow);

return true;
}

The output of this code is:

file:///C:/Users/nikhil.karkare/dev/pentaho/data/ba-repo-content-original/public/Development Activity/Defects Unresolved %2528by Non-Developer%2529.xanalyzer

When I run this code three times in the ETL, I get the output I want, which is:

file:///C:/Users/nikhil.karkare/dev/pentaho/data/ba-repo-content-original/public/Development Activity/Defects Unresolved (by Non-Developer).xanalyzer

A simple for loop will do the job:

String newFileName = get(Fields.In, "uri").getString(r);
decodedValue = newFileName;
// The string was encoded three times, so decode exactly three times.
for (int i = 0; i < 3; i++) {
    try {
        decodedValue = URLDecoder.decode(decodedValue, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        throw new AssertionError("UTF-8 is unknown");
    }
}
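To try the loop outside the ETL step, something like the following standalone sketch may help. The class name `RepeatedDecode` and the helpers `decodeOnce`, `decodeTimes`, and `decodeUntilStable` are made up for illustration and are not part of Kettle or the JDK. The second variant keeps decoding until the string stops changing, which could be useful when the number of encoding passes is not known in advance. Treat it with care: it will also decode any `+` or `%XX` sequence that was meant literally, and `URLDecoder.decode` throws `IllegalArgumentException` on a malformed `%` sequence.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class RepeatedDecode {
    // Hypothetical helper: one URL-decoding pass, with the checked exception wrapped.
    static String decodeOnce(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is unknown");
        }
    }

    // Hypothetical helper: decode a fixed number of times.
    static String decodeTimes(String s, int times) {
        String result = s;
        for (int i = 0; i < times; i++) {
            result = decodeOnce(result);
        }
        return result;
    }

    // Hypothetical helper: decode until a pass changes nothing, or maxPasses is reached.
    static String decodeUntilStable(String s, int maxPasses) {
        String current = s;
        for (int i = 0; i < maxPasses; i++) {
            String next = decodeOnce(current);
            if (next.equals(current)) {
                break;
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        String uri = "Defects+Unresolved+%252528by+Non-Developer%252529.xanalyzer";
        System.out.println(decodeTimes(uri, 3));
        System.out.println(decodeUntilStable(uri, 10));
    }
}
```

If the data is known to be encoded exactly three times, the fixed count is the safer choice, since it cannot over-decode.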

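URL encoding replaces %, ( and ) with %25, %28 and %29 respectively, so every additional round of encoding turns each % into %25.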

String s = "file:///C:/Users/nikhil.karkare/dev/pentaho/data/"
    + "ba-repo-content-original/public/Development+Activity/"
    + "Defects+Unresolved+%252528by+Non-Developer%252529.xanalyzer";

// %252528 ... %252529
s = URLDecoder.decode(s, "UTF-8");
// %2528 ... %2529
s = URLDecoder.decode(s, "UTF-8");
// %28 .. %29
s = URLDecoder.decode(s, "UTF-8");
// ( ... )
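Running the process in reverse shows where the layers come from. The sketch below encodes a parenthesis repeatedly with `URLEncoder.encode`; the class name `EncodeLayers` and the helper `encodeTimes` are invented for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class EncodeLayers {
    // Hypothetical helper: apply URLEncoder.encode the given number of times.
    static String encodeTimes(String s, int times) {
        String result = s;
        try {
            for (int i = 0; i < times; i++) {
                result = URLEncoder.encode(result, "UTF-8");
            }
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is unknown");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encodeTimes("(", 1)); // %28
        System.out.println(encodeTimes("(", 2)); // %2528
        System.out.println(encodeTimes("(", 3)); // %252528
    }
}
```

Each pass leaves the digits alone and only re-encodes the leading `%` as `%25`, which is why three decoding passes are needed to get back to `(`.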