Translate Chinese to urlencoding in awk

Question

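I have a .txt file in which every line contains Chinese text. I want to translate the Chinese into URL encoding (percent-encoding).

How can I do that?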

txt.file

http://wiki.com/    中文
http://wiki.com/    中国

target.file

http://wiki.com/%E4%B8%AD%E6%96%87
http://wiki.com/%E4%B8%AD%E5%9B%BD

I found a shell pipeline that handles it:

echo '中文' | tr -d '\n' | xxd -plain | sed 's/\(..\)/%\1/g' | tr '[a-z]' '[A-Z]'

So I tried to embed it in awk like this, but it failed:

awk -F'\t' '{
    a=system("echo '"$2"'| tr -d '\n' | xxd -plain | \
    sed 's/\(..\)/%\1/g' | tr '[a-z]' '[A-Z]");

    print a
}' txt.file

I also tried another approach: writing an external shell function and calling it from awk, as in the code below. That failed too.

zh2url()
{
   echo $1 | tr -d '\n' | xxd -plain | sed 's/\(..\)/%\1/g' | tr '[a-z]' '[A-Z]'
}
export -f zh2url
awk -F'\t' "{a=system(\"zh2url $2\");print a}" txt.file

Please implement this with awk, because I actually have other processing that needs to happen in the same awk program.

Answer 1

Using GNU awk for coprocesses, etc.:

$ cat tst.awk
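function xlate(old,     cmd, new, rslt) {
    cmd = "xxd -plain"
    printf "%s", old |& cmd          # send the text to the coprocess
    close(cmd,"to")                  # close the write end so xxd sees EOF
    if ( (cmd |& getline rslt) > 0 ) {
        # prefix every hex byte pair with "%" and uppercase the result
        new = toupper(gensub(/../,"%&","g",rslt))
    }
    close(cmd)
    return new
}
BEGIN { FS="\t" }
{ print $1 xlate($2) }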

$ awk -f tst.awk txt.file
http://wiki.com/%E4%B8%AD%E6%96%87
http://wiki.com/%E4%B8%AD%E5%9B%BD
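
A note on why the attempts in the question fail: awk's system() returns the command's exit status, not its output, so the variable a never holds the encoded text. If GNU awk or xxd is not available, the encoding can also be done inside awk itself. The following is only a minimal sketch under some assumptions: the function names urlencode_field2 and urlencode are made up for illustration, the awk is assumed to be gawk or mawk, and LC_ALL=C is assumed to make that awk treat every byte as one character. It leaves the RFC 3986 unreserved characters (letters, digits, "-", ".", "_", "~") alone and percent-encodes every other byte.

```shell
# Hypothetical helper (not from the original answer): URL-encode the second
# tab-separated field of each line and append it to the first field.
# Assumes gawk or mawk; LC_ALL=C makes them work byte by byte.
urlencode_field2() {
    LC_ALL=C awk -F'\t' '
    BEGIN {
        # Lookup table: single-byte character -> numeric byte value.
        for (i = 1; i < 256; i++) ord[sprintf("%c", i)] = i
    }
    function urlencode(s,    out, i, c) {
        out = ""
        for (i = 1; i <= length(s); i++) {
            c = substr(s, i, 1)
            if (c ~ /^[A-Za-z0-9._~-]$/)
                out = out c                          # unreserved: keep as is
            else
                out = out sprintf("%%%02X", ord[c])  # everything else: %XX
        }
        return out
    }
    { print $1 urlencode($2) }
    ' "$@"
}
```

With the function defined in the current shell, it can be run as urlencode_field2 txt.file, or fed from a pipe. Because no external command is started per line, this may also be faster on large files than the coprocess version, at the cost of depending on byte-oriented behaviour of the awk in use.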

linux

shell

awk

url-encoding