Strip blanks at the beginning & trailing blanks in AWK

Question

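I would like help removing the blanks/spaces before and after each field: that is, remove the trailing space from $1, the leading and trailing spaces from $2, and the leading space from $3, using AWK on an AIX 7.2 platform. Below is some data from the file Employee.txt: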

001 |  George John Aden Brown   | gbrown
002 |   Barry Street White      | bwhite
003 |    Kelly Jones            | kjones
004 |   Jolene Davidson Smith   | jsmith

My objective is to produce the following data (with no leading/trailing spaces):

001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith

I have tried the following, but none of them gives a satisfactory result.

awk -F"|" '{ print $1 "|" gsub(" ", "", $2) "|" $3 }' Employee.txt
awk -F"|" '{ print $1 "|" gsub(/[ \t]/,"",$2) "|" $3 }'  Employee.txt
awk -F"|" '{ print $1 "|" gsub(/[[:blank:]]/, "", $2) "|" $3 }' Employee.txt

001 |8| gbrown
002 |11| bwhite
003 |17| kjones
004 |8| jsmith

Answer 1

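With your shown samples, please try the following awk code. It was written and tested in GNU awk and should work in any awk. Explanation: for every line of Input_file, the field separator is set to the regular expression [[:space:]]+\|[[:space:]]+ (whitespace, then a literal pipe, then whitespace) and OFS is set to |. In the main block, $1 is assigned to itself, which makes awk rebuild the line using the new OFS; the trailing 1 then prints the line. The backslash is doubled on the command line because -v assignments go through escape-sequence processing before the value is used as a regex.

awk -v FS='[[:space:]]+\\|[[:space:]]+' -v OFS='|' '{$1=$1} 1'  Input_file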
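
To try the idea without creating a file, the sample rows can be piped in with printf. This is only a sketch of a variant, not the exact command above: it assumes a POSIX shell, the variable name out is made up for illustration, and the separator is written with the bracket expression [|] and [ \t] so that no escaped pipe or POSIX character class is needed.

```shell
# Feed two sample rows to awk; FS matches optional blanks, a literal pipe,
# then optional blanks. Assigning $1 to itself rebuilds $0 with OFS.
out=$(printf '%s\n' '001 |  George John Aden Brown   | gbrown' \
                    '003 |    Kelly Jones            | kjones' |
  awk -v FS='[ \t]*[|][ \t]*' -v OFS='|' '{ $1 = $1 } 1')
printf '%s\n' "$out"
```

Like the command above, this only trims blanks that touch a pipe; blanks at the very start or end of a line would stay.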

Answer 2

This is what I usually do, and I do it a lot:

$ awk '
BEGIN {
    FS=OFS="|"                 # set both separators to pipe
}
{
    for(i=1;i<=NF;i++)         # loop all fields
        gsub(/^ +| +$/,"",$i)  # strip leading and trailing space
}1' file                       # output

Output:

001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith
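
This also accounts for the numbers in the question's output: gsub() edits its target in place and returns the number of substitutions made, so printing its return value prints a count rather than the cleaned text. A small sketch on one sample row (the variable names row, count and fixed are illustrative only):

```shell
row='001 |  George John Aden Brown   | gbrown'

# Printing gsub() itself yields the substitution count, not the text.
count=$(printf '%s\n' "$row" | awk -F'|' '{ print gsub(/ /, "", $2) }')

# Calling gsub() as a statement changes $2 in place; then print the field.
fixed=$(printf '%s\n' "$row" | awk -F'|' '{ gsub(/^ +| +$/, "", $2); print $2 }')

printf '%s\n%s\n' "$count" "$fixed"
```

The first command prints 8, the same value that appears in the first line of the question's output; the second prints the trimmed name.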

If you have other junk in the fields as well, feel free to adjust the regex:

gsub(/^"?[ \t]*(N\/A)?|[ \t]*"?$/,"",$i)  # etc

Answer 3

You already have good answers in awk. However, if you are willing to consider sed, it is simple:

sed -E 's/ *(\|) *|^ +| +$/\1/g' file

001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith

Or, with GNU awk:

awk '{print gensub(/ *(\|) *|^ +| +$/, "\\1", "g")}' file

PS: The sed command above requires GNU or BSD sed (for -E).

Answer 4

Also awk:

awk '{$1=$1;gsub(/ \| /,"|")} 1' file
001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith

The stripping of leading and trailing whitespace also comes into play whenever $0 is recomputed. (See: http://gnu.ist.utl.pt/software/gawk/manual/html_node/Regexp-Field-Splitting.html)
The assignment $1=$1 rebuilds $0. Now we have a new $0 with no leading or trailing whitespace.
Then we apply the gsub() function to $0: the regexp / \| / matches a space followed by a | character followed by a space, and each match is replaced with a | character.
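
One side effect worth knowing: with the default FS, rebuilding $0 also squeezes every internal run of blanks down to a single space, not just the ones at the ends. A quick sketch with a made-up input line (the variable name out is illustrative):

```shell
# Two spaces between "Kelly" and "Jones" in the input, on purpose.
out=$(printf '%s\n' '  003 |  Kelly  Jones   | kjones  ' |
  awk '{ $1 = $1; gsub(/ \| /, "|") } 1')
printf '%s\n' "$out"
```

The result has a single space inside the name, so this approach suits data where the exact inner spacing does not matter.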

Answer 5

If, as you say in your question, you don't want to remove leading or trailing spaces from the lines themselves, then using any sed:

$ sed 's/ *| */|/g' file
001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith

Otherwise, if you do want leading/trailing spaces removed from the lines as well, then using GNU or BSD sed for -E:

$ sed -E 's/^ +| +$//g; s/ *\| */|/g' file
001|George John Aden Brown|gbrown
002|Barry Street White|bwhite
003|Kelly Jones|kjones
004|Jolene Davidson Smith|jsmith
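
If the sed at hand does not accept -E, the same trimming can be written with basic regular expressions only, where | is an ordinary character. A sketch assuming nothing beyond POSIX sed; the input line and the variable name out are made up for illustration:

```shell
# Three simple substitutions: line start, line end, then around each pipe.
out=$(printf '%s\n' '  004 |   Jolene Davidson Smith   | jsmith  ' |
  sed -e 's/^ *//' -e 's/ *$//' -e 's/ *| */|/g')
printf '%s\n' "$out"
```

Splitting the work into separate -e expressions keeps each pattern free of alternation, which basic regular expressions do not portably support.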

