bash or zsh: how to pass multiple inputs to interactive piped parameters?

Question

I have 3 different files that I want to compare:

words_freq words_freq_deduped words_freq_alpha

For each file, I run a command like the one below, and I keep iterating on it to compare the results.

For example, I would do this:

$ cat words_freq | grep -v '[soe]'
$ cat words_freq_deduped | grep -v '[soe]'
$ cat words_freq_alpha | grep -v '[soe]'

Then I look at the results and run it again with additional filters:

$ cat words_freq | grep -v '[soe]' | grep a | grep r | head -n20
a

$ cat words_freq_deduped | grep -v '[soe]' | grep a | grep r | head -n20
b

$ cat words_freq_alpha | grep -v '[soe]' | grep a | grep r | head -n20
c

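This continues until I have finished analyzing my data.

I would like to write a script that takes the piped portion and applies it to each of these files, so that I only have to iterate on the grep/head part of the command.

For example, the following would dump the results of running the 3 commands above, compare the 3 results, and dump some extra computation on them: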

$ myScript | grep -v '[soe]' | grep a | grep r | head -n20
the letters were in all 3 runs, and it took 5 seconds
a
b
c

How can I do the myScript part using bash/python or zsh?

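EDIT: After asking the question, it occurred to me that I could use eval to do this, like so.

The approach below lets me process multiple files using eval, which I know is frowned upon. Any other suggestions are greatly appreciated!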

$ myScript "grep -v '[soe]' | grep a | grep r | head -n20"

myScript

#!/usr/bin/env bash
function doIt(){
  # $1 is the file name, $2 is the quoted pipeline to run on it
  FILE=$1
  CMD="cat $1 | $2"
  echo processing file "$FILE"
  eval "$CMD"
  echo
}

doIt words_freq "$@" 
doIt words_freq_deduped "$@" 
doIt words_freq_alpha "$@"

Answer 1

The approach below lets me process multiple files using eval, which I know is frowned upon. Any other suggestions are greatly appreciated!

$ myScript "grep -v '[soe]' | grep a | grep r | head -n20"

myScript

#!/usr/bin/env bash
function doIt(){
  # $1 is the file name, $2 is the quoted pipeline to run on it
  FILE=$1
  CMD="cat $1 | $2"
  echo processing file "$FILE"
  eval "$CMD"
  echo
}

doIt words_freq "$@" 
doIt words_freq_deduped "$@" 
doIt words_freq_alpha "$@"
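
The question also asks for the three results to be compared and for some timing to be reported. One possible extension of the script above is sketched here. The names FILES and run_all, the wording of the summary line, and the choice of "identical output" as the comparison are all assumptions for illustration. It still relies on eval, so it should only be given pipelines you trust.

```shell
#!/usr/bin/env bash
# Sketch: run one quoted pipeline over several files, then compare the outputs.
# FILES and run_all are illustrative names, not part of the original script.
FILES=(words_freq words_freq_deduped words_freq_alpha)

run_all() {
  local pipeline=$1 file out same=yes
  local -a outputs=()
  SECONDS=0   # bash's built-in timer, counts whole seconds
  for file in "${FILES[@]}"; do
    # the file is fed on stdin, so the first command of the pipeline reads it
    outputs+=("$(eval "$pipeline" < "$file")")
  done
  # check whether every captured output equals the first one
  for out in "${outputs[@]}"; do
    [[ $out == "${outputs[0]}" ]] || same=no
  done
  echo "identical output in all ${#FILES[@]} runs: $same, took ${SECONDS}s"
  printf '%s\n' "${outputs[@]}"
}

# only run when a pipeline was passed on the command line
if [[ $# -gt 0 ]]; then
  run_all "$1"
fi
```

It would be called the same way as before:

$ myScript "grep -v '[soe]' | grep a | grep r | head -n20"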

Answer 2

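You can't stop your shell from running the pipes itself, so using the script the way you first described isn't very practical. You would either have to quote the whole pipeline and then eval it, which makes it hard to pass arguments containing spaces, or quote every pipe so that the script can eval it. Either way, these solutions are rather hacky.

I would suggest doing one of the following two things:

Keep your editor open and put whatever you want to run inside the doIt function itself before each run. Then run the script in your shell without any arguments: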

#!/usr/bin/env bash

doIt() {
  # grep -v '[soe]' < "$1"
  grep -v '[soe]' < "$1" | grep a | grep r | head -n20
}

doIt words_freq
doIt words_freq_deduped
doIt words_freq_alpha

Alternatively, you can always use a for loop in your shell, which you can find again in your history with Ctrl+r when you need it:

$ for f in words_freq*; do grep -v '[soe]' < "$f" | grep a | grep r | head -n20; done
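
A variation on the same loop, sketched under assumptions: the loop is wrapped in a helper function and the filter lives in a second function, so the pipes stay unquoted and no eval is needed. The names each_file and filt are hypothetical. Both functions are meant to be defined in the interactive bash session (or sourced from a file) and filt redefined between runs.

```shell
# Illustrative helper (hypothetical name each_file): run a command or shell
# function once per file, feeding the file on standard input.
each_file() {
  local f
  for f in words_freq words_freq_deduped words_freq_alpha; do
    echo "processing file $f"
    "$@" < "$f"
    echo
  done
}

# Edit only this function between runs; the pipes live here, unquoted.
filt() { grep -v '[soe]' | grep a | grep r | head -n20; }
```

With both defined, each iteration becomes:

$ each_file filt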

But if you really want your approach, I tried to make it accept arguments with spaces, though it ended up even hackier:

#!/usr/bin/env bash

doIt() {
  local FILE=$1
  shift
  echo processing file "$FILE"
  local args=()

  # wrap every argument in double quotes, except a literal '|'
  for n in $(seq 1 $#); do
    arg=$1
    shift
    if [[ $arg == '|' ]]; then
      args+=('|')
    else
      args+=("\"$arg\"")
    fi
  done
  eval "cat '$FILE' | ${args[@]}"
}

doIt words_freq "$@" 
doIt words_freq_deduped "$@" 
doIt words_freq_alpha "$@"

With this version, you can use it like this:

$ ./myScript grep "a a" "|" head -n1

Note that it requires you to quote the |, and that it now handles arguments containing spaces.

Answer 3

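I may not have fully understood the question.

As I understand it, you want to write a script without pipes, by including the filtering logic in the script itself and supplying the filter patterns as arguments.

Here is a gawk script (gawk is the standard awk on Linux).

It scans the 3 input files in a single pass, without pipes.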

script.awk

BEGIN {
  RS="!@!@!@!@!@!@!@";
  # set record separator to something unlikely matched, causing each file to be read entirely as a single record
}
# run the action if the file does not match excludeRegEx
# and matches both includeRegEx1 and includeRegEx2
$0 !~ excludeRegEx &&
$0 ~ includeRegEx1 &&
$0 ~ includeRegEx2 {
  system("head -n20 " FILENAME); # call shell command "head -n20 " on current filename
}

Running script.awk

   awk -v excludeRegEx='[soe]' \
       -v includeRegEx1='a' \
       -v includeRegEx2='r' \
       -f script.awk words_freq words_freq_deduped words_freq_alpha
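
Because the awk script treats each file as a single record, its tests apply to the whole file rather than to individual lines. If line-by-line filtering like the pipeline in the question is what is wanted, the same "patterns as arguments" idea could be sketched in bash as follows. The function name filter_files and the argument order (exclude, include, include) are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: take the filter patterns as arguments and filter line by line.
# filter_files and its argument order are illustrative assumptions.
filter_files() {
  local exclude=$1 include1=$2 include2=$3 f
  for f in words_freq words_freq_deduped words_freq_alpha; do
    echo "processing file $f"
    # -- stops option parsing, so patterns starting with '-' are safe
    grep -v -- "$exclude" < "$f" | grep -- "$include1" | grep -- "$include2" | head -n20
    echo
  done
}

# only run when exactly three patterns were passed on the command line
if [[ $# -eq 3 ]]; then
  filter_files "$@"
fi
```

Saved as myScript, it would be run like this:

$ ./myScript '[soe]' a r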
