Script to calculate and extract results using bash loops and AWK in a hierarchical directory structure

Question

I have the following directory structure with certain files of interest, on which I have to do calculations/arithmetic operations using awk.

$ mkdir -p masterDir/DP1/postProcessing/0/ masterDir/DP2/postProcessing/0/ masterDir/DP3/postProcessing/0/
$ cd masterDir/
$ touch DP1/postProcessing/0/wallShearStress.dat DP1/postProcessing/0/wallShearStress_0.02.dat DP2/postProcessing/0/wallShearStress_0.dat DP2/postProcessing/0/wallShearStress_0.1.dat DP3/postProcessing/0/wallShearStress.dat DP3/postProcessing/0/wallShearStress_0.05.dat DP3/postProcessing/0/wallShearStress_0.000012.dat
$ cd ..

$ tree masterDir/
masterDir/
├── DP1
│   └── postProcessing
│       └── 0
│           ├── wallShearStress_0.02.dat
│           └── wallShearStress.dat
├── DP2
│   └── postProcessing
│       └── 0
│           ├── wallShearStress_0.1.dat
│           └── wallShearStress_0.dat
└── DP3
    └── postProcessing
        └── 0
            ├── wallShearStress_0.000012.dat
            ├── wallShearStress_0.05.dat
            └── wallShearStress.dat

Expected output

DP     File_processed                Output_value   # optional header
DP1    wallShearStress_0.02.dat      <some result using AWK>
DP2    wallShearStress_0.1.dat       <some result using AWK>
DP3    wallShearStress_0.05.dat      <some result using AWK>

My (very basic) attempt failed; the script just returns the last file found, three times:

$ for i in $(find -type d -name "DP*"); do
>     for j in $(find . -type f -name "wallShearStress*" | tail -n 1); do
>         echo $j;
>         awk 'NR == 3 {print $0}' $j; # this just for example ...
>         # but I wanna do something more here, but no issue with that
>         # once I can get the proper files into AWK.
>     done;
> done;
./DP3/postProcessing/0/wallShearStress_0.05.dat
./DP3/postProcessing/0/wallShearStress_0.05.dat
./DP3/postProcessing/0/wallShearStress_0.05.dat

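Problem definition: I want to,

First, find the files named wallShearStress*.dat in every directory. Of these,
the file of interest is the one ending with the largest number. (To clarify, a directory can hold several wallShearStress*.dat files. For DP3, for example, only DP3\postProcessing\0\wallShearStress_0.05.dat should be processed, because it takes priority over DP3\postProcessing\0\wallShearStress.dat. Likewise, only DP1\postProcessing\0\wallShearStress_0.02.dat and DP2\postProcessing\0\wallShearStress_0.1.dat should be selected.)
Then do arithmetic with awk on the selected wallShearStress*.dat and, for each directory, write the result to a .txt/.csv file in masterDir, in the format of the expected output shown above.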

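Questions

What is wrong with this approach?
Is there a better way? (Keep in mind that the problem is getting the right files, not the AWK part.)

I prefer bash + awk, because it is easier for me to understand than the other programming languages people might come up with. Thanks a lot for your time!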

Answer 1

You can use a for loop for the parent directories only and find for the subdirectories. If your sort has the -V flag, use it.

#!/usr/bin/env bash

for d in masterDir/DP*/; do
  find "$d" -type f -name 'wallShearStress*'| sort -Vk2 -t.| head -n1
done
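
As for what went wrong in the question's attempt: the inner find starts from . instead of "$i", so every pass of the outer loop searches the whole tree, and tail -n 1 keeps the same last match each time. A minimal sketch of that single fix follows. The demo tree in a temporary directory is an assumption made only so the snippet runs on its own, and it holds one file per directory because tail -n 1 on unsorted find output does not choose between several candidates; the sort -V pipeline above is still what does the selection.

```shell
#!/usr/bin/env bash
# Demo tree in a throwaway directory (assumption: stands in for masterDir).
demo=$(mktemp -d)
mkdir -p "$demo"/DP{1,2,3}/postProcessing/0
touch "$demo"/DP1/postProcessing/0/wallShearStress_0.02.dat \
      "$demo"/DP2/postProcessing/0/wallShearStress_0.1.dat \
      "$demo"/DP3/postProcessing/0/wallShearStress_0.05.dat

found=()
for i in "$demo"/DP*/; do
  # Search inside "$i", not ".", so each pass sees only its own directory.
  j=$(find "$i" -type f -name 'wallShearStress*' | tail -n 1)
  found+=("$j")
  echo "$j"
done
```

With the search rooted at "$i", each iteration reports a file from its own DP directory instead of repeating one path.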

To loop over the output, you can use a while read loop.

#!/usr/bin/env bash

while IFS= read -r files; do
  echo Do something with "$files"
done < <(for d in masterDir/DP*/; do find "$d" -type f -name 'wallShearStress*'| sort -Vk2 -t.| head -n1; done )

Another option, as requested by the OP:

#!/usr/bin/env bash

for d in masterDir/DP*/; do
  while IFS= read -r files; do
    echo Do something with "$files"
  done < <(find "$d" -type f -name 'wallShearStress*'| sort -Vk2 -t.| head -n1)
done
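
To get from the selected file to the table in the expected output, the echo placeholder can be swapped for an awk call. The sketch below assumes GNU sort (for -V), reuses the question's NR == 3 {print $0} as a stand-in for the real calculation, and writes to a hypothetical results.txt inside masterDir. The demo files and their contents are made up so the snippet runs on its own.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the relative masterDir/ paths match the answer.
cd "$(mktemp -d)" || exit 1

# Demo data (assumption): three-line files mirroring the question's tree.
for spec in DP1:wallShearStress.dat DP1:wallShearStress_0.02.dat \
            DP2:wallShearStress_0.dat DP2:wallShearStress_0.1.dat \
            DP3:wallShearStress.dat DP3:wallShearStress_0.000012.dat \
            DP3:wallShearStress_0.05.dat; do
  dir=masterDir/${spec%%:*}/postProcessing/0
  mkdir -p "$dir"
  printf 'header\nunits\nvalue_from_%s\n' "${spec#*:}" > "$dir/${spec#*:}"
done

out=masterDir/results.txt   # hypothetical output file name
printf '%-6s %-30s %s\n' DP File_processed Output_value > "$out"

for d in masterDir/DP*/; do
  while IFS= read -r file; do
    dp=$(basename "$d")
    # Placeholder calculation from the question: print line 3 of the file.
    value=$(awk 'NR == 3 {print $0}' "$file")
    printf '%-6s %-30s %s\n' "$dp" "$(basename "$file")" "$value" >> "$out"
  done < <(find "$d" -type f -name 'wallShearStress*' | sort -Vk2 -t. | head -n1)
done

cat "$out"
```

Replacing the awk program with the actual arithmetic is the only change needed for real data; the loop and the output formatting stay the same.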

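-t, --field-separator=SEP use SEP instead of non-blank to blank transition. Here -t. makes sort use . as the field separator, and -k2 starts the sort key at the second field.
<() is process substitution. It behaves like a file (a named pipe or a /dev/fd entry, depending on the system; see the output of ls -l <(:)). To read from it you need the < redirection operator, and it must be separated from <( ) by a space, otherwise you get an error.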
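
One caveat about the sort -Vk2 -t. selection: it compares only the digits after the first dot, as a version number, and head -n1 takes the smallest. That picks the intended file for the sample names, but not for a pair such as _0.2 and _0.25, and a dot in any directory name shifts the key. If the suffix should be compared as a decimal number, one hedged alternative is to let awk extract and compare it. The helper name pick_largest is hypothetical, and treating a file without a numeric suffix as lowest priority is an assumption drawn from the question's DP3 example.

```shell
#!/usr/bin/env bash
# pick_largest DIR: print the wallShearStress*.dat file under DIR whose
# numeric suffix is largest. (Hypothetical helper, not from the original answer.)
pick_largest() {
  find "$1" -type f -name 'wallShearStress*.dat' |
    awk '{
      name = $0
      sub(/.*\//, "", name)                # strip the directory part
      sub(/^wallShearStress_?/, "", name)  # strip the fixed prefix
      sub(/\.dat$/, "", name)              # strip the extension
      # Assumption: a file with no numeric suffix has the lowest priority.
      val = (name == "") ? -1 : name + 0
      if (NR == 1 || val > best) { best = val; file = $0 }
    }
    END { if (NR) print file }'
}

# Usage on a made-up directory where the version sort would pick _0.2.
demo=$(mktemp -d)
mkdir -p "$demo/DPx/postProcessing/0"
touch "$demo"/DPx/postProcessing/0/wallShearStress{,_0.2,_0.25}.dat
picked=$(pick_largest "$demo/DPx")
echo "$picked"
```

In the loops above, pick_largest "$d" could stand in for the find | sort | head pipeline. File names containing newlines are not handled by this sketch.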
