Parsing AutoSys JIL with Perl

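I have a task to parse AutoSys JIL files. Below is a JIL job definition: a configuration file that the AutoSys scheduler reads in and runs. Imagine a file formatted like this, with thousands of job definitions like the one below stacked on top of each other in exactly the same format. Each one begins with the header and ends with the timezone.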

/* ----------------- COME_AND_PLAY_WITH_US_DANNY ----------------- */

insert_job: COME_AND_PLAY_WITH_US_DANNY   job_type: CMD
command: /bin/bash -ls
machine: capser.com
owner: twins
permission: foo,foo
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "04:00"
description: "Forever, and ever and ever"
std_in_file: "/home/room217"
std_out_file: "${CASPERSYSLOG}/room217.out"
std_err_file: "${CASPERSYSLOG}/room217.err
alarm_if_fail: 1
profile: "/autosys_profile"
timezone: US/Eastern

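Here is the script. I need to extract the job, machine, and command from the job definition above. It works fine, but eventually I want to store the information in some kind of container and send it, whereas this script writes the results line by line to the terminal. For now I am redirecting the results to a temporary file.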

#!/foo/bar/perl5/core/5.10/exec/bin/perl
use strict;
use warnings;
use File::Basename ;

my($job, $machine, $command)  ;
my $filename = '/tmp/autosys.jil_output.padc';
open(my $fh, '<:encoding(UTF-8)', $filename)
  or die "Could not open file '$filename' $!";
my $count = 0;
while (my $line = <$fh>) {
    #chomp $line;
    if($line =~ /\/\* -{17} \w+ -{17} \*\//) {
    $count = 1; }
    elsif($line =~  /(alarm_if_fail:)/) {
    $count = 0 ; }
    elsif ($count) {
             if ($line =~ m/insert_job: (\w+).*job_type: CMD/) {
             $job = $1;
             }
             elsif($line =~ m/command:(.*)/) {
             $command = $1;
             }
             elsif($line =~ m/machine:(.*)/) {
             $machine = $1;

             print "$job\t $machine\t $command \n ";      
             }
        }


    #sleep 1 ;
   }

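My problem is that when I put the print "$job, $machine, $command" statement inside the last elsif, it works fine. However, when I put it outside the last elsif, as in the example below, the output repeats over and over: each line appears four or five times. I don't understand this. Why do I have to put the print statement inside the last elsif for the script to print each line correctly, once?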

elsif ( $line =~ m/machine:(.*)/ ) {
    $machine = $1;
}

print "$job\t $machine\t $command \n ";


The code above, reformatted for readability:

#!/foo/bar/perl5/core/5.10/exec/bin/perl

use strict;
use warnings;

use File::Basename;

my ( $job, $machine, $command );
my $filename = '/tmp/autosys.jil_output.padc';

open( my $fh, '<:encoding(UTF-8)', $filename )
        or die "Could not open file '$filename' $!";

my $count = 0;

while ( my $line = <$fh> ) {

    #chomp $line;
    if ( $line =~ /\/\* -{17} \w+ -{17} \*\// ) {
        $count = 1;
    }
    elsif ( $line =~ /(alarm_if_fail:)/ ) {
        $count = 0;
    }
    elsif ( $count ) {

        if ( $line =~ m/insert_job: (\w+).*job_type: CMD/ ) {
            $job = $1;
        }
        elsif ( $line =~ m/command:(.*)/ ) {
            $command = $1;
        }
        elsif ( $line =~ m/machine:(.*)/ ) {
            $machine = $1;
            print "$job\t $machine\t $command \n ";
        }
    }

    # sleep 1;
}

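As I said in my comment, please format your code sensibly. If you don't, people will either ignore your question or get grumpy, as I did.

  • I am assuming that the unidentified block of text is just a sample of your input

  • I am also assuming that, although your code handles the sample data fine, some blocks in the real data don't work

  • Most importantly, I am assuming that any field value containing spaces needs to be enclosed in quotes, which makes your example command: /bin/bash -ls incorrect and syntactically invalid

Please also make sure that you have given a proper, runnable example of the code and data that show the problem. If I run the code you show against the sample data, everything works, so what is your problem?

As far as I can tell, you want to display the fields of every block whose job_type field is CMD. Is that right?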

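Here is my best guess: every time a line is read from the data file, you are simply printing all the fields collected so far. With the print outside the last elsif, it runs for every line of the block rather than only for the machine line, so the same values are printed again and again.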
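
To make that guess concrete, here is a minimal line-by-line sketch that emits each job exactly once, when the next insert_job line (or the end of the input) is reached, instead of printing on every line. The helper name extract_jobs and the sample job names JOB_A and JOB_B are made up for illustration, and it assumes that insert_job is the first field of every job definition.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one "job<TAB>machine<TAB>command" string per CMD job.
# Assumes insert_job is the first field of every job definition.
sub extract_jobs {
    my @lines = @_;
    my ( %rec, @out );

    # Emit the record collected so far (if any), then start a fresh one.
    my $flush = sub {
        if ( defined $rec{job} ) {
            push @out, join "\t",
                map { defined($_) ? $_ : '' } @rec{qw/job machine command/};
        }
        %rec = ();
    };

    for my $line (@lines) {
        if ( $line =~ /^insert_job:\s*(\w+)/ ) {
            my $name = $1;    # save the capture before running another match
            $flush->();       # a new job starts: output the previous one once
            $rec{job} = $name if $line =~ /job_type:\s*CMD\b/;
        }
        elsif ( defined $rec{job}
            and $line =~ /^(command|machine):\s*(.*?)\s*$/ )
        {
            $rec{$1} = $2;
        }
    }
    $flush->();               # the last job has no following insert_job line
    return @out;
}

my @sample = (
    "insert_job: JOB_A   job_type: CMD\n",
    "command: /bin/bash -ls\n",
    "machine: capser.com\n",
    "owner: twins\n",
    "insert_job: JOB_B   job_type: CMD\n",
    "machine: capser.com\n",
    "command: /bin/true\n",
);

print "$_\n" for extract_jobs(@sample);
```

Because the record is only pushed at a block boundary, the order of the machine and command lines inside a block no longer matters, and nothing is printed for the lines in between.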

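My solution would be to convert each data block into a hash.

Using comments to delimit the blocks is dangerous, and you gave no information about the order of the fields, so I have had to assume that the insert_job field comes first. That makes sense if the file is to be used as a list of commands, but the additional job_type field on the same line is odd. Is this a real sample of your data, or is there something else wrong with your example?

Here is a working solution to the problem as I imagine it.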

#!/foo/bar/perl5/core/5.10/exec/bin/perl

use strict;
use warnings 'all';

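# Slurp the whole DATA section into one string
my $data = do {
    local $/;
    <DATA>;
};

# Split before each insert_job and drop chunks that contain no fields
my @data = grep /:/, split /^(?=insert_job)/m, $data;

for ( @data ) {

    # Build a hash of field => value. A value is either a quoted string
    # or a run of non-space characters
    my %data = /(\w+) \s* : \s* (?| " ( [^"]+ ) " | (\S+) )/gx;

    next unless $data{job_type} eq 'CMD';

    print "@data{qw/ insert_job machine command /}\n";
}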


__DATA__
/* ----------------- COME_AND_PLAY_WITH_US_DANNY ----------------- */

insert_job: COME_AND_PLAY_WITH_US_DANNY   job_type: CMD
command: /bin/bash -ls
machine: capser.com
owner: twins
permission: foo,foo
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "04:00"
description: "Forever, and ever and ever"
std_in_file: "/home/room217"
std_out_file: "${CASPERSYSLOG}/room217.out"
std_err_file: "${CASPERSYSLOG}/room217.err
alarm_if_fail: 1
profile: "/autosys_profile"
timezone: US/Eastern

/* ----------------- COME_AND_PLAY_WITH_US_AGAIN_DANNY ----------------- */

insert_job: COME_AND_PLAY_WITH_US_AGAIN_DANNY   job_type: CMD
command: /bin/bash -ls
machine: capser.com
owner: twins
permission: foo,foo
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "04:00"
description: "Forever, and ever and ever"
std_in_file: "/home/room217"
std_out_file: "${CASPERSYSLOG}/room217.out"
std_err_file: "${CASPERSYSLOG}/room217.err
alarm_if_fail: 1
profile: "/autosys_profile"
timezone: US/Eastern

/* ----------------- NEVER_PLAY_WITH_US_AGAIN_DANNY ----------------- */

insert_job: NEVER_PLAY_WITH_US_AGAIN_DANNY   job_type: CMD
command: /bin/bash -rm *
machine: capser.com
owner: twins
permission: foo,foo
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "04:00"
description: "Forever, and ever and ever"
std_in_file: "/home/room217"
std_out_file: "${CASPERSYSLOG}/room217.out"
std_err_file: "${CASPERSYSLOG}/room217.err
alarm_if_fail: 1
profile: "/autosys_profile"
timezone: US/Eastern

Output

COME_AND_PLAY_WITH_US_DANNY capser.com /bin/bash
COME_AND_PLAY_WITH_US_AGAIN_DANNY capser.com /bin/bash
NEVER_PLAY_WITH_US_AGAIN_DANNY capser.com /bin/bash
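
Since the question asks about storing the results in a container rather than printing them, here is a rough sketch of one way to do that with an array of hashes. The helper name parse_jil and the sample job JOB_A are hypothetical. Unlike the code above, it assumes that an unquoted value runs to the end of its line, so a command such as /bin/bash -ls is kept whole instead of stopping at the first space.

```perl
use strict;
use warnings;

# Hypothetical helper: turn JIL text into a reference to an array of hashes,
# one hash per job definition.
sub parse_jil {
    my ($text) = @_;
    my @jobs;

    for my $block ( grep { /:/ } split /^(?=insert_job)/m, $text ) {
        my %job;
        for my $line ( split /\n/, $block ) {

            # The first line carries two fields: insert_job and job_type.
            if ( $line =~ /^insert_job:\s*(\S+)\s+job_type:\s*(\S+)/ ) {
                @job{qw/insert_job job_type/} = ( $1, $2 );
            }

            # Assumption: every other field is "name: value" up to end of line.
            elsif ( $line =~ /^(\w+):\s*(.*?)\s*$/ ) {
                my ( $key, $value ) = ( $1, $2 );
                $value =~ s/^"(.*)"$/$1/;    # strip one surrounding pair of quotes
                $job{$key} = $value;
            }
        }
        push @jobs, \%job if exists $job{insert_job};
    }
    return \@jobs;
}

my $jobs = parse_jil(<<'JIL');
/* ----------------- JOB_A ----------------- */

insert_job: JOB_A   job_type: CMD
command: /bin/bash -ls
machine: capser.com
start_times: "04:00"
JIL

# The container can now be filtered, serialised, or sent elsewhere.
for my $job ( grep { ( $_->{job_type} || '' ) eq 'CMD' } @$jobs ) {
    print join( "\t", @{$job}{qw/insert_job machine command/} ), "\n";
}
```

Comment lines such as the /* ... */ headers are skipped because they do not start with a field name followed by a colon.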

Here is a ksh solution that converts a JIL file into a delimited file (the script uses semicolons as separators) that you can open in Excel.

#!/usr/bin/ksh

# unix script to flatten autorep -q

resetVar()
{
    AIF=""
    AD=""
    AH=""
    BF=""
    BN=""
    BS=""
    BT=""
    COM=""
    COD=""
    DC=""
    DOW=""
    DES=""
    EC=""
    IJ=""
    JL=""
    JT=""
    MAC=""
    MES=""
    MRA=""
    NR=""
    OWN=""
    PER=""
    PRI=""
    PRO=""
    RC=""
    RW=""
    SM=""
    ST=""
    SEF=""
    SOF=""
    TRT=""
    WF=""
    WFMS=""
    WI=""
    LSD=""
    LST=""
    LED=""
    LET=""
    STA=""
    RUN=""
}


writePartToFile()
{
 echo "$AIF;$AD;$AH;$BF;$BN;$BS;$BT;$COM;$COD;$DC;$DOW;$DES;$EC;$IJ;$JL;$JT;$MAC;$MES;$MRA;$NR;$OWN;$PER;$PRI;$PRO;$RC;$RW;$SM;$ST;$SEF;$SOF;$TRT;$WF;$WFMS;$WI" >> $TO_TPM
 #echo "$AIF;$AD;$AH;$BF;$BN;$BS;$BT;$COM;$COD;$DC;$DOW;$DES;$EC;$IJ;$JL;$JT;$MAC;$MES;$MRA;$NR;$OWN;$PER;$PRI;$PRO;$RC;$RW;$SM;$ST;$SEF;$SOF;$TRT;$WF;$WFMS;$WI" 
 resetVar

}

JOB_NAME="flatten JIL"
part1=""
part2=""


#---------------------------------
if test "$1." = "."
then
   echo "Missing first parameter (jil file to flatten)";
   exit 1;
fi

if test "$2." = "."
then
   echo "Missing second parameter (resulting flat file)";
   exit 1;
fi

TO_FLATTEN=$1
TO_RESULT=$2
CLE_FILE="lesCles"
CLE_TMP="lesClesTmp"
TO_TPM="tempFichier"
TO_STATUS="statusFichier"

# -f keeps rm quiet when a file from a previous run does not exist
rm -f $TO_RESULT
rm -f $CLE_TMP
rm -f $CLE_FILE
rm -f $TO_TPM
rm -f $TO_STATUS

echo 'alarm_if_fail;auto_delete;auto_hold;box_failure;box_name;box_success;box_terminator;command;condition;date_conditions;days_of_week;description;exclude_calendar;insert_job;job_load;job_terminator;machine;max_exit_success;max_run_alarm;n_retrys;owner;permission;priority;profile;run_calendar;run_window;start_mins;start_times;std_err_file;std_out_file;term_run_time;watch_file;watch_file_min_size;watch_interval;last_start_date;last_start_time;last_end_date;last_end_time;status;run' >> $TO_RESULT;
 while read line; do    
    if test "${line#*:}" != "$line"
    then        
      cle="$(echo "$line" | cut -d":" -f 1)"
      #echo "cle = $cle"
      # -f 2- keeps everything after the first colon, so values such as "04:00" stay whole
      part2="$(echo "$line" | cut -d":" -f 2-)"
      #echo "part2 = $part2"        
      val="$(echo "$part2" | cut -d" " -f 2)"
      #echo "val = $val"    
    fi  
    if test "$cle" = "insert_job"
    then
    # we are on the first line of a job definition
        if test "$IJ." = "."
        then
            :   # very first job: nothing collected yet, so nothing to write
        else          
            if test "$BN." = "."
            then             
             echo $IJ >> $CLE_TMP
            else
             echo $BN >> $CLE_TMP
            fi      
            writePartToFile         
        fi
        IJ=$val
        JT="$(echo "$line" | cut -d":" -f 3)"                   
    else    
    # we are not on the first line of a job definition
        val=$part2
        case $cle in
            alarm_if_fail) AIF=$val;;
            auto_delete) AD=$val;;
            auto_hold) AH=$val;;
            box_failure) BF=$val;;
            box_name) BN=$val;;
            box_success) BS=$val;;
            box_terminator) BT=$val;;
            command) COM=$val;;
            condition) COD=$val;;
            date_conditions) DC=$val;;
            days_of_week) DOW=$val;;
            description) DES=$val;;
            exclude_calendar) EC=$val;;
            insert_job) IJ=$val;;
            job_load) JL=$val;;
            job_terminator) JT=$val;;
            machine) MAC=$val;;
            max_exit_success) MES=$val;;
            max_run_alarm) MRA=$val;;
            n_retrys) NR=$val;;
            '#owner') OWN=$val;;
            permission) PER=$val;;
            priority) PRI=$val;;
            profile) PRO=$val;;
            run_calendar) RC=$val;;
            run_window) RW=$val;;
            start_mins) SM=$val;;
            start_times) ST=$val;;
            std_err_file) SEF=$val;;
            std_out_file) SOF=$val;;
            term_run_time) TRT=$val;;
            watch_file) WF=$val;;
            watch_file_min_size) WFMS=$val;;
            watch_interval) WI=$val;; 
        esac        
    fi

done  < $TO_FLATTEN;
# Handle the last occurrence
if test "$BN." = "."
then
    echo $IJ >> $CLE_TMP
else
    echo $BN >> $CLE_TMP
fi      
writePartToFile     

echo "Les cles"
cat $CLE_TMP | sort | uniq > $CLE_FILE
cat $CLE_FILE
rm $CLE_TMP

#------------------------------
 while read line; do        
    autorep -J ${line} -w  >> $TO_STATUS;   
done  < $CLE_FILE;
#----------------------------------------
echo " Resultats"
while read line; do
unJob="$(echo "$line" | cut -d";" -f 14)"
details="$(grep -w  "$unJob" "$TO_STATUS" | head -n 1)" 
# Field positions below assume the usual autorep column layout:
# job name, last start date and time, last end date and time, status, run/ntry
LSD="$(echo "$details" | awk '{print $2}')"
if test "$LSD" = "-----"
then
    LST=""
    LED="$(echo "$details" | awk '{print $3}')"
    if test "$LED" = "-----"
    then
        LET=""
        STA="$(echo "$details" | awk '{print $4}')"
        RUN="$(echo "$details" | awk '{print $5}')"
    else
        LET="$(echo "$details" | awk '{print $4}')"
        STA="$(echo "$details" | awk '{print $5}')"
        RUN="$(echo "$details" | awk '{print $6}')"
    fi
else
    LST="$(echo "$details" | awk '{print $3}')"
    LED="$(echo "$details" | awk '{print $4}')"
    if test "$LED" = "-----"
    then
        LET=""
        STA="$(echo "$details" | awk '{print $5}')"
        RUN="$(echo "$details" | awk '{print $6}')"
    else
        LET="$(echo "$details" | awk '{print $5}')"
        STA="$(echo "$details" | awk '{print $6}')"
        RUN="$(echo "$details" | awk '{print $7}')"
    fi
fi

echo " ligne= ${line};${LSD};${LST};${LED};${LET};${STA};${RUN}"
echo "${line};${LSD};${LST};${LED};${LET};${STA};${RUN}" >> $TO_RESULT
resetVar
done  < $TO_TPM;