Shell script hangs when run through ProcessBuilder

I have a Java program in which I trigger a shell script. Sample Java code:

ProcessBuilder pb = new ProcessBuilder(cmdList);
p = pb.start();
p.waitFor();

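Here cmdList contains all the input parameters required to execute the shell script. The script has a for loop that runs some database scripts on each iteration and writes result information and error logs to a file.

Below is the sample shell script: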

#!/bin/bash

export PATH=/apps/PostgresPlus/as9.6/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

set -eE

#################################################### START

TIME_ELAPSED=""
TIME_ELAPSED_IN_HOURS=""
SCRIPT_START_TIME_FORMATTED=`date '+%F %T'`
SCRIPT_START_TIME_IN_SEC=`date +%s`

PROCESS_LOG_BASE_PATH="/data/logs/purge_log/"
PROCESS_LOG="$PROCESS_LOG_BASE_PATH/purge.log"

trap 'err=$?; logError 2>&1 "Error occurred during purging. Exiting with status $err at line $LINENO: ${BASH_COMMAND}. Please check logs for more info." >>$PROCESS_LOG' ERR
trap 'logError 2>&1 "Error occurred during purging. Exiting shell script execution as an external interrupt was received. Please check logs for more info." >>$PROCESS_LOG; trap ERR' INT

banner()
{
  echo "+--------------------------------------------------------------------------------------------+"
  printf "|`tput bold` [ %-40s `tput sgr0`|\n" "$1 ] `tput setaf 2` $2"
  echo "+--------------------------------------------------------------------------------------------+"
}

logError()
{
  printf "[ProcessId- $$] [`date "+%Y-%m-%d %H:%M:%S"`] `tput setaf 1` `tput bold` [ERROR] `tput setaf 1` %-40s `tput sgr0`\n" "$@"
}

logInfo()
{
  printf "[ProcessId- $$] [`date "+%Y-%m-%d %H:%M:%S"`] `tput setaf 6` `tput bold` [INFO] %-40s `tput sgr0`\n" "$@"
}

logWarn()
{
  printf "[ProcessId- $$] [`date "+%Y-%m-%d %H:%M:%S"`] `tput setaf 3` `tput bold` [WARN] %-40s `tput sgr0`\n" "$@"
}

logHint()
{
  printf "[ProcessId- $$] [`date "+%Y-%m-%d %H:%M:%S"`] `tput setaf 5` `tput sitm` %-40s `tput sgr0`\n" "$@"
}

main() {
banner "$SCRIPT_START_TIME_FORMATTED" "Started processing" | tee -a $PROCESS_LOG
logInfo "Started execution at $SCRIPT_START_TIME_FORMATTED" | tee -a $PROCESS_LOG

set PGPASSWORD=$DB_PASSWORD
export PGPASSWORD=$DB_PASSWORD

# Call DB function for audit and category wise data purging, population of schema names
SCHEMA_NAMES_RESULT=$(psql -h $HOST_NAME -d $DB_NAME -U $DB_USER -p $DB_PORT -At -c "SELECT $COMMON_SCHEMA_NAME.purge_audit_and_populate_schema_names('$COMMON_SCHEMA_NAME', $PURGE_DATA_INTERVAL_IN_DAYS,'$SCHEMA_NAMES',$NUM_TOP_CONTRIBUTING_TENANTS)")

SCHEMA_NAMES_RESULT=$(echo "$SCHEMA_NAMES_RESULT" | sed 's/{//g; s/}//g; s/"//g' )

SCHEMA_NAMES=$(echo $SCHEMA_NAMES_RESULT | rev | cut -d"," -f2-  | rev)

#Convert comma separated string of tenants to array
SCHEMA_NAMES=($(echo "$SCHEMA_NAMES" | tr ',' '\n'))

# loop for multi schema
for element in "${SCHEMA_NAMES[@]}"
do
    logInfo "Effective tenant - $element, Script start time - $SCRIPT_START_TIME_FORMATTED" | tee -a $PROCESS_LOG

    # PGSQL call to DB function to execute purging

    logInfo "Time elapsed since script execution started - $TIME_ELAPSED" | tee -a $PROCESS_LOG
done

#logInfo "Purge completed!" | tee -a $PROCESS_LOG
logInfo "Purge execution completed successfully at `date '+%F %T'`" | tee -a $PROCESS_LOG
exit 0

}

mkdir -p $PROCESS_LOG_BASE_PATH
main "$@"

#################################################### END

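Here are my observations about this program.

  1. When the shell script is run directly from PuTTY, it executes correctly with no errors.
  2. When the shell script is triggered through the Java program above, I observe the following behavior:

    a. It hangs after some number of iterations of the for loop.

    b. When I reduce the number of log entries in the shell script, the number of completed iterations (of the for loop) keeps increasing.

    c. When I remove all the info logs and print only the error logs, it completes successfully.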

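Can someone help identify the reason behind this behavior?

For now, I have put a check on the number of iterations in the for loop, but this problem can occur at any time once I start receiving many error logs.

Regards,

Kushagra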

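You must either consume the process's streams or redirect err and out to a file, so that the native pipe buffers do not fill up. Once a buffer is full, the script blocks on its next write and never finishes, which is why waitFor() hangs. It works better if you create a thread to consume each stream. A hacky single-threaded version looks like this: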

ProcessBuilder pb = new ProcessBuilder(cmdList);
Process p = pb.start();
try (InputStream in = p.getInputStream();
            InputStream err = p.getErrorStream();
            OutputStream closeOnly = p.getOutputStream()) {
    // Poll both streams while the process runs so neither pipe buffer fills up.
    while (p.isAlive()) {
        long skipped = 0L;
        try {
            skipped = in.skip(in.available())
                     + err.skip(err.available());
        } catch (IOException jdk8155808) {
           // skip() can fail on process streams; fall back to reading and discarding.
           byte[] b = new byte[2048];
           int read = in.read(b, 0, Math.min(b.length, in.available()));
           if (read > 0) {
               skipped += read;
           }

           read = err.read(b, 0, Math.min(b.length, err.available()));
           if (read > 0) {
               skipped += read;
           }
        }

        // Nothing was available: wait briefly instead of spinning.
        if(skipped == 0L) {
           p.waitFor(5L, TimeUnit.MILLISECONDS);
        }
    }
} finally {
   p.destroy();
}
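
If the output itself is not needed, a simpler single-threaded variant may be enough. This is only a sketch, not part of the original answer: it merges stderr into stdout with redirectErrorStream(true), so a single blocking read loop drains everything and skip() is never called. The class name DrainMerged, the helper runAndDrain, and the sh command in main are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

class DrainMerged {
    // Hypothetical helper: runs the command, discards all of its output,
    // and returns the exit code.
    static int runAndDrain(List<String> cmdList) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmdList);
        pb.redirectErrorStream(true);   // stderr is merged into stdout, so one reader is enough
        Process p = pb.start();
        p.getOutputStream().close();    // the child sees end-of-file on its stdin
        try (InputStream in = p.getInputStream()) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // Discard the data. read() blocks until output arrives
                // or the child closes its end of the pipe.
            }
        }
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder command; in the question this would be cmdList.
        int exit = runAndDrain(Arrays.asList("sh", "-c", "echo out; echo err 1>&2"));
        System.out.println("exit code: " + exit);
    }
}
```

Because the read loop only ends at end-of-file, the process has closed its output by the time waitFor() is called, so the pipe cannot fill up in between.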

The threaded approach looks like this:

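import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Objects;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public void foo() throws IOException, InterruptedException {
    // Reads a stream until end-of-file and throws the data away.
    class DevNull implements Runnable {
        
        private final InputStream is;
        DevNull(final InputStream is) {
            this.is = Objects.requireNonNull(is);
        }
        
        public void run() {
            byte[] b = new byte[64];
            try {
                while (is.read(b) >= 0);
            } catch(IOException ignore) {
            }
        }
    }

    ExecutorService e = Executors.newCachedThreadPool();
    ProcessBuilder pb = new ProcessBuilder(cmdList);
    Process p = pb.start();
    try (InputStream in = p.getInputStream();
            InputStream err = p.getErrorStream();
            OutputStream closeOnly = p.getOutputStream()) {
        // One consumer thread per stream, so neither pipe can fill up.
        e.execute(new DevNull(in));
        e.execute(new DevNull(err));
        p.waitFor();
    } finally {
        p.destroy();
        e.shutdown();
    }
}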
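
The other option mentioned above, mapping err and out to a file, could be sketched as follows. This is an illustration under assumptions rather than the answerer's code: the class name RedirectToFiles, the helper runWithLogs, and the temporary log files are hypothetical. It uses ProcessBuilder.redirectOutput and redirectError so the script writes straight to the files and no pipe exists between the JVM and the child.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class RedirectToFiles {
    // Hypothetical helper: the child appends its stdout and stderr directly
    // to the given files, so the JVM never has to read them.
    static int runWithLogs(List<String> cmdList, File outLog, File errLog)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmdList);
        pb.redirectOutput(ProcessBuilder.Redirect.appendTo(outLog));
        pb.redirectError(ProcessBuilder.Redirect.appendTo(errLog));
        Process p = pb.start();
        p.getOutputStream().close();    // the child sees end-of-file on its stdin
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder files and command, for illustration only.
        File out = File.createTempFile("purge-out", ".log");
        File err = File.createTempFile("purge-err", ".log");
        int exit = runWithLogs(Arrays.asList("sh", "-c", "echo out; echo err 1>&2"), out, err);
        System.out.println("exit code: " + exit + ", stdout bytes written: " + out.length());
    }
}
```

If the output is not needed at all, ProcessBuilder.Redirect.DISCARD (Java 9 and later) can be passed to the same two methods, and pb.inheritIO() is another option when the output should go to the Java process's own console.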

Thanks, the multi-threaded version worked for me.

The single-threaded option failed at skip().

Thanks again for helping resolve the issue.