Commons Net FTPClient hangs indefinitely with Mule

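I have run into a problem with the Mule ESB FTP transport: while polling, the thread running the client hangs indefinitely without throwing an error. This stops FTP polling completely. Mule uses the Apache Commons Net FTPClient.

Looking further into the code, I think this is caused by the FTPClient's SocketTimeout not being set, which sometimes leads to an infinite hang when a line is read from the FTPClient's socket.

The problem is clearly visible in the following stacks, retrieved with jstack while the issue was occurring. The __getReply() function seems to be the one most directly linked to the problem.

This one hangs on the connect() call when creating a new FTPClient: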

receiver.172 prio=10 tid=0x00007f23e43c8800 nid=0x2d5 runnable [0x00007f24c32f1000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
    - locked <0x00000007817a9578> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:154)
    at java.io.BufferedReader.readLine(BufferedReader.java:317)
    - locked <0x00000007817a9578> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:382)
    at org.apache.commons.net.ftp.FTP.__getReply(FTP.java:294)
    at org.apache.commons.net.ftp.FTP._connectAction_(FTP.java:364)
    at org.apache.commons.net.ftp.FTPClient._connectAction_(FTPClient.java:540)
    at org.apache.commons.net.SocketClient.connect(SocketClient.java:178)
    at org.mule.transport.ftp.FtpConnectionFactory.makeObject(FtpConnectionFactory.java:33)
    at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1188)
    at org.mule.transport.ftp.FtpConnector.getFtp(FtpConnector.java:172)
    at org.mule.transport.ftp.FtpConnector.createFtpClient(FtpConnector.java:637)
    at org.mule.transport.ftp.FtpMessageReceiver.listFiles(FtpMessageReceiver.java:134)
    at org.mule.transport.ftp.FtpMessageReceiver.poll(FtpMessageReceiver.java:94)
    at org.mule.transport.AbstractPollingMessageReceiver.performPoll(AbstractPollingMessageReceiver.java:216)
    at org.mule.transport.PollingReceiverWorker.poll(PollingReceiverWorker.java:80)
    at org.mule.transport.PollingReceiverWorker.run(PollingReceiverWorker.java:49)
    at org.mule.transport.TrackingWorkManager$TrackeableWork.run(TrackingWorkManager.java:267)
    at org.mule.work.WorkerContext.run(WorkerContext.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - <0x00000007817a3540> (a java.util.concurrent.ThreadPoolExecutor$Worker)

Another one hangs on the pasv() call when using listFiles():

receiver.137" prio=10 tid=0x00007f23e433b000 nid=0x7c06 runnable [0x00007f24c2fee000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
    - locked <0x0000000788847ed0> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:154)
    at java.io.BufferedReader.readLine(BufferedReader.java:317)
    - locked <0x0000000788847ed0> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:382)
    at org.apache.commons.net.ftp.FTP.__getReply(FTP.java:294)
    at org.apache.commons.net.ftp.FTP.sendCommand(FTP.java:490)
    at org.apache.commons.net.ftp.FTP.sendCommand(FTP.java:534)
    at org.apache.commons.net.ftp.FTP.sendCommand(FTP.java:583)
    at org.apache.commons.net.ftp.FTP.pasv(FTP.java:882)
    at org.apache.commons.net.ftp.FTPClient._openDataConnection_(FTPClient.java:497)
    at org.apache.commons.net.ftp.FTPClient.initiateListParsing(FTPClient.java:2296)
    at org.apache.commons.net.ftp.FTPClient.initiateListParsing(FTPClient.java:2269)
    at org.apache.commons.net.ftp.FTPClient.initiateListParsing(FTPClient.java:2189)
    at org.apache.commons.net.ftp.FTPClient.initiateListParsing(FTPClient.java:2132)
    at org.mule.transport.ftp.FtpMessageReceiver.listFiles(FtpMessageReceiver.java:135)
    at org.mule.transport.ftp.FtpMessageReceiver.poll(FtpMessageReceiver.java:94)
    at org.mule.transport.AbstractPollingMessageReceiver.performPoll(AbstractPollingMessageReceiver.java:216)
    at org.mule.transport.PollingReceiverWorker.poll(PollingReceiverWorker.java:80)
    at org.mule.transport.PollingReceiverWorker.run(PollingReceiverWorker.java:49)
    at org.mule.transport.TrackingWorkManager$TrackeableWork.run(TrackingWorkManager.java:267)
    at org.mule.work.WorkerContext.run(WorkerContext.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - <0x0000000788832180> (a java.util.concurrent.ThreadPoolExecutor$Worker)

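I think the problem comes from the use of the default FTPClient constructor (FTPClient extends SocketClient) in Mule's default FtpConnectionFactory.

Note that the setConnectTimeout() value appears to be used only when socket.connect() is called, and is ignored by every other operation on the same socket: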

protected FTPClient createFtpClient()
    {
        FTPClient ftpClient = new FTPClient();
        ftpClient.setConnectTimeout(connectionTimeout);

        return ftpClient;
    }

It uses the FTPClient() constructor, which in turn uses the SocketClient constructor, where the timeout that is later applied to the socket is set to 0.

public  SocketClient()
    {
        ...
        _timeout_ = 0;
        ...
    }

Then we call connect(), which calls _connectAction_().

In SocketClient:

protected void  _connectAction_() throws IOException
    {
        ...
        _socket_.setSoTimeout(_timeout_);
        ...
    }

In FTP, a new Reader is instantiated on top of our everlasting socket:

protected void _connectAction_() throws IOException
    {
        ...
        _controlInput_ =
            new BufferedReader(new InputStreamReader(_socket_.getInputStream(),
                                                     getControlEncoding()));
        ...
    }

Then, when the __getReply() function is called, we use this Reader-with-everlasting-socket:

private void  __getReply() throws IOException
     {
        ...
         String line = _controlInput_.readLine();
        ...
    }

Sorry for the long post, but I think this needed a proper explanation. One solution might be to call setSoTimeout() after connect(), in order to define a socket timeout.
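
To make the difference between the connect timeout and the read timeout concrete, here is a minimal sketch that uses plain java.net sockets rather than Mule or Commons Net. It assumes a local listening socket that completes the TCP handshake but never writes a reply; the names ReadTimeoutDemo and readTimesOut are made up for this example and do not come from either library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

class ReadTimeoutDemo {

    /**
     * Connects to a local server that never replies and tries to read one line.
     * Returns true if the read was cut short by a SocketTimeoutException.
     * Do not pass soTimeoutMs = 0: that means "wait forever" and reproduces the hang.
     */
    static boolean readTimesOut(int connectTimeoutMs, int soTimeoutMs) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // The OS completes the TCP handshake from the listen backlog; nothing is ever written.
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket socket = new Socket()) {
            // Connect timeout: bounds the handshake only.
            socket.connect(new InetSocketAddress(loopback, silentServer.getLocalPort()),
                           connectTimeoutMs);
            // Read timeout (SO_TIMEOUT): bounds each blocking read on this socket.
            socket.setSoTimeout(soTimeoutMs);
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            try {
                reader.readLine(); // same kind of blocking call as in __getReply()
                return false;      // data or EOF arrived, so no timeout occurred
            } catch (SocketTimeoutException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(1000, 200));
    }
}
```

In this sketch the connect timeout has no influence on readLine(); only the value passed to setSoTimeout() decides how long the read may block.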

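A default timeout does not seem to be an acceptable solution, since every user may have different needs and no single default would be appropriate in all cases (see https://issues.apache.org/jira/browse/NET-35).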

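In the end, this raises 2 questions:

  1. This looks like a bug to me, since it stops FTP polling completely without any error. What do you think?
  2. Is there a simple way to avoid this? Using a custom FtpConnectionFactory that calls setSoTimeout()? Am I missing a configuration option or parameter somewhere?

Thanks in advance.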

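EDIT: I am using Mule CE Standalone 3.5.0, which seems to use Apache Commons Net 2.0. Looking at the code, however, Mule CE Standalone 3.7 with Commons Net 2.2 does not seem to behave any differently. Here are the source files involved: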

https://github.com/mulesoft/mule/blob/mule-3.5.x/transports/ftp/src/main/java/org/mule/transport/ftp/FtpConnectionFactory.java

http://grepcode.com/file/repo1.maven.org/maven2/commons-net/commons-net/2.0/org/apache/commons/net/SocketClient.java

http://grepcode.com/file/repo1.maven.org/maven2/commons-net/commons-net/2.0/org/apache/commons/net/ftp/FTP.java

http://grepcode.com/file/repo1.maven.org/maven2/commons-net/commons-net/2.0/org/apache/commons/net/ftp/FTPClient.java

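In an ideal world the timeout would not be necessary, but in your case it seems to be.

Your description is very thorough; have you considered raising a bug?

To work around this, I would first try the "Response Timeout" setting in the Advanced tab. If that does not work, I would use a service override, from which you should be able to override the receiver.

I reproduced the error for the two previous cases using MockFtpServer, and I was able to use a custom FtpConnectionFactory that seems to solve the problem.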

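import org.apache.commons.net.ftp.FTPClient;
import org.mule.api.endpoint.EndpointURI;
import org.mule.transport.ftp.FtpConnectionFactory;

public class SafeFtpConnectionFactory extends FtpConnectionFactory {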

    //define a default timeout
    public static int defaultTimeout = 60000;
    public static synchronized int getDefaultTimeout() {
        return defaultTimeout;
    }
    public static synchronized void setDefaultTimeout(int defaultTimeout) {
        SafeFtpConnectionFactory.defaultTimeout = defaultTimeout;
    }

    public SafeFtpConnectionFactory(EndpointURI uri) {
        super(uri);
    }

    @Override
    protected FTPClient createFtpClient() {
        FTPClient client = super.createFtpClient();

        //Define the default timeout here, which will be used by the socket by default,
        //instead of the 0 timeout hanging indefinitely
        client.setDefaultTimeout(getDefaultTimeout());

        return client;
    }
}

Then I wired it into my connector:

<ftp:connector name="archivingFtpConnector" doc:name="FTP"
        pollingFrequency="${frequency}"
        validateConnections="true"
        connectionFactoryClass="my.comp.SafeFtpConnectionFactory">
    <reconnect frequency="${reconnection.frequency}" count="${reconnection.attempt}"/>
</ftp:connector>

With this configuration, a java.net.SocketTimeoutException is thrown once the specified timeout has elapsed, for example:

java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:154)
    at java.io.BufferedReader.readLine(BufferedReader.java:317)
    at java.io.BufferedReader.readLine(BufferedReader.java:382)
    at org.apache.commons.net.ftp.FTP.__getReply(FTP.java:294)
    at org.apache.commons.net.ftp.FTP._connectAction_(FTP.java:364)
    at org.apache.commons.net.ftp.FTPClient._connectAction_(FTPClient.java:540)
    at org.apache.commons.net.SocketClient.connect(SocketClient.java:178)
    at org.mule.transport.ftp.FtpConnectionFactory.makeObject(FtpConnectionFactory.java:33)
    at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1188)
    at org.mule.transport.ftp.FtpConnector.getFtp(FtpConnector.java:172)
    at org.mule.transport.ftp.FtpConnector.createFtpClient(FtpConnector.java:637)
    ...

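Without it, an attempt to connect() or pasv() hangs indefinitely when the server does not respond. I reproduced this exact behavior using the mock FTP server.

Note: I used setDefaultTimeout() because it appears to set the variable used by connect() and _connectAction_() (from the SocketClient source):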

public abstract class SocketClient
{
    ...
    protected void _connectAction_() throws IOException
    {
        ...
        _socket_.setSoTimeout(_timeout_);
        ...
    }
    ...
    public void  setDefaultTimeout(int timeout)
    {
        _timeout_ = timeout;
    }
    ...
}
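
The ordering matters here: the default timeout has to be set before connect(), because it is copied onto the socket during the connect step. The sketch below is a heavily simplified, hypothetical stand-in for SocketClient (MiniSocketClient and soTimeoutAfterConnect are invented names, not Commons Net classes or methods) that only mimics the two lines quoted above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical, heavily simplified stand-in for SocketClient; not the Commons Net class. */
class MiniSocketClient implements AutoCloseable {
    private int defaultTimeoutMs = 0; // like _timeout_ = 0 in the SocketClient constructor
    private Socket socket;

    void setDefaultTimeout(int timeoutMs) {
        defaultTimeoutMs = timeoutMs;
    }

    void connect(InetAddress host, int port, int connectTimeoutMs) throws IOException {
        socket = new Socket();
        socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
        // Counterpart of _connectAction_(): the default timeout becomes the socket's SO_TIMEOUT.
        socket.setSoTimeout(defaultTimeoutMs);
    }

    int getSoTimeout() throws IOException {
        return socket.getSoTimeout();
    }

    @Override
    public void close() throws IOException {
        if (socket != null) {
            socket.close();
        }
    }

    /** Connects to a throwaway local server and returns the SO_TIMEOUT the socket ended up with. */
    static int soTimeoutAfterConnect(int defaultTimeoutMs) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             MiniSocketClient client = new MiniSocketClient()) {
            client.setDefaultTimeout(defaultTimeoutMs);
            client.connect(loopback, server.getLocalPort(), 1000);
            return client.getSoTimeout();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("untouched default: " + soTimeoutAfterConnect(0));
        System.out.println("default set first: " + soTimeoutAfterConnect(60000));
    }
}
```

With the default left at 0 the socket keeps an infinite read timeout, which is the situation in the stack traces from the question.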

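EDIT: For those interested, here is the test code for the mock FTP server, used to reproduce a server that never answers. The infinite loop is far from good practice. It should be replaced by something like a sleep, with an enclosing test class that expects a SocketTimeoutException and makes sure the test fails after a given timeout.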

    private static final int CONTROL_PORT = 2121;

    public void startStubFtpServer(){
        FakeFtpServer fakeFtpServer = new FakeFtpServer();

        //define the command which should never be answered
        fakeFtpServer.setCommandHandler(CommandNames.PASV, new EverlastingCommandHandler());
        //fakeFtpServer.setCommandHandler(CommandNames.CONNECT, new EverlastingConnectCommandHandler());
        //or any other command...

        //server config
        ...

        //start server
        fakeFtpServer.setServerControlPort(CONTROL_PORT);
        fakeFtpServer.start();

        ...

    }

    //will cause any command received to never have an answer
    public class EverlastingConnectCommandHandler extends org.mockftpserver.core.command.AbstractStaticReplyCommandHandler{
        @Override
        protected void handleCommand(Command cmd, Session session, InvocationRecord rec) throws Exception {
            while(true){
                try {
                    Thread.sleep(60000);
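                } catch (InterruptedException e) {
                    //restore the interrupt flag and stop waiting so the thread can be shut down
                    Thread.currentThread().interrupt();
                    return;
                }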
            }
        }

    }
    public class EverlastingCommandHandler extends AbstractFakeCommandHandler {
        @Override
        protected void handle(Command cmd, Session session) {
            while(true){
                try {
                    Thread.sleep(60000);
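                } catch (InterruptedException e) {
                    //restore the interrupt flag and stop waiting so the thread can be shut down
                    Thread.currentThread().interrupt();
                    return;
                }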
            }
        }
    };
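
As a rough idea of what the bounded version mentioned above could look like, here is a sketch that uses only java.net and java.util.concurrent instead of MockFtpServer. It assumes a server that accepts one connection, stays silent for a limited time and then closes; BoundedSilenceCheck and millisUntilReadTimeout are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class BoundedSilenceCheck {

    /**
     * Returns the elapsed milliseconds before the client gave up reading,
     * or -1 if the read ended for another reason (data or EOF from the server).
     */
    static long millisUntilReadTimeout(int soTimeoutMs, long serverHoldMs) {
        ExecutorService serverThread = Executors.newSingleThreadExecutor();
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Server side: accept, stay silent for a bounded time, then close the connection.
            serverThread.submit(() -> {
                try (Socket accepted = server.accept()) {
                    Thread.sleep(serverHoldMs); // bounded silence instead of while(true)
                }
                return null;
            });

            long start = System.nanoTime();
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort())) {
                client.setSoTimeout(soTimeoutMs);
                client.getInputStream().read(); // blocks because the server never writes
                return -1;
            } catch (SocketTimeoutException expected) {
                return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            serverThread.shutdownNow(); // interrupts the sleeping server task
        }
    }

    public static void main(String[] args) {
        long elapsed = millisUntilReadTimeout(200, 5000);
        System.out.println("client gave up after about " + elapsed + " ms");
    }
}
```

A test built on this would fail if the method returns -1 or if the elapsed time reaches the server's hold time, since either would mean the read timeout did not take effect.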