logstash 到 webhdfs 未能 APPEND_FILE /user/

Question

我尝试通过 logstash 将 csv 文件 vrom filebeat 摄取到 hdfs 中。 Filebeat 成功地将它传输到 logstash，因为我使用 stdout{codec=>rubydebug} 并且我可以看到它们 parsed.Seems 就像在 webhdfs 模块内部时开始出现的问题。

logstash-sample.conf


input {
  beats {
    port => 5044
  }
}
output {
        stdout{
                codec=>rubydebug
        }
        webhdfs{
                host=>"x.x.x.x"
                port => 50070
                path => "/user/logstash/df=%{+YYY-MM-dd}/logstash-%{+HH}.log"
                user => "root"
        }
}

错误

[WARN ] 2020-06-25 11:43:27.437 [[main]>worker0] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"RecoveryInProgressException","javaClassName":"org.apache.hadoop.hdfs.protocol.RecoveryInProgressException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_412561106_43 on 10.64.2.236 because lease recovery is in progress. Try again later.\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2591)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[WARN ] 2020-06-25 11:43:27.465 [[main]>worker0] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"RecoveryInProgressException","javaClassName":"org.apache.hadoop.hdfs.protocol.RecoveryInProgressException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_-352502454_44 on 10.64.2.236 because another recovery is in progress by DFSClient_NONMAPREDUCE_-601230909_43 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2599)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[WARN ] 2020-06-25 11:43:27.984 [[main]>worker0] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"RecoveryInProgressException","javaClassName":"org.apache.hadoop.hdfs.protocol.RecoveryInProgressException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_557749444_45 on 10.64.2.236 because another recovery is in progress by DFSClient_NONMAPREDUCE_-601230909_43 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2599)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[WARN ] 2020-06-25 11:43:29.010 [[main]>worker0] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"RecoveryInProgressException","javaClassName":"org.apache.hadoop.hdfs.protocol.RecoveryInProgressException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_1871050100_46 on 10.64.2.236 because another recovery is in progress by DFSClient_NONMAPREDUCE_-601230909_43 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2599)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[WARN ] 2020-06-25 11:43:30.547 [[main]>worker0] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[10.64.2.236:50010,DS-99ecf21e-ad4a-41f0-a3ae-7d430e2f5ea0,DISK]], original=[DatanodeInfoWithStorage[10.64.2.236:50010,DS-99ecf21e-ad4a-41f0-a3ae-7d430e2f5ea0,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration."}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[ERROR] 2020-06-25 11:43:32.570 [[main]>worker0] webhdfs - Max write retries reached. Events will be discarded. Exception: {"RemoteException":{"exception":"AlreadyBeingCreatedException","javaClassName":"org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_2073135911_40 on 10.64.2.236 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1387502348_39 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2604)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}
[WARN ] 2020-06-25 11:43:33.528 [Ruby-0-Thread-5: :1] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"AlreadyBeingCreatedException","javaClassName":"org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_1033064335_41 on 10.64.2.236 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1387502348_39 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2604)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[WARN ] 2020-06-25 11:43:33.556 [Ruby-0-Thread-5: :1] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"AlreadyBeingCreatedException","javaClassName":"org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_1192450546_42 on 10.64.2.236 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1387502348_39 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2604)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[WARN ] 2020-06-25 11:43:34.077 [Ruby-0-Thread-5: :1] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"AlreadyBeingCreatedException","javaClassName":"org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_-405184820_43 on 10.64.2.236 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1387502348_39 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2604)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[WARN ] 2020-06-25 11:43:35.099 [Ruby-0-Thread-5: :1] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"AlreadyBeingCreatedException","javaClassName":"org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_2036955244_44 on 10.64.2.236 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1387502348_39 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2604)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[WARN ] 2020-06-25 11:43:36.621 [Ruby-0-Thread-5: :1] webhdfs - webhdfs write caused an exception: {"RemoteException":{"exception":"AlreadyBeingCreatedException","javaClassName":"org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_245206024_45 on 10.64.2.236 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1387502348_39 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2604)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}. Maybe you should increase retry_interval or reduce number of workers. Retrying...
[ERROR] 2020-06-25 11:43:38.653 [Ruby-0-Thread-5: :1] webhdfs - Max write retries reached. Events will be discarded. Exception: {"RemoteException":{"exception":"AlreadyBeingCreatedException","javaClassName":"org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException","message":"Failed to APPEND_FILE /user/logstash/df=2020-06-25/logstash-11.log for DFSClient_NONMAPREDUCE_331276113_46 on 10.64.2.236 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1387502348_39 on 10.64.2.236\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2604)\n\tat org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2639)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:805)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:487)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)\n\tat org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)\n"}}

编辑

我已将 logstash 文件夹更改为用户可以访问 hadoop fs -chmod -R 777 /user/logstash 这只会更改现有文件夹，但每次我运行 logstash 时，它都会生成无法访问的新文件

root@ambari:/etc/logstash/conf.d# hdfs dfs -ls /user/logstash
Found 1 items
drwxrwxrwx   - root hdfs          0 2020-06-25 13:23 /user/logstash/df=2020-06-25
root@ambari:/etc/logstash/conf.d# hdfs dfs -ls /user/logstash/df=2020-06-25
Found 2 items
-rwxrwxrwx   3 root hdfs         63 2020-06-25 12:43 /user/logstash/df=2020-06-25/logstash-11.log
-rw-r--r--   3 root hdfs         63 2020-06-25 13:23 /user/logstash/df=2020-06-25/logstash-13.log

编辑2

我已经将所有者更改为 logstash

root@ambari:/etc/logstash# sudo -u hdfs hdfs dfs -chown -R logstash /user/logstash
root@ambari:/etc/logstash# hdfs dfs -ls /user
Found 6 items
drwxrwx---   - ambari-qa hdfs          0 2020-05-28 03:08 /user/ambari-qa
drwxr-xr-x   - hadoop    hdfs          0 2020-06-25 11:21 /user/hadoop
drwxr-xr-x   - hdfs      hdfs          0 2020-06-02 05:08 /user/hdfs
drwxrwxrwx   - logstash  hdfs          0 2020-06-26 07:23 /user/logstash
drwxrwxrwx   - root      hdfs          0 2020-06-26 09:53 /user/root
drwxr-xr-x   - hdfs      hdfs          0 2020-06-02 04:48 /user/sample

但是当我运行它时，它仍然返回相同的错误日志

问题

即使我 chown /user/logstash 他们会继续生成当前用户无法访问的文件，我应该怎么做才能让用户 append/access 该文件？非常感谢

Answer 1

@尤连森。您的进程正在跨 hdfs 数据节点追加文件。我能够在互联网上找到一些相关信息，搜索“达到 hdfs 最大写入重试次数。事件将被丢弃。”

Locked file leases are a known issue when appending with multiple threads to the same file via webhdfs.

With my research and tryouts, it turned out that if the replication factor for HDFS is greater than the number of datanodes available, append would fail to write as it needs to take care of multiple copies. Try setting a low replica factor for the log file.

https://github.com/logstash-plugins/logstash-output-webhdfs/issues/15

https://github.com/logstash-plugins/logstash-output-webhdfs/issues/35

上面的 link 中有一个补丁，以及一些关于调整 HDFS worker、副本和配置的额外建议。在 hadoop 中，研究错误语句中的整个输出很重要。它们很长，而且离找到消息的右边很远。

logstash 到 webhdfs 未能 APPEND_FILE /user/

logstash to webhdfs Failed to APPEND_FILE /user/

hadoop

hdfs

logstash

webhdfs

logstash-sample.conf

错误

编辑

编辑2

问题