Impala - file not found error
I am using Impala with Flume as a file stream.
The problem is that Flume writes temporary files with a .tmp extension, and when they are deleted, Impala queries fail with the following message:
Backend 0:Failed to open HDFS file
hdfs://localhost:8020/user/hive/../FlumeData.1420040201733.tmp
Error(2): No such file or directory
How can I make Impala ignore these .tmp files, or make Flume not write them, or have Flume write them to another directory?
Flume configuration:
### Agent2 - Avro Source and File Channel, hdfs Sink ###
# Name the components on this agent
Agent2.sources = avro-source
Agent2.channels = file-channel
Agent2.sinks = hdfs-sink
# Describe/configure Source
Agent2.sources.avro-source.type = avro
Agent2.sources.avro-source.hostname = 0.0.0.0
Agent2.sources.avro-source.port = 11111
Agent2.sources.avro-source.bind = 0.0.0.0
# Describe the sink
Agent2.sinks.hdfs-sink.type = hdfs
Agent2.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/hive/table/
Agent2.sinks.hdfs-sink.hdfs.rollInterval = 0
Agent2.sinks.hdfs-sink.hdfs.rollCount = 10000
Agent2.sinks.hdfs-sink.hdfs.fileType = DataStream
#Use a channel which buffers events in file
Agent2.channels.file-channel.type = file
Agent2.channels.file-channel.checkpointDir = /home/ubuntu/flume/checkpoint/
Agent2.channels.file-channel.dataDirs = /home/ubuntu/flume/data/
# Bind the source and sink to the channel
Agent2.sources.avro-source.channels = file-channel
Agent2.sinks.hdfs-sink.channel = file-channel
I ran into this problem once.
Upgrading Hadoop and Flume fixed it for me (from Cloudera CDH 5.2 to CDH 5.3).
Try upgrading Hadoop, Flume, or Impala.
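As an alternative to upgrading, the Flume HDFS sink can hide in-progress files from Impala entirely. A minimal sketch against the Agent2 configuration above: hdfs.inUsePrefix and hdfs.inUseSuffix are documented HDFS sink properties, and prefixing the temporary file with a dot makes it a hidden file that Impala and Hive skip when scanning the table directory.
# Mark files as hidden while Flume is still writing them:
# a leading "." makes the in-progress file a hidden dot-file,
# so Impala/Hive queries ignore it until it is rolled and renamed.
Agent2.sinks.hdfs-sink.hdfs.inUsePrefix = .
# The default in-use suffix is already .tmp; shown here for completeness.
Agent2.sinks.hdfs-sink.hdfs.inUseSuffix = .tmp
When Flume rolls a file, it renames it to drop the in-use prefix/suffix, so completed files stay visible; running REFRESH on the table in Impala then picks up the newly finished files.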