STORM: storm-hdfs HDFS bolt failing after 24 hours

My Storm topology, which reads from Kafka and writes to Hadoop HDFS, fails after running for 24 hours.

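I suspect the topology is unable to renew its tokens, or cannot find the keytab it needs to renew them. Please share your thoughts and help me resolve the issue.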
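One quick way to test the keytab half of that suspicion is to check, on the worker host, that the keytab path passed to the topology actually exists and is readable by the user running the worker. The following is only a minimal sketch using the plain JDK; the class name `KeytabCheck` and method `describeProblem` are hypothetical helpers, not part of storm-hdfs, and the check says nothing about whether the keytab contents or principal are valid.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of storm-hdfs): sanity-checks a keytab path.
final class KeytabCheck {

    private KeytabCheck() {}

    /**
     * Returns null if the file at keytabPath exists, is a regular file and is
     * readable by the current user; otherwise returns a short description of
     * the problem. It does not validate the keytab contents.
     */
    static String describeProblem(String keytabPath) {
        if (keytabPath == null || keytabPath.isEmpty()) {
            return "keytab path is not set";
        }
        Path p = Paths.get(keytabPath);
        if (!Files.exists(p)) {
            return "keytab not found: " + p;
        }
        if (!Files.isRegularFile(p)) {
            return "keytab is not a regular file: " + p;
        }
        if (!Files.isReadable(p)) {
            return "keytab is not readable by this user: " + p;
        }
        return null;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: KeytabCheck <keytabPath>");
            return;
        }
        String problem = describeProblem(args[0]);
        System.out.println(problem == null ? "keytab looks readable" : problem);
    }
}
```

Running this with the same value as `ktPath` on each worker node would rule out a missing or unreadable file before looking at anything else.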

Below is the code used to configure the HDFS bolt.

Config object:

//building a 'map' with hdfs related configuration for key tab
Map<String, Object> hdfsSecConfigMap = new HashMap<String, Object>();
hdfsSecConfigMap.put("hdfs.keytab.file", ktPath);
hdfsSecConfigMap.put("hdfs.kerberos.principal", ktPrincipal);

//building a 'map' with hbase related configuration
Map<String, Object> hbaseConfigMap = new HashMap<String, Object>();
hbaseConfigMap.put("hbase.rootdir", hbaseRootDir);
hbaseConfigMap.put("storm.keytab.file", ktPath);
hbaseConfigMap.put("storm.kerberos.principal", ktPrincipal);

Config configured = new Config();
configured.setDebug(true);
configured.put(hdfsConfKey, hdfsSecConfigMap);
configured.put(hbaseConfKey, hbaseConfigMap);
configured.setNumWorkers(2);
configured.setMaxSpoutPending(300);
configured.setNumAckers(30);
configured.setMessageTimeoutSecs(1200);

configured.put(HdfsSecurityUtil.STORM_KEYTAB_FILE_KEY, ktPath);
configured.put(HdfsSecurityUtil.STORM_USER_NAME_KEY, ktPrincipal);

configured.put(HBaseSecurityUtil.STORM_KEYTAB_FILE_KEY, ktPath);
configured.put(HBaseSecurityUtil.STORM_USER_NAME_KEY, ktPrincipal);

Creating the HDFS bolt:

HdfsBolt hdfsbolt = new HdfsBolt()
        .withFsUrl(hdfsuri)
        .withRecordFormat(recFormat)
        .withFileNameFormat(fileNameWithPath)
        .withRotationPolicy(fileRotationSize)
        .withSyncPolicy(syncPolicy)
        .withConfigKey(secBypassConfigKey);

TopologyBuilder setup:

builder.setBolt("hdfsBolt", avroHDFSBolt, 1)
        .setNumTasks(1)
        .shuffleGrouping("kafka-spout");

The exception I am getting is as follows:

java.io.IOException: IOException flush:java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "**********"; destination host is: "***************":8020;
        at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2082) ~[stormjar.jar:?]
        at org.apache.hadoop.hdfs.DFSOutputStream.hsync(DFSOutputStream.java:1969) ~[stormjar.jar:?]
        at org.apache.hadoop.hdfs.client.HdfsDataOutputStream.hsync(HdfsDataOutputStream.java:95) ~[stormjar.jar:?]
        at org.apache.storm.hdfs.bolt.HdfsBolt.execute(HdfsBolt.java:100) [stormjar.jar:?]
        at backtype.storm.daemon.executor$fn__3697$tuple_action_fn__3699.invoke(executor.clj:670) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.daemon.executor$mk_task_receiver$fn__3620.invoke(executor.clj:426) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.disruptor$clojure_handler$reify__3196.onEvent(disruptor.clj:58) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.daemon.executor$fn__3697$fn__3710$fn__3761.invoke(executor.clj:808) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at backtype.storm.util$async_loop$fn__544.invoke(util.clj:475) [storm-core-0.10.0.2.3.4.0-3485.jar:0.10.0.2.3.4.0-3485]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_73]

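I was able to resolve this by rebuilding my code/application against the correct Hadoop version, the one used in our Hadoop cluster.

The issue was caused by a version mismatch and was fixed after rebuilding with the correct version.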
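For anyone wanting to catch this kind of mismatch earlier, here is a minimal sketch of a version comparison. It is only an illustration: the class `HadoopVersionCheck` and its methods are hypothetical, and treating a difference in the leading major.minor part as a mismatch is my assumption, since I did not record exactly which versions were involved. In a real topology the client-side string could come from `org.apache.hadoop.util.VersionInfo.getVersion()` (the Hadoop library bundled in the topology jar); here both strings are simply passed in so the sketch runs with the plain JDK.

```java
// Hypothetical helper: compares the Hadoop version a topology jar was built
// against with the version running on the cluster.
final class HadoopVersionCheck {

    private HadoopVersionCheck() {}

    /** Returns the leading "major.minor" part of a dotted version string. */
    static String majorMinor(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version string: " + version);
        }
        return parts[0] + "." + parts[1];
    }

    /** Assumption: versions are considered compatible when major.minor match. */
    static boolean sameMajorMinor(String clientVersion, String clusterVersion) {
        return majorMinor(clientVersion).equals(majorMinor(clusterVersion));
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: HadoopVersionCheck <clientVersion> <clusterVersion>");
            return;
        }
        // Both values are supplied by the caller; nothing is read from Hadoop here.
        System.out.println(sameMajorMinor(args[0], args[1])
                ? "versions match"
                : "VERSION MISMATCH");
    }
}
```

The cluster-side value can be read by running `hadoop version` on a cluster node; if the two differ, rebuild the topology jar against the cluster's Hadoop version, as described above.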