Async worker died! ... clojure.lang.PersistentVector cannot be cast to class java.lang.String

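I'm new to the world of crawlers and to Java, so please be blunt.

I have successfully injected my target URLs, but when I run the crawler locally, my async worker dies. When I run the crawler remotely it does not die, but it does not crawl the URLs in my seeds.txt either.

Googling has not helped me in this case.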

Even with a --remote run, nothing gets crawled (see image):

[image: crawler stats]

Below is what gets printed.

Any idea what is causing this?

...
[snipped]
...

7841 [ProcessThread(sid:0 cport:2000):] INFO  o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x100040091f2000f type:create cxid:0xdc zxid:0x3c txntype:-1 reqpath:n/a Error Path:/storm/errors/crawler-1-1574269039/spout Error:KeeperErrorCode = NodeExists for /storm/errors/crawler-1-1574269039/spout
7842 [ProcessThread(sid:0 cport:2000):] INFO  o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x100040091f2000f type:create cxid:0xdf zxid:0x3d txntype:-1 reqpath:n/a Error Path:/storm/errors/crawler-1-1574269039/spout Error:KeeperErrorCode = NodeExists for /storm/errors/crawler-1-1574269039/spout
7842 [Thread-49-spout-executor[14 14]] ERROR o.a.s.d.executor - 
java.lang.ClassCastException: class clojure.lang.PersistentVector cannot be cast to class java.lang.String (clojure.lang.PersistentVector is in unnamed module of loader 'app'; java.lang.String is in module java.base of loader 'bootstrap')
        at com.digitalpebble.stormcrawler.util.ConfUtils.getString(ConfUtils.java:74) ~[dev1-0.1.jar:?]
        at com.digitalpebble.stormcrawler.elasticsearch.persistence.AbstractSpout.open(AbstractSpout.java:188) ~[dev1-0.1.jar:?]
        at com.digitalpebble.stormcrawler.elasticsearch.persistence.AggregationSpout.open(AggregationSpout.java:98) ~[dev1-0.1.jar:?]
        at org.apache.storm.daemon.executor$fn__10112$fn__10127.invoke(executor.clj:609) ~[storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:482) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7843 [Thread-55-spout-executor[17 17]] ERROR o.a.s.d.executor - 
java.lang.ClassCastException: class clojure.lang.PersistentVector cannot be cast to class java.lang.String (clojure.lang.PersistentVector is in unnamed module of loader 'app'; java.lang.String is in module java.base of loader 'bootstrap')
        at com.digitalpebble.stormcrawler.util.ConfUtils.getString(ConfUtils.java:74) ~[dev1-0.1.jar:?]
        at com.digitalpebble.stormcrawler.elasticsearch.persistence.AbstractSpout.open(AbstractSpout.java:188) ~[dev1-0.1.jar:?]
        at com.digitalpebble.stormcrawler.elasticsearch.persistence.AggregationSpout.open(AggregationSpout.java:98) ~[dev1-0.1.jar:?]
        at org.apache.storm.daemon.executor$fn__10112$fn__10127.invoke(executor.clj:609) ~[storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:482) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7845 [ProcessThread(sid:0 cport:2000):] INFO  o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x100040091f2000f type:create cxid:0xe6 zxid:0x3e txntype:-1 reqpath:n/a Error Path:/storm/errors/crawler-1-1574269039/spout Error:KeeperErrorCode = NodeExists for /storm/errors/crawler-1-1574269039/spout
7839 [Thread-33-spout-executor[9 9]] INFO  c.d.s.e.p.AbstractSpout - [spout #1]  assigned shard ID 1
7846 [Thread-33-spout-executor[9 9]] ERROR o.a.s.util - Async loop died!
java.lang.ClassCastException: class clojure.lang.PersistentVector cannot be cast to class java.lang.String (clojure.lang.PersistentVector is in unnamed module of loader 'app'; java.lang.String is in module java.base of loader 'bootstrap')
        at com.digitalpebble.stormcrawler.util.ConfUtils.getString(ConfUtils.java:74) ~[dev1-0.1.jar:?]
        at com.digitalpebble.stormcrawler.elasticsearch.persistence.AbstractSpout.open(AbstractSpout.java:188) ~[dev1-0.1.jar:?]
        at com.digitalpebble.stormcrawler.elasticsearch.persistence.AggregationSpout.open(AggregationSpout.java:98) ~[dev1-0.1.jar:?]
        at org.apache.storm.daemon.executor$fn__10112$fn__10127.invoke(executor.clj:609) ~[storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:482) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7846 [Thread-33-spout-executor[9 9]] ERROR o.a.s.d.executor - 
java.lang.ClassCastException: class clojure.lang.PersistentVector cannot be cast to class java.lang.String (clojure.lang.PersistentVector is in unnamed module of loader 'app'; java.lang.String is in module java.base of loader 'bootstrap')
        at com.digitalpebble.stormcrawler.util.ConfUtils.getString(ConfUtils.java:74) ~[dev1-0.1.jar:?]
        at com.digitalpebble.stormcrawler.elasticsearch.persistence.AbstractSpout.open(AbstractSpout.java:188) ~[dev1-0.1.jar:?]
        at com.digitalpebble.stormcrawler.elasticsearch.persistence.AggregationSpout.open(AggregationSpout.java:98) ~[dev1-0.1.jar:?]
        at org.apache.storm.daemon.executor$fn__10112$fn__10127.invoke(executor.clj:609) ~[storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:482) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7847 [ProcessThread(sid:0 cport:2000):] INFO  o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x100040091f2000f type:create cxid:0xf0 zxid:0x42 txntype:-1 reqpath:n/a Error Path:/storm/errors/crawler-1-1574269039/spout Error:KeeperErrorCode = NodeExists for /storm/errors/crawler-1-1574269039/spout
7863 [ProcessThread(sid:0 cport:2000):] INFO  o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x100040091f2000f type:create cxid:0x105 zxid:0x4b txntype:-1 reqpath:n/a Error Path:/storm/errors/crawler-1-1574269039/spout-last-error Error:KeeperErrorCode = NodeExists for /storm/errors/crawler-1-1574269039/spout-last-error
7863 [ProcessThread(sid:0 cport:2000):] INFO  o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x100040091f2000f type:create cxid:0x106 zxid:0x4c txntype:-1 reqpath:n/a Error Path:/storm/errors/crawler-1-1574269039/spout-last-error Error:KeeperErrorCode = NodeExists for /storm/errors/crawler-1-1574269039/spout-last-error
7875 [ProcessThread(sid:0 cport:2000):] INFO  o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x100040091f2000f type:create cxid:0x111 zxid:0x50 txntype:-1 reqpath:n/a Error Path:/storm/errors/crawler-1-1574269039/spout-last-error Error:KeeperErrorCode = NodeExists for /storm/errors/crawler-1-1574269039/spout-last-error
7875 [ProcessThread(sid:0 cport:2000):] INFO  o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x100040091f2000f type:create cxid:0x112 zxid:0x51 txntype:-1 reqpath:n/a Error Path:/storm/errors/crawler-1-1574269039/spout-last-error Error:KeeperErrorCode = NodeExists for /storm/errors/crawler-1-1574269039/spout-last-error
7875 [ProcessThread(sid:0 cport:2000):] INFO  o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x100040091f2000f type:create cxid:0x114 zxid:0x53 txntype:-1 reqpath:n/a Error Path:/storm/errors/crawler-1-1574269039/spout-last-error Error:KeeperErrorCode = NodeExists for /storm/errors/crawler-1-1574269039/spout-last-error
7886 [Thread-19-spout-executor[12 12]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$fn__10799$fn__10800.invoke(worker.clj:788) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.daemon.executor$mk_executor_data$fn__9997$fn__9998.invoke(executor.clj:281) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:494) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7886 [Thread-17-spout-executor[8 8]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$fn__10799$fn__10800.invoke(worker.clj:788) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.daemon.executor$mk_executor_data$fn__9997$fn__9998.invoke(executor.clj:281) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:494) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7886 [Thread-49-spout-executor[14 14]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$fn__10799$fn__10800.invoke(worker.clj:788) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.daemon.executor$mk_executor_data$fn__9997$fn__9998.invoke(executor.clj:281) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:494) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7892 [Thread-39-spout-executor[16 16]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$fn__10799$fn__10800.invoke(worker.clj:788) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.daemon.executor$mk_executor_data$fn__9997$fn__9998.invoke(executor.clj:281) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:494) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7892 [Thread-35-spout-executor[11 11]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$fn__10799$fn__10800.invoke(worker.clj:788) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.daemon.executor$mk_executor_data$fn__9997$fn__9998.invoke(executor.clj:281) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:494) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7893 [Thread-47-spout-executor[10 10]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$fn__10799$fn__10800.invoke(worker.clj:788) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.daemon.executor$mk_executor_data$fn__9997$fn__9998.invoke(executor.clj:281) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:494) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7895 [Thread-55-spout-executor[17 17]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$fn__10799$fn__10800.invoke(worker.clj:788) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.daemon.executor$mk_executor_data$fn__9997$fn__9998.invoke(executor.clj:281) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:494) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7895 [Thread-41-spout-executor[13 13]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$fn__10799$fn__10800.invoke(worker.clj:788) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.daemon.executor$mk_executor_data$fn__9997$fn__9998.invoke(executor.clj:281) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:494) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
7895 [Thread-53-spout-executor[15 15]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
        at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
        at org.apache.storm.daemon.worker$fn__10799$fn__10800.invoke(worker.clj:788) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.daemon.executor$mk_executor_data$fn__9997$fn__9998.invoke(executor.clj:281) [storm-core-1.2.3.jar:1.2.3]
        at org.apache.storm.util$async_loop$fn__624.invoke(util.clj:494) [storm-core-1.2.3.jar:1.2.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]

com.digitalpebble.stormcrawler.util.ConfUtils.getString(ConfUtils.java:74) ~[dev1-0.1.jar:?]

 72     public static String getString(Map<String, Object> conf, String key,
 73             String defaultValue) {
 74         return (String) Utils.get(conf, key, defaultValue);
 75     }
 76 

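This is probably caused by a mismatch between your StormCrawler version and the version of your ES configuration. In particular, _es.status.bucket.sort.field_ only recently started accepting multiple values, so an older SC build still reads it with ConfUtils.getString and the cast to String fails when the conf supplies a list.

Either use a version of the conf that is compatible with your SC version (here for 1.15), or check out SC from the master branch and compile it.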
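To make the failure mode concrete, here is a minimal sketch of why a list-valued setting breaks a string-only reader. It is not StormCrawler code: the class `ConfCastSketch` and both of its methods are hypothetical, the strict reader only mimics the cast shown at ConfUtils.java:74 above (assuming a lookup that falls back to the default when the value is null), and the two field names put into the map are placeholder values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of StormCrawler or Storm. */
final class ConfCastSketch {

    /** Mimics the unchecked cast at ConfUtils.java:74 (lookup with default is an assumption). */
    static String getStringStrict(Map<String, Object> conf, String key, String defaultValue) {
        Object value = conf.get(key);
        if (value == null) {
            value = defaultValue;
        }
        // Throws ClassCastException when the conf holds a list instead of a string.
        return (String) value;
    }

    /** Hypothetical tolerant reader: accepts either a single value or a collection. */
    static List<String> getStrings(Map<String, Object> conf, String key) {
        List<String> result = new ArrayList<>();
        Object value = conf.get(key);
        if (value == null) {
            return result;
        }
        if (value instanceof Collection) {
            for (Object item : (Collection<?>) value) {
                result.add(String.valueOf(item));
            }
        } else {
            result.add(value.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        // A newer conf file supplies a list here (placeholder values).
        conf.put("es.status.bucket.sort.field", Arrays.asList("nextFetchDate", "url"));

        try {
            getStringStrict(conf, "es.status.bucket.sort.field", "nextFetchDate");
        } catch (ClassCastException e) {
            System.out.println("strict reader failed: " + e.getClass().getName());
        }
        System.out.println("tolerant reader: " + getStrings(conf, "es.status.bucket.sort.field"));
    }
}
```

In the real topology the list arrives as a clojure.lang.PersistentVector rather than a plain Java list, but the cast fails for the same reason. The practical fix remains the one given above: keep the conf file and the SC version in step.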