使用 neo4j-import 工具导入 40M+ 数据时出错
Error on importing 40M+ data using neo4j-import tool
我使用neo4j-import
导入40M节点,下面是我的shell:
[luning@pinnacle bin]$ ./neo4j-import --into ../data/weibo.db --nodes:User "/data/weibo/user-header.csv,/data/weibo/users/000000_0.csv,/data/weibo/users/000001_0.csv,/data/weibo/users/000002_0.csv,/data/weibo/users/000003_0.csv,/data/weibo/users/000004_0.csv,/data/weibo/users/000005_0.csv,/data/weibo/users/000006_0.csv,/data/weibo/users/000007_0.csv,/data/weibo/users/000008_0.csv,/data/weibo/users/000009_0.csv,/data/weibo/users/000010_0.csv,/data/weibo/users/000011_0.csv,/data/weibo/users/000012_0.csv,/data/weibo/users/000013_0.csv,/data/weibo/users/000014_0.csv,/data/weibo/users/000015_0.csv,/data/weibo/users/000016_0.csv,/data/weibo/users/000017_0.csv,/data/weibo/users/000018_0.csv,/data/weibo/users/000019_0.csv,/data/weibo/users/000020_0.csv,/data/weibo/users/000021_0.csv,/data/weibo/users/000022_0.csv,/data/weibo/users/000023_1.csv,/data/weibo/users/000024_0.csv,/data/weibo/users/000025_0.csv" --delimiter "TAB"
Nodes
[*>:87.20 MB/s---------------------------|PROPERTIES(2)===============|NOD|v:227.03 MB/s(2)====] 48MImport error: Panic called, so exiting
Neo4j Import Tool
neo4j-import is used to create a new Neo4j database from data in CSV files. See
the chapter "Import Tool" in the Neo4j Manual for details on the CSV file format
- a special kind of header is required.
Usage:
--into <store-dir>
Database directory to import into. Must not contain existing database.
--nodes [:Label1:Label2] "<file1>,<file2>,..."
Node CSV header and data. Multiple files will be logically seen as one big file
from the perspective of the importer. The first line must contain the header.
Multiple data sources like these can be specified in one import, where each data
source has its own header. Note that file groups must be enclosed in quotation
marks.
--relationships [:RELATIONSHIP_TYPE] "<file1>,<file2>,..."
Relationship CSV header and data. Multiple files will be logically seen as one
big file from the perspective of the importer. The first line must contain the
header. Multiple data sources like these can be specified in one import, where
each data source has its own header. Note that file groups must be enclosed in
quotation marks.
--delimiter <delimiter-character>
Delimiter character, or 'TAB', between values in CSV data. The default option is
,.
--array-delimiter <array-delimiter-character>
Delimiter character, or 'TAB', between array elements within a value in CSV
我检查了他们的模式。他们都是一致的。它显示
Import error: Panic called, so exiting
有人知道怎么解决吗?
下面是我的堆栈跟踪:
java.lang.RuntimeException: Panic called, so exiting
at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stillExecuting(StageExecution.java:63)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.anyStillExecuting(ExecutionSupervisor.java:79)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.finishAwareSleep(ExecutionSupervisor.java:102)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:64)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseDynamicExecution(ExecutionSupervisors.java:65)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:226)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:151)
at org.neo4j.tooling.ImportTool.main(ImportTool.java:263)
Caused by: java.lang.RuntimeException: Panic called, so exiting
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.assertHealthy(AbstractStep.java:189)
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.await(AbstractStep.java:180)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep.receive(ExecutorServiceStep.java:82)
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.sendDownstream(AbstractStep.java:226)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep.call(ExecutorServiceStep.java:103)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep.call(ExecutorServiceStep.java:87)
at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:217)
Caused by: java.lang.RuntimeException: Panic called, so exiting
... 7 more
Caused by: java.lang.RuntimeException: Panic called, so exiting
... 7 more
Caused by: org.neo4j.unsafe.impl.batchimport.input.InputException: ERROR in input
data source: BufferedCharSeeker[buffer:org.neo4j.csv.reader.SectionedCharBuffer@4ac5af5c, seekPos:2764030, line:2882236]
in field: descriptions:string:4
for header: [id:string, screenname:string, locations:string, descriptions:string, :IGNORE, profileimageurl:string, gender:string, followerscount:string, friendscount:string, statusescount:string, favouritescount:string, verified:string, verifiedreason:string, :IGNORE, :IGNORE, :IGNORE, :IGNORE, :IGNORE, :IGNORE, :IGNORE, darenint:string, :IGNORE, :IGNORE, updateddate:string]
raw field value: 6:19:
original error: Tried to read in a value larger than effective buffer size 8388608
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:152)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:42)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.helpers.collection.NestingIterator.fetchNextOrNull(NestingIterator.java:61)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.unsafe.impl.batchimport.staging.IteratorBatcherStep.nextBatchOrNull(IteratorBatcherStep.java:54)
at org.neo4j.unsafe.impl.batchimport.staging.InputIteratorBatcherStep.nextBatchOrNull(InputIteratorBatcherStep.java:42)
at org.neo4j.unsafe.impl.batchimport.staging.ProducerStep.process(ProducerStep.java:73)
at org.neo4j.unsafe.impl.batchimport.staging.ProducerStep.run(ProducerStep.java:54)
Caused by: java.lang.IllegalStateException: Tried to read in a value larger than effective buffer size 8388608
at org.neo4j.csv.reader.BufferedCharSeeker.fillBufferIfWeHaveExhaustedIt(BufferedCharSeeker.java:258)
at org.neo4j.csv.reader.BufferedCharSeeker.nextChar(BufferedCharSeeker.java:231)
at org.neo4j.csv.reader.BufferedCharSeeker.seek(BufferedCharSeeker.java:109)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:81)
... 10 more
其中一个字段可能有一个未结束该引号的引号...因此 CSV 解析器将读取和读取直到找到下一个引号。你不太可能在那里有一个 8M 大的字段,所以这就是我的想法。
当我尝试使用此方法将数据导入图表时,我也遇到了 "Import error: Executor has been shut down" 和 "Import error: Panic called, so exiting" 错误。
当我收到这些错误时,我的数据没有引号字符(" 和 ')。
解决我问题的方法是删除所有其他特殊字符。
我可能遗漏了文档中的某些内容,因为我认为节点属性中的所有文本都将作为字符串读入。结果 neo4j-import
不喜欢“&”和“/”这样的字符!
当我编辑我的数据(是的 sed
!)以仅包含字母数字字符时,导入工具运行完美。
我遇到了同样的错误并删除了特殊字符,例如“*”、“&”、“/”,但保留单引号足以消除错误。
我使用neo4j-import
导入40M节点,下面是我的shell:
[luning@pinnacle bin]$ ./neo4j-import --into ../data/weibo.db --nodes:User "/data/weibo/user-header.csv,/data/weibo/users/000000_0.csv,/data/weibo/users/000001_0.csv,/data/weibo/users/000002_0.csv,/data/weibo/users/000003_0.csv,/data/weibo/users/000004_0.csv,/data/weibo/users/000005_0.csv,/data/weibo/users/000006_0.csv,/data/weibo/users/000007_0.csv,/data/weibo/users/000008_0.csv,/data/weibo/users/000009_0.csv,/data/weibo/users/000010_0.csv,/data/weibo/users/000011_0.csv,/data/weibo/users/000012_0.csv,/data/weibo/users/000013_0.csv,/data/weibo/users/000014_0.csv,/data/weibo/users/000015_0.csv,/data/weibo/users/000016_0.csv,/data/weibo/users/000017_0.csv,/data/weibo/users/000018_0.csv,/data/weibo/users/000019_0.csv,/data/weibo/users/000020_0.csv,/data/weibo/users/000021_0.csv,/data/weibo/users/000022_0.csv,/data/weibo/users/000023_1.csv,/data/weibo/users/000024_0.csv,/data/weibo/users/000025_0.csv" --delimiter "TAB"
Nodes
[*>:87.20 MB/s---------------------------|PROPERTIES(2)===============|NOD|v:227.03 MB/s(2)====] 48MImport error: Panic called, so exiting
Neo4j Import Tool
neo4j-import is used to create a new Neo4j database from data in CSV files. See
the chapter "Import Tool" in the Neo4j Manual for details on the CSV file format
- a special kind of header is required.
Usage:
--into <store-dir>
Database directory to import into. Must not contain existing database.
--nodes [:Label1:Label2] "<file1>,<file2>,..."
Node CSV header and data. Multiple files will be logically seen as one big file
from the perspective of the importer. The first line must contain the header.
Multiple data sources like these can be specified in one import, where each data
source has its own header. Note that file groups must be enclosed in quotation
marks.
--relationships [:RELATIONSHIP_TYPE] "<file1>,<file2>,..."
Relationship CSV header and data. Multiple files will be logically seen as one
big file from the perspective of the importer. The first line must contain the
header. Multiple data sources like these can be specified in one import, where
each data source has its own header. Note that file groups must be enclosed in
quotation marks.
--delimiter <delimiter-character>
Delimiter character, or 'TAB', between values in CSV data. The default option is
,.
--array-delimiter <array-delimiter-character>
Delimiter character, or 'TAB', between array elements within a value in CSV
我检查了他们的模式。他们都是一致的。它显示
Import error: Panic called, so exiting
有人知道怎么解决吗?
下面是我的堆栈跟踪:
java.lang.RuntimeException: Panic called, so exiting
at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stillExecuting(StageExecution.java:63)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.anyStillExecuting(ExecutionSupervisor.java:79)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.finishAwareSleep(ExecutionSupervisor.java:102)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:64)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseDynamicExecution(ExecutionSupervisors.java:65)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:226)
at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:151)
at org.neo4j.tooling.ImportTool.main(ImportTool.java:263)
Caused by: java.lang.RuntimeException: Panic called, so exiting
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.assertHealthy(AbstractStep.java:189)
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.await(AbstractStep.java:180)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep.receive(ExecutorServiceStep.java:82)
at org.neo4j.unsafe.impl.batchimport.staging.AbstractStep.sendDownstream(AbstractStep.java:226)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep.call(ExecutorServiceStep.java:103)
at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep.call(ExecutorServiceStep.java:87)
at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:217)
Caused by: java.lang.RuntimeException: Panic called, so exiting
... 7 more
Caused by: java.lang.RuntimeException: Panic called, so exiting
... 7 more
Caused by: org.neo4j.unsafe.impl.batchimport.input.InputException: ERROR in input
data source: BufferedCharSeeker[buffer:org.neo4j.csv.reader.SectionedCharBuffer@4ac5af5c, seekPos:2764030, line:2882236]
in field: descriptions:string:4
for header: [id:string, screenname:string, locations:string, descriptions:string, :IGNORE, profileimageurl:string, gender:string, followerscount:string, friendscount:string, statusescount:string, favouritescount:string, verified:string, verifiedreason:string, :IGNORE, :IGNORE, :IGNORE, :IGNORE, :IGNORE, :IGNORE, :IGNORE, darenint:string, :IGNORE, :IGNORE, updateddate:string]
raw field value: 6:19:
original error: Tried to read in a value larger than effective buffer size 8388608
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:152)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:42)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.helpers.collection.NestingIterator.fetchNextOrNull(NestingIterator.java:61)
at org.neo4j.helpers.collection.PrefetchingIterator.peek(PrefetchingIterator.java:60)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:46)
at org.neo4j.unsafe.impl.batchimport.staging.IteratorBatcherStep.nextBatchOrNull(IteratorBatcherStep.java:54)
at org.neo4j.unsafe.impl.batchimport.staging.InputIteratorBatcherStep.nextBatchOrNull(InputIteratorBatcherStep.java:42)
at org.neo4j.unsafe.impl.batchimport.staging.ProducerStep.process(ProducerStep.java:73)
at org.neo4j.unsafe.impl.batchimport.staging.ProducerStep.run(ProducerStep.java:54)
Caused by: java.lang.IllegalStateException: Tried to read in a value larger than effective buffer size 8388608
at org.neo4j.csv.reader.BufferedCharSeeker.fillBufferIfWeHaveExhaustedIt(BufferedCharSeeker.java:258)
at org.neo4j.csv.reader.BufferedCharSeeker.nextChar(BufferedCharSeeker.java:231)
at org.neo4j.csv.reader.BufferedCharSeeker.seek(BufferedCharSeeker.java:109)
at org.neo4j.unsafe.impl.batchimport.input.csv.InputEntityDeserializer.fetchNextOrNull(InputEntityDeserializer.java:81)
... 10 more
其中一个字段可能有一个未结束该引号的引号...因此 CSV 解析器将读取和读取直到找到下一个引号。你不太可能在那里有一个 8M 大的字段,所以这就是我的想法。
当我尝试使用此方法将数据导入图表时,我也遇到了 "Import error: Executor has been shut down" 和 "Import error: Panic called, so exiting" 错误。
当我收到这些错误时,我的数据没有引号字符(" 和 ')。
解决我问题的方法是删除所有其他特殊字符。
我可能遗漏了文档中的某些内容,因为我认为节点属性中的所有文本都将作为字符串读入。结果 neo4j-import
不喜欢“&”和“/”这样的字符!
当我编辑我的数据(是的 sed
!)以仅包含字母数字字符时,导入工具运行完美。
我遇到了同样的错误并删除了特殊字符,例如“*”、“&”、“/”,但保留单引号足以消除错误。