How do I make Cypher respect character encoding when using LOAD CSV in browser?
My case: a class list of students with Danish names (the names include characters such as ü, æ, ø, å). Minimal working example
CSV file:
Fornavn;Efternavn;Mobil;Adresse
Øjvind;Ørnenæb;87654321;Paradisæblevej 125, 5610 Åkirkeby
Süzette;Ågård;12345678;Ærøvej 123, 2000 Frederiksberg
In-browser neo4j editor:
LOAD CSV WITH HEADERS FROM 'file:///path/to/file.csv' AS line FIELDTERMINATOR ";"
CREATE (:Elev {fornavn: line.Fornavn, efternavn: line.Efternavn, mobil: line.Mobil, adresse: line.Adresse})
This results in the following registrations:
[Screenshot: Neo4j browser showing ?-characters where Danish/German characters are wanted.]
My data come from a Learning Management System into Excel. When exporting as CSV from Excel, I can control the file encoding in the Save As dialogue box. I have tried exporting from Excel as "UTF-8" (which the Neo4j manual says it wants), "ISO-Western European", "Windows-Western European" and "Unicode", each in a separately named file, and adjusted the FROM 'file:///path/to/file.csv' clause accordingly.
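Interestingly, I get exactly the same misrepresented result no matter which (apparent?) file encoding I request from Excel when "Saving As". I do not have the same problem when I copy-paste the names and addresses directly into the editor.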
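Since every export gives the same result, it may help to check what the files actually contain before blaming LOAD CSV. The following is a minimal sketch in Python, not part of the original question: the helper name `matching_encodings` and the candidate list are assumptions made for illustration. It tries to decode the raw bytes under each candidate encoding and reports the ones under which a known name from the list shows up intact.

```python
# Hypothetical diagnostic helper (not from the question): find out under which
# encodings the raw bytes of a CSV export contain a known non-ASCII string.
CANDIDATES = ("utf-8", "cp1252", "utf-16")  # assumed candidates, adjust as needed


def matching_encodings(raw: bytes, expected: str, candidates=CANDIDATES) -> list[str]:
    """Return the candidate encodings under which `expected` appears in `raw`."""
    matches = []
    for name in candidates:
        try:
            text = raw.decode(name)
        except UnicodeError:
            # The bytes are not valid in this encoding at all.
            continue
        if expected in text:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # Simulate a "Windows-Western European" export of one data row.
    sample = "Øjvind;Ørnenæb;87654321".encode("cp1252")
    print(matching_encodings(sample, "Ørnenæb"))
```

For a real file, the bytes could be read with `pathlib.Path("file.csv").read_bytes()` and passed in the same way. If "utf-8" is not among the matches for the file that was saved as UTF-8, the export itself is the likely culprit; if it is, the problem is more likely on the reading side.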
Check Michael Hunger's blog post, which contains some tips, namely:
if you use non-ascii characters (umlauts, accents etc.) make sure to use the appropriate locale or provide the System property -Dfile.encoding=UTF8
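The quoted tip covers the reading side. The other half is making sure the file's bytes really are UTF-8. As a rough sketch, assuming the export is in fact Windows-1252 ("Windows-Western European"), the file could be re-encoded outside Excel. The function names and the default source encoding below are assumptions for illustration, not something the blog post prescribes.

```python
from pathlib import Path


def to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes in the assumed source encoding and re-encode as UTF-8 (no BOM)."""
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Write a UTF-8 copy of `src` to `dst`, leaving the original untouched."""
    dst.write_bytes(to_utf8(src.read_bytes(), source_encoding))


if __name__ == "__main__":
    row = "Süzette;Ågård;12345678;Ærøvej 123, 2000 Frederiksberg"
    converted = to_utf8(row.encode("cp1252"))
    print(converted.decode("utf-8"))
```

The converted copy is what the FROM 'file:///path/to/file.csv' clause would then point to. If the source encoding guess is wrong, the decode step either fails or produces the wrong letters, so it is worth inspecting a few names in the output first.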