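OrientDB: import CSV to document model
I am trying to import a CSV file into a document-model database in OrientDB using ETL.
As a newcomer, I don't know whether this is the right approach, and there isn't much documentation on the document model, but here is what I tried: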
{
  "config": {
    "log": "debug"
  },
  "begin": [],
  "source": {
    "file": {
      "path": "C:/Users/M/Desktop/files/lact.csv"
    }
  },
  "extractor": {
    "csv": {
      "separator": ",",
      "nullValue": "NULL"
    }
  },
  "transformers": [
    {
      "log": {}
    }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:../databases/Model_doc",
      "dbType": "document",
      "classes": [
        {
          "name": "Annotations"
        }
      ]
    }
  },
  "end": []
}
After the output showing the parsed file contents, I get this message:
[orientdb] DEBUG orientdb: found 0 documents in class 'null'
The CSV file:
"Entry","Entry_name","Status","Protein_names","Gene_names","Organism","Length","Cross_reference(STRING)"
"Q29836","1B67_HUMAN","reviewed","HLA class I histocompatibility antigen, B-67 alpha chain (MHC class I antigen B*67)","HLA-B HLAB","Homo sapiens (Human)","362","9606.ENSP00000399168;"
"P30501","1C02_HUMAN","reviewed","HLA class I histocompatibility antigen, Cw-2 alpha chain (MHC class I antigen Cw*2)","HLA-C HLAC","Homo sapiens (Human)","366",""
"P30508","1C12_HUMAN","reviewed","HLA class I histocompatibility antigen, Cw-12 alpha chain (MHC class I antigen Cw*12)","HLA-C HLAC","Homo sapiens (Human)","366",""
"Q29960","1C16_HUMAN","reviewed","HLA class I histocompatibility antigen, Cw-16 alpha chain (MHC class I antigen Cw*16)","HLA-C HLAC","Homo sapiens (Human)","366",""
"Q29865","1C18_HUMAN","reviewed","HLA class I histocompatibility antigen, Cw-18 alpha chain (MHC class I antigen Cw*18)","HLA-C HLAC","Homo sapiens (Human)","366",""
You need to assign a class to the documents: add a field transformer to the chain, right after the log transformer:
"transformers": [
  {
    "log": {}
  },
  {
    "field": {
      "fieldName": "@class",
      "value": "Annotations"
    }
  }
],
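To make the change concrete, here is a minimal Python sketch that takes the configuration from the question and appends the field transformer after the log transformer. The helper name `add_class_transformer` is hypothetical and only used for illustration; the script just builds and prints JSON, it does not run the ETL tool.

```python
import json


def add_class_transformer(config, class_name):
    """Return a copy of an ETL config with a field transformer that sets @class."""
    # Deep copy via a JSON round trip so the original config is left untouched.
    patched = json.loads(json.dumps(config))
    # Append after the existing transformers (here: after "log").
    patched.setdefault("transformers", []).append(
        {"field": {"fieldName": "@class", "value": class_name}}
    )
    return patched


# Configuration from the question, reduced to the parts that matter here.
question_config = {
    "config": {"log": "debug"},
    "source": {"file": {"path": "C:/Users/M/Desktop/files/lact.csv"}},
    "extractor": {"csv": {"separator": ",", "nullValue": "NULL"}},
    "transformers": [{"log": {}}],
    "loader": {
        "orientdb": {
            "dbURL": "plocal:../databases/Model_doc",
            "dbType": "document",
            "classes": [{"name": "Annotations"}],
        }
    },
}

patched_config = add_class_transformer(question_config, "Annotations")
print(json.dumps(patched_config, indent=2))
```

The printed JSON can be saved to a file and passed to the ETL tool in place of the original configuration.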
I tried your code and got the same message:
[orientdb] DEBUG orientdb: found 0 documents in class 'null'
However, I was able to import all the data, as you can see from my screenshot.
As @RobertoFranchini said, you have to add the following:
"transformers": [
  {
    "log": {}
  },
  {
    "field": {
      "fieldName": "@class",
      "value": "Annotations"
    }
  }
],
I made this small change to your CSV file (I removed the double quotes and the trailing semicolon):
Entry,Entry_name,Status,Protein_names,Gene_names,Organism,Length,Cross_reference(STRING)
Q29836,1B67_HUMAN,reviewed,HLA class I histocompatibility antigen, B-67 alpha chain (MHC class I antigen B*67),HLA-B HLAB,Homo sapiens (Human),362,9606.ENSP00000399168
P30501,1C02_HUMAN,reviewed,HLA class I histocompatibility antigen, Cw-2 alpha chain (MHC class I antigen Cw*2),HLA-C HLAC,Homo sapiens (Human),366,
P30508,1C12_HUMAN,reviewed,HLA class I histocompatibility antigen, Cw-12 alpha chain (MHC class I antigen Cw*12),HLA-C HLAC,Homo sapiens (Human),366,
Q29960,1C16_HUMAN,reviewed,HLA class I histocompatibility antigen, Cw-16 alpha chain (MHC class I antigen Cw*16),HLA-C HLAC,Homo sapiens (Human),366,
Q29865,1C18_HUMAN,reviewed,HLA class I histocompatibility antigen, Cw-18 alpha chain (MHC class I antigen Cw*18),HLA-C HLAC,Homo sapiens (Human),366,
All the data was imported.
Hope this helps.
Regards.
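One thing worth checking after the import is whether the columns still line up, because the Protein_names values contain a comma. The sketch below is only a sanity check with Python's standard `csv` module, not with OrientDB's CSV extractor, which may split lines differently. It counts the fields in the first data row of the original quoted file and of the unquoted variant; the helper name `count_fields` is hypothetical.

```python
import csv
import io

# First data row as it appears in the original (quoted) file.
quoted = (
    '"Q29836","1B67_HUMAN","reviewed",'
    '"HLA class I histocompatibility antigen, B-67 alpha chain (MHC class I antigen B*67)",'
    '"HLA-B HLAB","Homo sapiens (Human)","362","9606.ENSP00000399168;"'
)

# The same row with the quotes and the trailing semicolon removed.
unquoted = (
    "Q29836,1B67_HUMAN,reviewed,"
    "HLA class I histocompatibility antigen, B-67 alpha chain (MHC class I antigen B*67),"
    "HLA-B HLAB,Homo sapiens (Human),362,9606.ENSP00000399168"
)


def count_fields(line):
    """Parse one CSV line with the default dialect and return the number of fields."""
    return len(next(csv.reader(io.StringIO(line))))


quoted_fields = count_fields(quoted)
unquoted_fields = count_fields(unquoted)
print("quoted:", quoted_fields, "unquoted:", unquoted_fields)
```

The header has 8 columns. If the two counts differ, the comma inside Protein_names is being treated as a separator in the unquoted file, so it is worth inspecting a few imported documents to confirm each value landed in the intended field.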