Create JSON file for custom entity types in Watson Knowledge Studio
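I'm trying to upload a set of custom entity types and subtypes to a WKS instance.
This is the view of the WKS interface in the section where you can define entities and subentities.
The upload button asks for a JSON file.
I previously created a set by hand and downloaded its JSON file.
Its first line looks like this: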
{"entityTypes":[{**"id":"78361798-b77e-4728-9b6a-f56539c12bcd"**,"label":"Calificativo","sireProp":{"mentionType":null,"subtypes":["Bueno_extremo","Bueno_moderado","Regular","Malo_moderado","Malo_extremo"],"roles":["78361798-b77e-4728-9b6a-f56539c12bcd"],"clazz":null,"color":null,"hotkey":null,"backGroundColor":null,"active":true,"roleOnly":false},"creationDate":1583241330349,"source":null,"modifiedDate":0,"typeType":null,"typeClass":null,"typeVersion":null,"typeDesc":null,"typeSuperType":null,"typeSuperTypeId":null,"typeCreateDate":null,"typeUpdateDate":null,"typeProvenance":null,"alchemyAPITypes":null,"nluAPITypes":null},{**"id":"daecb92b-0ce7-4a47-942a-68b50d0cb2fd"**,"label":"TV","sireProp":{"mentionType":null,"subtypes":["Decodificador","Servicio_de_tv"],"roles":
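Overall the structure of the content is clear, but both the entity set and the content set have an ID.
I'd like to know whether there is a way to know these IDs in advance, or to generate them, so that I can build the whole JSON with the types and subtypes I want and then upload it.
I tried using "" in place of the IDs, but I got an error message and the upload was not allowed.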
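According to the documentation, WKS does not support importing custom JSON files, that is, files other than the ones exported from a WKS workspace. However, as far as I know, a UUID may be a valid value for the id
field, and one can be generated with the following bash command: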
$ uuidgen | tr '[:upper:]' '[:lower:]'
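Where uuidgen is not available, the same kind of value can be produced with Python's standard library. This is a minimal sketch: `new_id` is a hypothetical helper name, and it assumes random (version 4) UUIDs are what WKS expects, which is consistent with the exported ids in the question (their third group starts with 4) but is not confirmed by the documentation.

```python
import uuid


def new_id():
    """Return a random (version 4) UUID as a lowercase, hyphenated string."""
    return str(uuid.uuid4())


example = new_id()
# 36 characters in the 8-4-4-4-12 layout, like the ids in the exported file
print(example)
```

`str()` on a `uuid.UUID` object already yields lowercase hex digits, so no extra lowercasing step is needed.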
This Python script generates a JSON file in a format that WKS understands:
import uuid
import json

# Generate IDs: ent_id is used for "pid", lbl01_id for the entity type
ent_id, lbl01_id = uuid.uuid4(), uuid.uuid4()

json_out = {}
json_out.update({
    "entityTypes": [{
        "id": str(lbl01_id), "label": "Calificativo",
        "sireProp": {
            "mentionType": None,
            "subtypes": ["Bueno_extremo", "Bueno_moderado", "Regular", "Malo_moderado", "Malo_extremo"],
            "roles": [str(lbl01_id)], "clazz": None,  # Roles relate to self & other labels, if any
            "color": None, "hotkey": None, "backGroundColor": None, "active": True, "roleOnly": False
        },
        "creationDate": 1583241330349, "source": None, "modifiedDate": 1583247016579, "typeType": None,
        "typeClass": None, "typeVersion": None, "typeDesc": None, "typeSuperType": None, "typeSuperTypeId": None,
        "typeCreateDate": None, "typeUpdateDate": None, "typeProvenance": None, "alchemyAPITypes": None,
        "nluAPITypes": None
    }],
    "sireInfo": {
        "entityProp": {
            "mentionType": [{"color": "white", "hotkey": "1", "backGroundColor": "#AA00FF", "name": "NAM"},
                            {"color": "black", "hotkey": "2", "backGroundColor": "#00FF7F", "name": "NOM"},
                            {"color": "black", "hotkey": "3", "backGroundColor": "#AAFFFF", "name": "PRO"},
                            {"color": "white", "hotkey": "4", "backGroundColor": "gray", "name": "NONE"}],
            "subtypes": None,
            "roles": None,
            "clazz": [{"color": "#A5A5A5", "hotkey": "3", "backGroundColor": "white", "name": "SPC"},
                      {"color": "black", "hotkey": "2", "backGroundColor": "#00FF7F", "name": "NEG"},
                      {"color": "black", "hotkey": "1", "backGroundColor": "#AAFFFF", "name": "GEN"}],
            "color": None,
            "hotkey": None,
            "backGroundColor": None,
            "active": True,
            "roleOnly": False
        },
        "relationProp": {
            "tense": [{"name": "PAST"}, {"name": "PRESENT"}, {"name": "FUTURE"}, {"name": "UNSPECIFIED"}],
            "modality": [{"name": "ASSERTED"}, {"name": "OTHER"}],
            "clazz": [{"name": "SPECIFIC"}, {"name": "NEG"}, {"name": "OTHER"}],
            "backGroundColor": None, "color": None, "hotkey": None, "active": True}
    },
    "functionalEntityTypes": [
        {"id": "CATCH_ALL_ENTITY_ID", "label": "*",
         "sireProp": {
             "mentionType": None, "subtypes": None, "roles": None, "clazz": None, "color": None,
             "hotkey": None, "backGroundColor": None, "active": True, "roleOnly": False},
         "creationDate": 1487227572757, "source": None, "modifiedDate": 0, "typeType": None,
         "typeClass": None, "typeVersion": None, "typeDesc": None, "typeSuperType": None,
         "typeSuperTypeId": None, "typeCreateDate": None, "typeUpdateDate": None, "typeProvenance": None,
         "alchemyAPITypes": None, "nluAPITypes": None
         }],
    "pid": str(ent_id), "modified_date": 1583247016579, "kgimported": False
})

# json.dump writes None/True/False as JSON null/true/false
with open('json_file.json', 'w') as outfile:
    json.dump(json_out, outfile)
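This generates only one entity; to generate more, repeat the block from "id" through "nluAPITypes" once for each entity you want to add.
"relationshipTypes" can also be included here.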
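Repeating that block by hand gets tedious with many types, so the repetition can be wrapped in a small function. The following is only a sketch: `make_entity_type` and `wanted` are hypothetical names, the field layout is copied from the export shown in the question, and it assumes that each type needs its own UUID and that the date fields are epoch milliseconds. None of this is confirmed by the WKS documentation.

```python
import json
import time
import uuid


def make_entity_type(label, subtypes, now_ms):
    """Build one entry for the "entityTypes" list with a fresh lowercase UUID."""
    type_id = str(uuid.uuid4())
    return {
        "id": type_id,
        "label": label,
        "sireProp": {
            "mentionType": None,
            "subtypes": list(subtypes),
            "roles": [type_id],  # each type lists its own id as a role, as in the export
            "clazz": None, "color": None, "hotkey": None,
            "backGroundColor": None, "active": True, "roleOnly": False,
        },
        "creationDate": now_ms, "source": None, "modifiedDate": now_ms,
        "typeType": None, "typeClass": None, "typeVersion": None,
        "typeDesc": None, "typeSuperType": None, "typeSuperTypeId": None,
        "typeCreateDate": None, "typeUpdateDate": None, "typeProvenance": None,
        "alchemyAPITypes": None, "nluAPITypes": None,
    }


# Labels and subtypes taken from the exported file shown in the question.
wanted = {
    "Calificativo": ["Bueno_extremo", "Bueno_moderado", "Regular",
                     "Malo_moderado", "Malo_extremo"],
    "TV": ["Decodificador", "Servicio_de_tv"],
}

now_ms = int(time.time() * 1000)  # assumption: timestamps are epoch milliseconds
entity_types = [make_entity_type(label, subs, now_ms)
                for label, subs in wanted.items()]

# Inspect the result; in the script above this list would replace the
# hand-written value of "entityTypes" before json.dump is called.
print(json.dumps(entity_types, indent=2))
```

Because each call draws a new UUID, the ids change on every run; if a file has to be regenerated for the same workspace, it may be safer to store the generated ids and reuse them.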