Import Google API JSON file to Elasticsearch
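I'm completely new to the ELK stack, and to ES in particular.
I'm trying to import a JSON file that I obtained through the Google Admin SDK API into Elasticsearch.
So far, this is the JSON structure of my data: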
{
  "kind": "reports#activities",
  "nextPageToken": string,
  "items": [
    {
      "kind": "audit#activity",
      "id": {
        "time": datetime,
        "uniqueQualifier": long,
        "applicationName": string,
        "customerId": string
      },
      "actor": {
        "callerType": string,
        "email": string,
        "profileId": long,
        "key": string
      },
      "ownerDomain": string,
      "ipAddress": string,
      "events": [
        {
          "type": string,
          "name": string,
          "parameters": [
            {
              "name": string,
              "value": string,
              "intValue": long,
              "boolValue": boolean
            }
          ]
        }
      ]
    }
  ]
}
So I first tried to upload the JSON file to ES with this command:
curl -s -XPOST 'localhost:9200/_bulk' --data-binary @documents.json
But I got an error:
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [START_ARRAY]"}],"type":"illegal_argument_exception","reason":"Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [START_ARRAY]"},"status":400}
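What should I do?
Thanks for your help!
The JSON seems to define the structure of your documents, so you first need to create an index with a mapping that matches that structure. In your case, you could do it like this: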
curl -XPUT localhost:9200/reports -d '{
  "mappings": {
    "activity": {
      "properties": {
        "kind": { "type": "string" },
        "nextPageToken": { "type": "string" },
        "items": {
          "properties": {
            "kind": { "type": "string" },
            "id": {
              "properties": {
                "time": { "type": "date", "format": "date_time" },
                "uniqueQualifier": { "type": "long" },
                "applicationName": { "type": "string" },
                "customerId": { "type": "string" }
              }
            },
            "actor": {
              "properties": {
                "callerType": { "type": "string" },
                "email": { "type": "string" },
                "profileId": { "type": "long" },
                "key": { "type": "string" }
              }
            },
            "ownerDomain": { "type": "string" },
            "ipAddress": { "type": "string" },
            "events": {
              "properties": {
                "type": { "type": "string" },
                "name": { "type": "string" },
                "parameters": {
                  "properties": {
                    "name": { "type": "string" },
                    "value": { "type": "string" },
                    "intValue": { "type": "long" },
                    "boolValue": { "type": "boolean" }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}'
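Once that's done, you can use the bulk call to index reports#activities documents that follow the structure above. The syntax of the bulk call is precisely defined in the bulk API documentation: you need an action line (what to do), followed on the next line by the document source (what to index), which must not contain any newlines!
So you need to reformat your documents.json file like this (make sure to add a newline after the second line). Also note that I added some dummy data to illustrate the process: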
{"index": {"_index": "reports", "_type": "activity"}}
{"kind":"reports#activities","nextPageToken":"string","items":[{"kind":"audit#activity","id":{"time":"2016-05-31T00:00:00.000Z","uniqueQualifier":1,"applicationName":"string","customerId":"string"},"actor":{"callerType":"string","email":"string","profileId":1,"key":"string"},"ownerDomain":"string","ipAddress":"string","events":[{"type":"string","name":"string","parameters":[{"name":"string","value":"string","intValue":1,"boolValue":true}]}]}]}
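If the file is too large to reformat by hand, the conversion can be scripted. The following is only a minimal sketch in Python, assuming documents.json holds either a single reports#activities object or an array of them (the START_ARRAY in the error suggests the latter). The function names `to_bulk_lines` and `convert_file` and the output file name bulk.json are hypothetical, chosen for illustration; the index name `reports` and type `activity` are taken from the mapping above.

```python
import json


def to_bulk_lines(documents, index="reports", doc_type="activity"):
    """Build a bulk body: one action line plus one source line per document."""
    # Accept a single response object as well as a list of them.
    if isinstance(documents, dict):
        documents = [documents]

    # The same action line is repeated before every document source.
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})

    lines = []
    for doc in documents:
        lines.append(action)
        # json.dumps without indent emits the document on one line;
        # newlines inside string values are escaped as \n.
        lines.append(json.dumps(doc, separators=(",", ":")))

    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


def convert_file(src_path, dst_path):
    """Read the original JSON file and write it out in bulk format."""
    with open(src_path, encoding="utf-8") as src:
        documents = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(to_bulk_lines(documents))
```

Calling `convert_file("documents.json", "bulk.json")` would write the reformatted file, which can then be sent with the same curl command as in the question, using `--data-binary @bulk.json`.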