Elasticsearch file mapping using fscrawler and searching documents with NEST in C#
I indexed the documents in the folder "/tmp/es" using fscrawler 2.3-SNAPSHOT. It mapped them as:
{
  "properties" : {
    "attachment" : {
      "type" : "binary",
      "doc_values": false
    },
    "attributes" : {
      "properties" : {
        "group" : {
          "type" : "keyword"
        },
        "owner" : {
          "type" : "keyword"
        }
      }
    },
    "content" : {
      "type" : "text"
    },
    "file" : {
      "properties" : {
        "content_type" : {
          "type" : "keyword"
        },
        "filename" : {
          "type" : "keyword"
        },
        "extension" : {
          "type" : "keyword"
        },
        "filesize" : {
          "type" : "long"
        },
        "indexed_chars" : {
          "type" : "long"
        },
        "indexing_date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "last_modified" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "checksum": {
          "type": "keyword"
        },
        "url" : {
          "type" : "keyword",
          "index" : true
        }
      }
    },
    "object" : {
      "type" : "object"
    },
    "meta" : {
      "properties" : {
        "author" : {
          "type" : "text"
        },
        "date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "keywords" : {
          "type" : "text"
        },
        "title" : {
          "type" : "text"
        },
        "language" : {
          "type" : "keyword"
        }
      }
    },
    "path" : {
      "properties" : {
        "encoded" : {
          "type" : "keyword"
        },
        "real" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        },
        "root" : {
          "type" : "keyword"
        },
        "virtual" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        }
      }
    }
  }
}
Now I want to search them from my C# application using NEST. I can get the content through hit.Source.content, but I cannot get the file name through hit.Source.filename...
Code:
var response = elasticClient.Search<documents>(s => s
    .Index("tanks")
    .Type("doc")
    .Query(q => q.QueryString(qs => qs.Query(query))));
if (rtxSearchResult.Text != " ")
{
    rtxSearchResult.Text = " ";
    foreach (var hit in response.Hits)
    {
        // filename and url come back null here, so ToString() throws.
        rtxSearchResult.Text = rtxSearchResult.Text + ("Name: " + hit.Source.filename.ToString()
            + Environment.NewLine
            + "Content: " + hit.Source.content.ToString()
            + Environment.NewLine
            + "URL: " + hit.Source.url.ToString()
            + Environment.NewLine
            + Environment.NewLine);
    }
}
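The code above throws a NullReferenceException, but it runs when I comment out the lines that use hit.Source.url and hit.Source.filename.
Kibana shows the filename field as file.filename, the url as file.url, and the content as content.
Since filename is nested under file, I am unable to retrieve it. Please help, I have been stuck on this for a few days.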
Found the error:
My documents class was:
class documents
{
    public string filename { get; set; }
    public string content { get; set; }
    public string url { get; set; }
}
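Since filename and url are actually stored as file.filename and file.url, we need a second class, File, that holds filename and url, and the documents class must expose it through a property named file.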
class documents
{
    public File file { get; set; }
    public string content { get; set; }
}
class File
{
    public string filename { get; set; }
    public string url { get; set; }
}
With that change I can access them through hit.Source.file.filename and hit.Source.file.url.
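To illustrate why the flat class returned nulls, here is a minimal, self-contained sketch that deserializes a JSON document shaped like the mapping above into both class layouts. It uses System.Text.Json as a stand-in, not NEST's own serializer, so it only demonstrates the general rule that the C# object graph has to mirror the JSON nesting. The class names FlatDocument, NestedDocument and FilePart, the SampleSource constant, and the sample values are all made up for this example.

```csharp
using System;
using System.Text.Json;

// Flat layout, like the original documents class (hypothetical name).
public class FlatDocument
{
    public string filename { get; set; }
    public string content { get; set; }
    public string url { get; set; }
}

// Nested layout, like the corrected classes (hypothetical names).
public class FilePart
{
    public string filename { get; set; }
    public string url { get; set; }
}

public class NestedDocument
{
    public FilePart file { get; set; }
    public string content { get; set; }
}

public static class Program
{
    // Made-up sample shaped like a _source produced by the mapping above.
    public const string SampleSource = @"{
        ""content"": ""hello world"",
        ""file"": {
            ""filename"": ""a.txt"",
            ""url"": ""file:///tmp/es/a.txt""
        }
    }";

    public static void Main()
    {
        var flat = JsonSerializer.Deserialize<FlatDocument>(SampleSource);
        var nested = JsonSerializer.Deserialize<NestedDocument>(SampleSource);

        // No top-level "filename" key exists, so the flat property stays null.
        Console.WriteLine(flat.filename == null);

        // The nested layout follows the JSON structure, so the values are populated.
        Console.WriteLine(nested.file.filename);
        Console.WriteLine(nested.file.url);
    }
}
```

In the display loop it may also be worth guarding against documents that have no file object at all, for example with hit.Source.file?.filename, so that one incomplete hit does not abort the whole listing.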