Elasticsearch file mapping using FSCrawler and searching documents with NEST in C#

I indexed the documents in the folder "/tmp/es" using fscrawler 2.3-SNAPSHOT. It mapped them as:

{
  "properties" : {
    "attachment" : {
      "type" : "binary",
      "doc_values": false
    },
    "attributes" : {
      "properties" : {
        "group" : {
          "type" : "keyword"
        },
        "owner" : {
          "type" : "keyword"
        }
      }
    },
    "content" : {
      "type" : "text"
    },
    "file" : {
      "properties" : {
        "content_type" : {
          "type" : "keyword"
        },
        "filename" : {
          "type" : "keyword"
        },
        "extension" : {
          "type" : "keyword"
        },
        "filesize" : {
          "type" : "long"
        },
        "indexed_chars" : {
          "type" : "long"
        },
        "indexing_date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "last_modified" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "checksum": {
          "type": "keyword"
        },
        "url" : {
          "type" : "keyword",
          "index" : true
        }
      }
    },
    "object" : {
      "type" : "object"
    },
    "meta" : {
      "properties" : {
        "author" : {
          "type" : "text"
        },
        "date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "keywords" : {
          "type" : "text"
        },
        "title" : {
          "type" : "text"
        },
        "language" : {
          "type" : "keyword"
        }
      }
    },
    "path" : {
      "properties" : {
        "encoded" : {
          "type" : "keyword"
        },
        "real" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        },
        "root" : {
          "type" : "keyword"
        },
        "virtual" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        }
      }
    }
  }
}

Now I want to search them from my C# application using NEST. I can get the content through hit.Source.content, but I cannot get the file name in the same way.

Code:

var response = elasticClient.Search<documents>(s => s
    .Index("tanks")
    .Type("doc")
    .Query(q => q.QueryString(qs => qs.Query(query))));

if (rtxSearchResult.Text != " ")
{
    rtxSearchResult.Text = " ";

    foreach (var hit in response.Hits)
    {
        // With the flat documents class, filename and url come back null,
        // so calling ToString() on them throws.
        rtxSearchResult.Text = rtxSearchResult.Text + ("Name: " + hit.Source.filename.ToString()
            + Environment.NewLine
            + "Content: " + hit.Source.content.ToString()
            + Environment.NewLine
            + "URL: " + hit.Source.url.ToString()
            + Environment.NewLine
            + Environment.NewLine);
    }
}

The code above throws a NullReferenceException, but it runs when I comment out the lines that use hit.Source.url and hit.Source.filename.

Kibana shows the file name field as file.filename, the URL as file.url, and the content as content.

Since filename is nested under file, I cannot retrieve it. Please help, I have been stuck on this for a few days.

Found the error:

My documents class was:

class documents
{
    // Flat properties: only content matches a top-level field in the mapping.
    public string filename { get; set; }

    public string content { get; set; }

    public string url { get; set; }
}

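Since filename and url are actually stored as file.filename and file.url, we need a second class, File, that holds filename and url, and the documents class must expose it through a property named file.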

class documents
{
    // Mirrors the "file" object in the mapping.
    public File file { get; set; }

    public string content { get; set; }
}

class File
{
    public string filename { get; set; }

    public string url { get; set; }
}

With that change I can access them through hit.Source.file.filename and hit.Source.file.url.
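
The effect of the class shape can be reproduced without a cluster. The following is only a minimal sketch: it uses System.Text.Json as a stand-in for NEST's own serializer (an assumption, the two are not the same component), and the class names FlatDocument, NestedDocument, and FileFields as well as the sample JSON are made up for illustration. It shows that a flat class leaves filename null when the value actually sits under file, while a nested class picks it up.

```csharp
using System;
using System.Text.Json;

// Hypothetical flat shape, like the first attempt in the question.
public class FlatDocument
{
    public string filename { get; set; }
    public string content { get; set; }
    public string url { get; set; }
}

// Hypothetical nested shape that mirrors the "file" object in the mapping.
public class FileFields
{
    public string filename { get; set; }
    public string url { get; set; }
}

public class NestedDocument
{
    public FileFields file { get; set; }
    public string content { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Made-up _source fragment with the same layout as the mapping.
        const string source = @"{
            ""content"": ""hello"",
            ""file"": { ""filename"": ""a.txt"", ""url"": ""file:///tmp/es/a.txt"" }
        }";

        var flat = JsonSerializer.Deserialize<FlatDocument>(source);
        var nested = JsonSerializer.Deserialize<NestedDocument>(source);

        // The flat class has no top-level "filename" to bind to, so it stays null.
        Console.WriteLine(flat.filename == null);   // True
        // The nested class follows the JSON structure, so the value is found.
        Console.WriteLine(nested.file.filename);    // a.txt
    }
}
```

In the search loop itself, reading the value as hit.Source.file?.filename would also avoid the exception for any hit that has no file object at all.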