Solr search in a "content" field does not work

Question

I have uploaded and extracted a document in Solr 6.0.0, and I can see that it was indexed by running the following query:

http://localhost:8983/solr/techproducts/select?indent=on&q=id:doc1&wt=json

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"id:doc1",
      "indent":"on",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "links":["http://www.education.gov.yk.ca/"],
        "id":"doc1",
        "last_modified":"2008-06-04T22:47:36Z",
        "title":[" PDF Test Page"],
        "content_type":["application/pdf"],
        "author":"Yukon Canada Yukon Department of Education",
        "author_s":"Yukon Canada Yukon Department of Education",
        "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  PDF Test Page \n \n    \n  \n \nPDF Test File \n \nCongratulations, your computer is equipped with a PDF (Portable Document Format) \nreader!  You should be able to view any of the PDF documents and forms available on \nour site.  PDF forms are indicated by these icons:   or  .   \n \nYukon Department of Education \nBox 2703 \nWhitehorse,Yukon \nCanada \nY1A 2C6 \n \nPlease visit our website at:  http://www.education.gov.yk.ca/\n    \n  \n    \n \n  "],
        "_version_":1533049305513852928}]
  }}

I can see that the word PDF occurs several times in the content field.

Given that there is a field named content and it contains the word PDF, why do I get no results with the following query?

select?q=*:*&fq=content:PDF

{
  "responseHeader":{
    "status":0,
    "QTime":4,
    "params":{
      "q":"*:*",
      "indent":"on",
      "fq":"content:PDF",
      "rows":"50",
      "wt":"json"}},
  "response":{"numFound":0,"start":0,"docs":[]
  }}

When I query a different field, for example title, I get the correct result:

select?q=*:*&fq=title:PDF

{
  "responseHeader":{
    "status":0,
    "QTime":3,
    "params":{
      "q":"*:*",
      "indent":"on",
      "fq":"title:PDF",
      "rows":"50",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "links":["http://www.education.gov.yk.ca/"],
        "id":"doc1",
        "last_modified":"2008-06-04T22:47:36Z",
        "title":[" PDF Test Page"],
        "content_type":["application/pdf"],
        "author":"Yukon Canada Yukon Department of Education",
        "author_s":"Yukon Canada Yukon Department of Education",
        "content":[" \n \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  \n  PDF Test Page \n \n    \n  \n \nPDF Test File \n \nCongratulations, your computer is equipped with a PDF (Portable Document Format) \nreader!  You should be able to view any of the PDF documents and forms available on \nour site.  PDF forms are indicated by these icons:   or  .   \n \nYukon Department of Education \nBox 2703 \nWhitehorse,Yukon \nCanada \nY1A 2C6 \n \nPlease visit our website at:  http://www.education.gov.yk.ca/\n    \n  \n    \n \n  "],
        "_version_":1533049305513852928}]
  }}

Answer 1

Check your schema.xml to see which field type is defined for the content field.

Compare the field types of the content and title fields.

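You may not have defined the right field type for the content field. Some field types do not split your text into tokens at all and instead treat the whole text as a single token. This happens if the field uses KeywordTokenizer or the string field type. A query for a single word such as PDF then cannot match, because the only indexed term is the entire field value.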
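To illustrate why a non-tokenizing field type yields no hits, here is a minimal Python sketch. It does not call Solr. The functions `keyword_tokens`, `word_tokens`, and `term_matches` are hypothetical stand-ins that only roughly mimic what a keyword tokenizer and a word-splitting, lowercasing analyzer would produce; real Solr analysis chains are more involved.

```python
import re


def keyword_tokens(text):
    """Mimic a keyword tokenizer or string field: the whole value is one token."""
    return [text]


def word_tokens(text):
    """Roughly mimic a word-splitting, lowercasing analyzer (a simplification)."""
    return [t.lower() for t in re.findall(r"\w+", text)]


def term_matches(query_term, indexed_tokens, lowercase_query):
    """A term query matches only if the (analyzed) term equals an indexed token."""
    term = query_term.lower() if lowercase_query else query_term
    return term in indexed_tokens


# Shortened stand-in for the extracted PDF text shown in the question.
content = " \n PDF Test Page \n \nPDF Test File \n"

# Whole value indexed as one token: the single word "PDF" never equals it.
print(term_matches("PDF", keyword_tokens(content), lowercase_query=False))  # False

# Value split into words: "pdf" is one of the indexed tokens.
print(term_matches("PDF", word_tokens(content), lowercase_query=True))  # True
```

With the single-token variant, only a query equal to the complete field value (whitespace included) would match, which is why `content:PDF` returns nothing in that situation.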

You can verify this with the debugging or Analysis tool in the Solr admin UI.

There you can see how the text is analyzed at index time and how it is analyzed at query time.
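The same checks can also be made over HTTP. The sketch below only builds the request URLs. It assumes that the Schema API and the field analysis handler at `/analysis/field` are enabled on your installation, and it reuses the `techproducts` core URL from the question. The helper names `schema_field_url` and `field_analysis_url` are made up for this example.

```python
from urllib.parse import urlencode

# Core URL taken from the question; adjust for your own setup.
BASE = "http://localhost:8983/solr/techproducts"


def schema_field_url(field):
    """URL that asks the Schema API for the definition of one field."""
    return f"{BASE}/schema/fields/{field}?wt=json"


def field_analysis_url(field, index_text, query_text):
    """URL that asks the field analysis handler how text is tokenized
    for the given field at index time and at query time."""
    params = {
        "analysis.fieldname": field,
        "analysis.fieldvalue": index_text,
        "analysis.query": query_text,
        "wt": "json",
    }
    return f"{BASE}/analysis/field?{urlencode(params)}"


print(schema_field_url("content"))
print(field_analysis_url("content", "PDF Test Page", "PDF"))
```

Opening the printed URLs in a browser, or fetching them with any HTTP client, should show the field definition and the tokens produced for the sample text, provided those handlers are available.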

To search on a field, it must have the attribute indexed=true, and if you want Solr to return the field's value in the results, you also need to add stored=true.

Together, these two attributes let you search the field and retrieve its original value.
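As a rough illustration of that check, the following Python sketch inspects a field definition given as a dictionary, for example one copied by hand from schema.xml or from a Schema API response. The function `check_field` and the sample definition are hypothetical; the sample is not taken from the question.

```python
def check_field(field_def):
    """Return a list of warnings for a field definition dict.

    Attributes missing from the dict are reported rather than guessed,
    because they may be inherited from the field type.
    """
    warnings = []
    for attr, purpose in (("indexed", "searchable"), ("stored", "returned in results")):
        value = field_def.get(attr)
        if value is None:
            warnings.append(f"{attr} not set on the field; check the field type")
        elif value is False:
            warnings.append(f"{attr}=false: field is not {purpose}")
    return warnings


# Hypothetical definition, for illustration only.
example = {"name": "content", "type": "text_general", "indexed": False, "stored": True}
print(check_field(example))  # ['indexed=false: field is not searchable']
```

A field that is stored but not indexed would produce the symptom described in the question: the value appears in the returned document, but a filter on that field finds nothing.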
