"content" 字段中的 solr 搜索不起作用
Solr search in a "content" field does not work
I have uploaded and extracted a document in Solr 6.0.0, and I can see that it has been indexed by running the following query:
http://localhost:8983/solr/techproducts/select?indent=on&q=id:doc1&wt=json
{
"responseHeader":{
"status":0,
"QTime":1,
"params":{
"q":"id:doc1",
"indent":"on",
"wt":"json"}},
"response":{"numFound":1,"start":0,"docs":[
{
"links":["http://www.education.gov.yk.ca/"],
"id":"doc1",
"last_modified":"2008-06-04T22:47:36Z",
"title":[" PDF Test Page"],
"content_type":["application/pdf"],
"author":"Yukon Canada Yukon Department of Education",
"author_s":"Yukon Canada Yukon Department of Education",
"content":[" \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n PDF Test Page \n \n \n \n \nPDF Test File \n \nCongratulations, your computer is equipped with a PDF (Portable Document Format) \nreader! You should be able to view any of the PDF documents and forms available on \nour site. PDF forms are indicated by these icons: or . \n \nYukon Department of Education \nBox 2703 \nWhitehorse,Yukon \nCanada \nY1A 2C6 \n \nPlease visit our website at: http://www.education.gov.yk.ca/\n \n \n \n \n "],
"_version_":1533049305513852928}]
}}
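I can see that the word PDF appears many times in the content field.
Given that there is a field named content and it contains PDF, why do I get no results with the following query?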
select?q=*:*&fq=content:PDF
{
"responseHeader":{
"status":0,
"QTime":4,
"params":{
"q":"*:*",
"indent":"on",
"fq":"content:PDF",
"rows":"50",
"wt":"json"}},
"response":{"numFound":0,"start":0,"docs":[]
}}
When I query a different field, such as title, I get the correct result:
select?q=*:*&fq=title:PDF
{
"responseHeader":{
"status":0,
"QTime":3,
"params":{
"q":"*:*",
"indent":"on",
"fq":"title:PDF",
"rows":"50",
"wt":"json"}},
"response":{"numFound":1,"start":0,"docs":[
{
"links":["http://www.education.gov.yk.ca/"],
"id":"doc1",
"last_modified":"2008-06-04T22:47:36Z",
"title":[" PDF Test Page"],
"content_type":["application/pdf"],
"author":"Yukon Canada Yukon Department of Education",
"author_s":"Yukon Canada Yukon Department of Education",
"content":[" \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n PDF Test Page \n \n \n \n \nPDF Test File \n \nCongratulations, your computer is equipped with a PDF (Portable Document Format) \nreader! You should be able to view any of the PDF documents and forms available on \nour site. PDF forms are indicated by these icons: or . \n \nYukon Department of Education \nBox 2703 \nWhitehorse,Yukon \nCanada \nY1A 2C6 \n \nPlease visit our website at: http://www.education.gov.yk.ca/\n \n \n \n \n "],
"_version_":1533049305513852928}]
}}
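Check which field type is defined for the content field in your schema.xml.
Compare the field types of the content and title fields.
You have probably not defined a suitable field type for content. Some field types do not split the text into tokens; they treat the entire value as a single token, so a query for one word inside it cannot match. This happens if the field uses KeywordTokenizer or the string field type.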
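To illustrate the difference, here is a minimal Python sketch (not Solr code) that mimics the two behaviours. The functions keyword_tokens, word_tokens and matches are hypothetical stand-ins for a string/KeywordTokenizer field and a tokenized, lowercased text field; real Solr analyzers do more than this.

```python
import re


def keyword_tokens(value):
    # Mimics a string field or KeywordTokenizer: the whole value is one token.
    return [value]


def word_tokens(value):
    # Rough stand-in for a word tokenizer followed by a lowercase filter.
    return [token.lower() for token in re.findall(r"\w+", value)]


def matches(query, field_value, analyzer):
    # The same analyzer is applied to the query and to the indexed value;
    # the query matches only if every query token is an indexed token.
    indexed = set(analyzer(field_value))
    return all(token in indexed for token in analyzer(query))


# Shortened version of the content value from the question.
content = " \n \n PDF Test Page \n \nPDF Test File \n"

print(matches("PDF", content, keyword_tokens))  # whole value is one token
print(matches("PDF", content, word_tokens))     # value is split into words
```

With the single-token analyzer the query PDF would only match a document whose entire content value is exactly PDF, which is why a long extracted text never matches.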
You can check this with the analysis tool in the Solr admin UI.
There you can see how the text is analyzed at index time and at query time.
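If you would rather script the same check, the sketch below builds a request URL for Solr's field analysis handler. It assumes the handler is reachable at /analysis/field on your core, which may not be the case in every configuration; the helper name field_analysis_url is made up for this example, and the core URL is taken from the question.

```python
from urllib.parse import urlencode


def field_analysis_url(core_url, field, value, query):
    # Parameters understood by Solr's field analysis handler:
    # the field to analyze, a sample indexed value, and a sample query.
    params = {
        "analysis.fieldname": field,
        "analysis.fieldvalue": value,
        "analysis.query": query,
        "analysis.showmatch": "true",
        "wt": "json",
    }
    return core_url.rstrip("/") + "/analysis/field?" + urlencode(params)


url = field_analysis_url(
    "http://localhost:8983/solr/techproducts",  # core URL from the question
    "content",
    "PDF Test Page",
    "PDF",
)
print(url)
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should return the tokens produced for the sample value and for the query, so you can compare content against title.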
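When you want to search on a field, it must be declared with the attribute indexed=true, and if you also want Solr to return the field's value, you need to add stored=true.
Together, these two attributes let you search the field and retrieve its original value.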
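As a sketch of how such a definition could be changed without editing the schema file by hand, the snippet below builds a replace-field command for Solr's Schema API. It assumes the core uses a managed schema with the Schema API enabled and that a tokenized field type named text_general exists; both are assumptions about your setup, and the helper name replace_field_command is made up for this example.

```python
import json


def replace_field_command(name, field_type, indexed=True, stored=True,
                          multi_valued=True):
    # Builds the JSON body for a Schema API "replace-field" command.
    return {
        "replace-field": {
            "name": name,
            "type": field_type,      # assumed to exist in your schema
            "indexed": indexed,      # required for searching on the field
            "stored": stored,        # required for returning the value
            "multiValued": multi_valued,
        }
    }


payload = replace_field_command("content", "text_general")
print(json.dumps(payload, indent=2))
```

The printed JSON would be sent as a POST body to http://localhost:8983/solr/techproducts/schema. After changing indexed on a field, existing documents have to be reindexed before queries such as fq=content:PDF can match them.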