Solr/Lucene simple operator "OR" misunderstanding / Search same word in different fields
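I am learning the Solr/Lucene query syntax, using the Solr Admin in the browser.
There I tried to search for the same word in two different fields with the following syntax: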
content:myword
-> results found
content:myword OR title:existingTitle
-> results found
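but
content:myword OR title:myword -> ZERO results found
Why? It is an "or".
I also tried it without an operator, which should be equivalent to "or", and also tried "|" and "||".
This happens whenever I try to find the same word in one of several fields.
[EDIT]
Here are the Solr URL requests:
content:fahrzeug title:fahrzeug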
http://xxx/solr/core_de/select?q=content%3Afahrzeug%20title%3Afahrzeug
content:fahrzeug OR title:fahrzeug
http://xxx/solr/core_de/select?q=content%3Afahrzeug%20OR%20title%3Afahrzeug
content:fahrzeug | title:fahrzeug
http://xxx/solr/core_de/select?q=content%3Afahrzeug%20%7C%20title%3Afahrzeug
{
"responseHeader":{
"status":400,
"QTime":5,
"params":{
"q":"content:fahrzeug OR title:fahrzeug",
"debugQuery":"1"}},
"error":{
"metadata":[
"error-class","org.apache.solr.common.SolrException",
"root-error-class","org.apache.solr.common.SolrException"],
"msg":"invalid boolean value: 1",
"code":400}}
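A side note on the response above: the 400 does not come from the query itself. The message "invalid boolean value: 1" refers to the debugQuery=1 parameter, because Solr expects a boolean string such as true there. Below is a minimal Python sketch of building the same request with debugQuery=true. The build_select_url helper is a hypothetical name, and the host is the xxx placeholder from the URLs above.

```python
from urllib.parse import urlencode, quote

# Placeholder endpoint copied from the question; replace with the real host.
SOLR_SELECT = "http://xxx/solr/core_de/select"

def build_select_url(query, debug=False):
    """Return a /select URL for `query`, optionally with debug output enabled."""
    params = {"q": query}
    if debug:
        # Solr parses debugQuery as a boolean, so send "true", not "1".
        params["debugQuery"] = "true"
    # quote_via=quote encodes spaces as %20, matching the URLs in the question.
    return SOLR_SELECT + "?" + urlencode(params, quote_via=quote)

url = build_select_url("content:fahrzeug OR title:fahrzeug", debug=True)
print(url)
```

With the debug flag accepted, the response should include a "debug" section with the parsed query, which is the part that matters for this problem.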
My guess is that this comes from the configuration.
Try:
http://www119.pxia.de:8983/solr/core_de/select?fq=content%3Afahrzeug%20title%3Afahrzeug&q=*%3A*
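- this returns the correct documents. So when only filtering is used, the documents are there. The main query applies more complex conditions, and your default config is: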
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="defType">edismax</str>
<str name="echoParams">explicit</str>
<str name="qf">content^40.0 title^5.0 keywords^2.0 tagsH1^5.0 tagsH2H3^3.0 tagsH4H5H6^2.0 tagsInline^1.0</str>
<str name="pf">content^2.0</str>
<str name="df">content</str>
<int name="ps">15</int>
<str name="mm">2<-35%</str>
<str name="mm.autoRelax">true</str>
...
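The parser and the boosts probably play a key role here.
I am not familiar with the edismax parser, please check: documentation
My guess is that the mm parameter may be what causes this.
Either way, it is strange that OR does not behave the way we are used to from Boolean algebra.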
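To make the mm guess more concrete: according to the Solr documentation for the mm (minimum should match) parameter, a conditional spec such as 2<-35% means that with 2 or fewer optional clauses all of them are required, and with more than 2 clauses up to 35% of them may be missing (rounded down). The Python sketch below is a simplified model of that rule for a single "n<value" condition only. It is not Solr's implementation, and min_should_match is a hypothetical name.

```python
def min_should_match(spec, clause_count):
    """Simplified model of a single 'n<value' mm spec (e.g. '2<-35%')."""
    threshold, _, value = spec.partition("<")
    if clause_count <= int(threshold):
        # At or below the threshold every optional clause is required.
        return clause_count
    if value.endswith("%"):
        # int() truncates toward zero, so a negative percentage rounds the
        # number of clauses that may be missing down.
        amount = int(clause_count * int(value[:-1]) / 100)
    else:
        amount = int(value)
    # A negative amount counts how many clauses may be missing.
    required = clause_count + amount if amount < 0 else amount
    return max(0, min(clause_count, required))

for n in (1, 2, 3, 4, 6):
    print(n, "clauses ->", min_should_match("2<-35%", n), "required")
```

For a two-clause query like content:myword OR title:myword this model yields 2. If mm is applied to the query, both clauses would have to match, which would fit the zero results when the word is missing from the title.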
"debug":{
"queryBoosting":{
"q":"title:Home OR content:Perfekt",
"match":null},
"rawquerystring":"title:Home OR content:Perfekt",
"querystring":"title:Home OR content:Perfekt",
"parsedquery":"+(title:hom content:perfekt)~2 ()",
"parsedquery_toString":"+((title:hom content:perfekt)~2) ()",
"explain":{
"bf72a75534ba703e4b8dc7194f92ce34223fc0d2/pages/1/0/0/0":"\n4.8893824 = sum of:\n 4.8893824 = sum of:\n 1.9924302 = weight(title:hom in 0) [SchemaSimilarity], result of:\n 1.9924302 = score(doc=0,freq=1.0 = termFreq=1.0\n), product of:\n 1.9924302 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:\n 1.0 = docFreq\n 10.0 = docCount\n 1.0 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1) from:\n 1.0 = termFreq=1.0\n 1.2 = parameter k1\n 0.0 = parameter b (norms omitted for field)\n 2.8969522 = weight(content:perfekt in 0) [SchemaSimilarity], result of:\n 2.8969522 = score(doc=0,freq=5.0 = termFreq=5.0\n), product of:\n 1.4816046 = idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:\n 2.0 = docFreq\n 10.0 = docCount\n 1.9552802 = tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:\n 5.0 = termFreq=5.0\n 1.2 = parameter k1\n 0.75 = parameter b\n 508.3 = avgFieldLength\n 184.0 = fieldLength\n"},
"QParser":"ExtendedDismaxQParser",
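Look at "parsedquery":"+(title:hom content:perfekt)~2 ()"
It basically says that both the title clause and the content clause must match (the ~2 means that at least 2 of the optional clauses are required):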
Solr operators
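One way to test this explanation is to override the handler defaults for a single request, for example by sending mm=1, or by switching to the standard parser with defType=lucene. Both are regular Solr request parameters, but whether they give the desired results in this setup is an assumption that needs checking. The helper name below is hypothetical and the host is the xxx placeholder from the question.

```python
from urllib.parse import urlencode, quote

# Placeholder endpoint taken from the question's URLs.
SOLR_SELECT = "http://xxx/solr/core_de/select"

def select_url_with_overrides(query, **overrides):
    """Build a /select URL whose request parameters override handler defaults."""
    params = {"q": query, "debugQuery": "true"}
    params.update(overrides)
    return SOLR_SELECT + "?" + urlencode(params, quote_via=quote)

query = "content:fahrzeug OR title:fahrzeug"
# Variant 1: keep edismax but require only one optional clause to match.
relaxed_mm_url = select_url_with_overrides(query, mm="1")
# Variant 2: bypass edismax and use the standard Lucene query parser.
lucene_url = select_url_with_overrides(query, defType="lucene")
print(relaxed_mm_url)
print(lucene_url)
```

Comparing the "parsedquery" line in the debug output of these requests with the original one should show whether the ~2 disappears. It would also fit the observation that the fq request returns documents, since defType selects the parser for the main q parameter.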