Elasticsearch Ruby ActiveRecord Persistence Model URL term search
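I am trying to run an Elasticsearch term query against a field that contains a URL. I am using the elasticsearch-rails ActiveRecord persistence pattern. This is how I am trying to do it: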
total_views = UserAction.search :query => {
  :filtered => {
    :filter => {
      :term => { action_path: "http://0.0.0.0:3000/tshirt/test" }
    }
  }
}
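It works when the value has no "/" or ":" characters, for example when action_path is just 'tshirt'. The other fields are not analyzed, and they work as long as the value contains no characters such as "/" or ":".
So Elasticsearch is evidently analyzing the field, but it should not be, because the not_analyzed mapping is already in place.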
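To illustrate why a term filter can miss a full URL on an analyzed field, here is a minimal sketch in plain Ruby. It does not talk to Elasticsearch. The `rough_analyze` helper is a hypothetical, crude stand-in for an analyzer (lowercase, split on non-alphanumeric characters); the real standard analyzer tokenizes differently in detail, but the effect on an exact lookup is the same in kind.

```ruby
require "set"

# Crude stand-in for an analyzer (an assumption, not the real one):
# lowercases the text and splits on anything that is not a letter or digit.
def rough_analyze(text)
  text.downcase.split(/[^a-z0-9]+/).reject(&:empty?)
end

# The set of terms that would be stored for one field value.
# An analyzed field stores the tokens; a not_analyzed field stores the
# whole value as a single term.
def indexed_terms(value, analyzed:)
  analyzed ? rough_analyze(value).to_set : Set[value]
end

# A term filter does not analyze its input: it looks up the exact term.
def term_match?(terms, query_term)
  terms.include?(query_term)
end

url = "http://0.0.0.0:3000/tshirt/test"
analyzed_terms = indexed_terms(url, analyzed: true)
exact_terms    = indexed_terms(url, analyzed: false)

puts term_match?(analyzed_terms, url)       # full URL against tokens
puts term_match?(analyzed_terms, "tshirt")  # single token against tokens
puts term_match?(exact_terms, url)          # full URL against exact term
```

Under these assumptions only the second and third lookups succeed, which matches the behaviour described: a plain word such as 'tshirt' is found, while the full URL is found only when the field is stored unanalyzed.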
Here is my UserAction class:
class UserAction
  include Elasticsearch::Persistence::Model
  extend Calculations
  include Styles

  attribute :user_id, Integer
  attribute :user_referrer, String, mapping: { index: 'not_analyzed' }
  attribute :user_ip, String, mapping: { index: 'not_analyzed' }
  attribute :user_country, String, mapping: { index: 'not_analyzed' }
  attribute :user_city, String, mapping: { index: 'not_analyzed' }
  attribute :user_device, String, mapping: { index: 'not_analyzed' }
  attribute :user_agent, String, mapping: { index: 'not_analyzed' }
  attribute :user_platform
  attribute :user_visitid, Integer
  attribute :action_type, String, mapping: { index: 'not_analyzed' }
  attribute :action_css, String, mapping: { index: 'not_analyzed' }
  attribute :action_text, String, mapping: { index: 'not_analyzed' }
  attribute :action_path, String, mapping: { index: 'not_analyzed' }
  attribute :share_url, String, mapping: { index: 'not_analyzed' }
  attribute :tag
  attribute :date
end
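I also tried adding the index with "mapping do ..." and then calling "create_index!", but the result was the same. It does create the mapping, because the mapping exists.
Here is my Gemfile: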
gem "elasticsearch-model", git: "git://github.com/elasticsearch/elasticsearch-rails.git", require: "elasticsearch/model"
gem "elasticsearch-persistence", git: "git://github.com/elasticsearch/elasticsearch-rails.git", require: "elasticsearch/persistence/model"
gem "elasticsearch-rails"
When I run the search, I also see those fields listed as not analyzed:
:reload_on_failure=>false,
:randomize_hosts=>false,
:transport_options=>{}},
@protocol="http",
@reload_after=10000,
@resurrect_after=60,
@serializer=
#<Elasticsearch::Transport::Transport::Serializer::MultiJson:0x007fc4bf9e0e18
@transport=#<Elasticsearch::Transport::Transport::HTTP::Faraday:0x007fc4bf9b35a8 ...>>,
@sniffer=
#<Elasticsearch::Transport::Transport::Sniffer:0x007fc4bf9e0dc8
@timeout=1,
@transport=#<Elasticsearch::Transport::Transport::HTTP::Faraday:0x007fc4bf9b35a8 ...>>,
@tracer=nil>>,
@document_type="user_action",
@index_name="useraction",
@klass=UserAction,
@mapping=
#<Elasticsearch::Model::Indexing::Mappings:0x007fc4bfab18d8
@mapping=
{:created_at=>{:type=>"date"},
:updated_at=>{:type=>"date"},
:user_id=>{:type=>"integer"},
:user_referrer=>{:type=>"string"},
:user_ip=>{:type=>"string"},
:user_country=>{:type=>"string", :index=>"not_analyzed"},
:user_city=>{:type=>"string", :index=>"not_analyzed"},
:user_device=>{:type=>"string", :index=>"not_analyzed"},
:user_agent=>{:type=>"string", :index=>"not_analyzed"},
:user_platform=>{:type=>"string"},
:user_visitid=>{:type=>"integer"},
:action_type=>{:type=>"string", :index=>"not_analyzed"},
:action_css=>{:type=>"string", :index=>"not_analyzed"},
:action_text=>{:type=>"string", :index=>"not_analyzed"},
:action_path=>{:type=>"string", :index=>"not_analyzed"}},
@options={},
@type="user_action">,
@options={:host=>UserAction}>,
@response={"took"=>1, "timed_out"=>false, "_shards"=>{"total"=>4, "successful"=>4, "failed"=>0}, "hits"=>{"total"=>0, "max_score"=>nil, "hits"=>[]}}>
(END)
The initializer file contains nothing except the elastichq connection URL.
The data is in elastichq, so I should get results, but I get none.
user_action 1 AUzH9xKDueQ8OtBQuyQC http://example.org/api/analytics/track
user_actions user_action 1 AUzIAUsvueQ8OtBQuyQg http://0.0.0.0:3000/tshirt/funnel_test2
user_actions user_action 1 AUzH7ay5ueQ8OtBQuyP2 http://example.org/api/analytics/track
user_actions user_action 1 AUzH-HAdueQ8OtBQuyQU http://0.0.0.0:3000/tshirt/test
user_actions user_action 1 AUzIJbCGueQ8OtBQuyQ4 http://example.org/api/analytics/track
user_actions user_action 1 AUzIJbCjueQ8OtBQuyQ5 http://example.org/api/analytics/track
curl result from elastichq:
curl -XGET "https://YYYYY:XXXXX@xxxx.qbox.io/user_actions/_mapping"
{
"user_actions": {
"mappings": {
"user_action": {
"properties": {
"action_css": { "type": "string" },
"action_path": { "type": "string" },
"action_text": { "type": "string" },
"action_type": { "type": "string" },
"created_at": { "format": "dateOptionalTime", "type": "date" },
"date": { "type": "string" },
"share_url": { "type": "string" },
"tag": { "type": "string" },
"updated_at": { "format": "dateOptionalTime", "type": "date" },
"user_agent": { "type": "string" },
"user_city": { "type": "string" },
"user_country": { "type": "string" },
"user_device": { "type": "string" },
"user_id": { "type": "long" },
"user_ip": { "type": "string" },
"user_referrer": { "type": "string" },
"user_visitid": { "type": "long" }
}
}
}
}
}
Can anyone help me get the URL term search working?
Judging from the Elasticsearch curl output at the end, your fields are analyzed (there is no not_analyzed flag). Maybe try rebuilding your index with the mapping you want.
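One way to make that comparison less error-prone is to check the live mapping programmatically. The following is a minimal sketch, assuming the `_mapping` response has already been fetched and parsed; the helper name `fields_still_analyzed` and the abbreviated sample response are illustrative, not part of any library.

```ruby
require "json"

# Returns the fields from `expected` whose live mapping does not carry
# "index": "not_analyzed". `live_mapping` is the parsed _mapping response.
# Fields missing from the mapping altogether are reported as well.
def fields_still_analyzed(live_mapping, index:, type:, expected:)
  properties = live_mapping.dig(index, "mappings", type, "properties") || {}
  expected.reject { |field| properties.dig(field, "index") == "not_analyzed" }
end

# Abbreviated sample shaped like the curl output in the question.
response = JSON.parse(<<~MAPPING)
  {
    "user_actions": {
      "mappings": {
        "user_action": {
          "properties": {
            "action_path": { "type": "string" },
            "user_id":     { "type": "long" }
          }
        }
      }
    }
  }
MAPPING

missing = fields_still_analyzed(
  response,
  index: "user_actions",
  type: "user_action",
  expected: %w[action_path share_url]
)
puts missing.inspect
```

Any field this reports would need the index to be rebuilt with the intended mapping before an exact term lookup on it can be expected to work.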
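As a rule of thumb, if you want to search on something, you should not leave it not_analyzed.
In this case you should definitely try the Keyword Analyzer and map the relevant field to keyword.
As long as you search for the complete string, i.e. "http://0.0.0.0:3000/tshirt/test", there is a good chance the Keyword Analyzer will work.
Try the raw query: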
total_views = UserAction.search :query => {
  :filtered => {
    :filter => {
      :term => { "action_path.raw" => "http://0.0.0.0:3000/tshirt/test" }
    }
  }
}
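A query on "action_path.raw" can only match if the mapping actually defines a raw sub-field, and the mapping shown in the question does not. As a hedged sketch, the hashes below show what such a multi-field mapping and the matching query could look like for the string type used in this thread. The helper names `url_field_mapping` and `exact_term_query` are hypothetical, and nothing here is sent to a server.

```ruby
require "json"

# Hypothetical mapping for a URL field: the main field stays analyzed,
# and a "raw" sub-field keeps the untouched value for exact matching.
def url_field_mapping
  {
    type: "string",
    fields: {
      raw: { type: "string", index: "not_analyzed" }
    }
  }
end

# Builds the filtered/term query used in the thread for any field and value.
def exact_term_query(field, value)
  { query: { filtered: { filter: { term: { field => value } } } } }
end

mapping = { properties: { action_path: url_field_mapping } }
query   = exact_term_query("action_path.raw", "http://0.0.0.0:3000/tshirt/test")

puts JSON.pretty_generate(mapping)
puts JSON.generate(query)
```

The mapping would have to be in place before the documents are indexed; existing documents would need to be reindexed for the sub-field to be populated.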
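I ended up doing what I did not want to do.
I created the index and its mapping manually with the following POST request, so that elasticsearch-rails would not create the wrong one. Everything works now.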
curl -XPOST https://xxxxxx.qbox.io/user_actions -d '{
"settings" : {
"number_of_shards" : 1
},
"mappings" : {
"user_action" : {
"_source" : { "enabled" : false },
"properties" : {
"action_path" : { "type" : "string", "index" : "not_analyzed" }
}
}
}
}'
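The request above maps only action_path. If the other string fields from the model need the same treatment, the body can be generated rather than typed by hand. This is a minimal sketch that only builds the JSON; the `index_body` helper and the field list are assumptions taken from the class in the question. It also leaves `_source` at its default, whereas the request above disables it, so adjust that to whatever is actually wanted.

```ruby
require "json"

# Builds a create-index body in which every listed field is a
# not_analyzed string. `type` is the document type name.
def index_body(type:, fields:, shards: 1)
  properties = fields.each_with_object({}) do |field, props|
    props[field] = { type: "string", index: "not_analyzed" }
  end
  {
    settings: { number_of_shards: shards },
    mappings: { type => { properties: properties } }
  }
end

# Field names copied from the UserAction class in the question.
exact_fields = %w[action_type action_css action_text action_path share_url]

body = index_body(type: "user_action", fields: exact_fields)
puts JSON.pretty_generate(body)
```

The printed JSON could then be passed as the -d payload of the same curl command.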