Reduce the execution time of Sidekiq jobs
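I am currently working on an app that involves syncing contacts on a Rails server. I am using a Redis server and Sidekiq to perform the contact sync in the background. My database is MongoDB and I use the Mongoid gem as the ORM. The workflow is as follows:
- Contacts on the phone are passed through the app to the Rails server, where they are queued in Redis.
- A cron job then triggers Sidekiq, which connects to Redis and completes the jobs.
A single Sidekiq job looks like this:
- It has an array of contacts (up to 3000).
- It has to process each of these contacts. By processing I mean making insert queries to the database.
The problem is that Sidekiq takes a very long time to complete this job. On average it takes 50-70 seconds.
The relevant files are as follows: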
sidekiq.yml
# Sample configuration file for Sidekiq.
# Options here can still be overridden by cmd line args.
# sidekiq -C config.yml
:verbose: true
:concurrency: 5
:logfile: ./log/sidekiq.log
:pidfile: ./tmp/pids/sidekiq.pid
:queues:
  - [new_wall, 1] #6
  - [contact_wall, 1] #7
  - [email, 1] #5
  - [gcm_chat, 1] #5
  - [contact_address, 1] #7
  - [backlog_contact_address, 5]
  - [comment, 7]
  - [default, 5]
mongoid.yml
development:
  # Configure available database sessions. (required)
  sessions:
    # Defines the default session. (required)
    default:
      # Defines the name of the default database that Mongoid can connect to.
      # (required).
      database: "<%= ENV['DB_NAME']%>"
      # Provides the hosts the default session can connect to. Must be an array
      # of host:port pairs. (required)
      hosts:
        - "<%=ENV['MONGOD_URL']%>"
      #username: "<%= ENV['DB_USERNAME']%>"
      #password: "<%= ENV['DB_PASSWORD']%>"
      options:
        #pool: 12
        # Change the default write concern. (default = { w: 1 })
        # write:
        #   w: 1
        # Change the default consistency model to primary, secondary.
        # 'secondary' will send reads to secondaries, 'primary' sends everything
        # to master. (default: primary)
        # read: secondary_preferred
        # How many times Moped should attempt to retry an operation after
        # failure. (default: The number of nodes in the cluster)
        # max_retries: 20
        # The time in seconds that Moped should wait before retrying an
        # operation on failure. (default: 0.25)
        # retry_interval: 0.25
  # Configure Mongoid specific options. (optional)
  options:
    # Includes the root model name in json serialization. (default: false)
    # include_root_in_json: false
    # Include the _type field in serialization. (default: false)
    # include_type_for_serialization: false
    # Preload all models in development, needed when models use
    # inheritance. (default: false)
    # preload_models: false
    # Protect id and type from mass assignment. (default: true)
    # protect_sensitive_fields: true
    # Raise an error when performing a #find and the document is not found.
    # (default: true)
    # raise_not_found_error: true
    # Raise an error when defining a scope with the same name as an
    # existing method. (default: false)
    # scope_overwrite_exception: false
    # Use Active Support's time zone in conversions. (default: true)
    # use_activesupport_time_zone: true
    # Ensure all times are UTC in the app side. (default: false)
    # use_utc: false
test:
  sessions:
    default:
      database: db_test
      hosts:
        - localhost:27017
      options:
        read: primary
        # In the test environment we lower the retries and retry interval to
        # low amounts for fast failures.
        max_retries: 1
        retry_interval: 0
production:
  # Configure available database sessions. (required)
  sessions:
    # Defines the default session. (required)
    default:
      # Defines the name of the default database that Mongoid can connect to.
      # (required).
      database: "<%= ENV['DB_NAME']%>"
      # Provides the hosts the default session can connect to. Must be an array
      # of host:port pairs. (required)
      hosts:
        - "<%=ENV['MONGOD_URL']%>"
      username: "<%= ENV['DB_USERNAME']%>"
      password: "<%= ENV['DB_PASSWORD']%>"
      pool: 10
      options:
  # Configure Mongoid specific options. (optional)
  options:
Model.rb
def retry_save_contact_dump(c_dump_id)
  c_dump = ContactDump.where(_id: c_dump_id, status: ContactDump::CONTACT_DUMP_CONS[:ERROR]).first
  return false if c_dump.blank?
  user = User.where(_id: c_dump.user_id).first
  puts "retry_save_contact_dump"
  user.save_contacts_with_name(c_dump.contacts)
  c_dump.status = ContactDump::CONTACT_DUMP_CONS[:PROCESSED]
  c_dump.error_msg = ""
  c_dump.save
rescue => e
  c_dump.status = ContactDump::CONTACT_DUMP_CONS[:CANTSYNC]
  c_dump.error_msg = e.message
  c_dump.save
end

def save_contacts_with_name(c_array)
  m_num = Person.get_number_digest(self.mobile_number.to_s)
  c_array.each do |n|
    next if m_num == n["hash_mobile_number"]
    p = Person.where(h_m_num: n["hash_mobile_number"]).first_or_create
    save_friend(p) #if p.persisted?
    p.c_names.create(name: n["name"], user_id: self.id)
  end
end
ContactDump.rb
class ContactDump
  include Mongoid::Document
  include Mongoid::Timestamps::Created
  include Mongoid::Timestamps::Updated
  field :contacts, type: Array
  field :status, type: Integer, default: 0
  field :user_id, type: BSON::ObjectId
  field :error_msg, type: String
  CONTACT_DUMP_CONS = {FRESH: 0, PROCESSED: 1, ERROR: 2, CANTSYNC: 3}
end
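How can I speed up the processing of the job? I tried increasing Sidekiq's concurrency in sidekiq.yml and the pool in mongoid.yml, but it didn't help.
How do WhatsApp and other messaging apps handle contact sync?
If more information is needed, please ask. Thanks.
EDIT: If this question can't be answered, can anyone suggest another way to sync contacts on a Rails server?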
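A note on why raising the concurrency setting may not have changed the 50-70 seconds: Sidekiq runs each job on a single thread, so extra threads only help when there are several jobs to pick up. One way to give those threads something to do is to cut the contact array into slices and enqueue one job per slice. The following is only a sketch in plain Ruby; contact_batches, the batch size of 500, and the worker named in the comment are hypothetical and not part of the code above.

```ruby
# Hypothetical helper: drop the user's own number, then cut the remaining
# contacts into slices that could each be enqueued as a separate job.
def contact_batches(contacts, own_digest, batch_size = 500)
  contacts
    .reject { |c| c["hash_mobile_number"] == own_digest } # same check as `next if m_num == ...`
    .each_slice(batch_size)
    .to_a
end

# Fake data standing in for ContactDump#contacts (3000 entries).
contacts = Array.new(3000) do |i|
  { "hash_mobile_number" => "h#{i}", "name" => "Contact #{i}" }
end

batches = contact_batches(contacts, "h0")
puts "#{batches.size} batches, sizes: #{batches.map(&:size).inspect}"

# In the app, each slice would then be handed to a worker, for example
#   batches.each { |b| ContactBatchWorker.perform_async(user_id, b) }
# where ContactBatchWorker is an assumed worker class, not one from the question.
```

With five threads and six slices, most of the per-contact work could then overlap, provided MongoDB itself is not the bottleneck.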
Indexes to the rescue.
class ContactDump
  index({status: 1})
end
class Person
  index({h_m_num: 1})
end
Person may need more indexes, depending on what your Person.get_number_digest does.
After adding the indexes, run:
rake db:mongoid:create_indexes
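To see why the index on h_m_num matters for a loop that does one lookup per contact, here is a rough analogy in plain Ruby. It is not MongoDB: a real index is a B-tree rather than a Ruby Hash, and the collection size and the use of SHA1 for the digest are assumptions made only for the illustration.

```ruby
require "digest"

# Hypothetical stand-in for the Person collection: 50_000 documents keyed by
# a hashed mobile number (SHA1 is an assumption, the real digest is unknown).
PEOPLE = Array.new(50_000) { |i| { "h_m_num" => Digest::SHA1.hexdigest(i.to_s) } }

# Without an index, documents are examined one by one until a match is found.
def scan_lookup(people, digest)
  examined = 0
  found = people.find do |doc|
    examined += 1
    doc["h_m_num"] == digest
  end
  [found, examined]
end

# An index behaves roughly like a precomputed lookup table on the field.
INDEX = PEOPLE.each_with_index.map { |doc, pos| [doc["h_m_num"], pos] }.to_h

def indexed_lookup(people, index, digest)
  pos = index[digest]
  pos && people[pos]
end

target = Digest::SHA1.hexdigest("49999") # the last document in the array
doc, examined = scan_lookup(PEOPLE, target)
puts "scan examined #{examined} documents"
puts "indexed lookup returns the same document: #{indexed_lookup(PEOPLE, INDEX, target).equal?(doc)}"
```

Multiply the scan by up to 3000 contacts per job and the cost of a missing index adds up quickly.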
Also, remove the puts. You don't need it in your worker, and puts seriously hurts performance, even when you can't see the output!
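If some trace output is still wanted, a leveled logger is a common replacement for puts, since debug lines can be switched off without touching the code. The sketch below uses Ruby's standard Logger writing to a StringIO so it runs on its own; in a worker, Sidekiq.logger would be the natural object to use instead. process_contacts and the sample contacts are hypothetical.

```ruby
require "logger"
require "stringio"

out = StringIO.new          # stands in for the real log file
logger = Logger.new(out)
logger.level = Logger::INFO # debug messages are dropped at this level

# Hypothetical processing loop; the database work is left out.
def process_contacts(contacts, logger)
  contacts.each do |c|
    # Block form: the message string is only built if debug is enabled.
    logger.debug { "processing #{c['name']}" }
  end
  logger.info("processed #{contacts.size} contacts")
end

process_contacts([{ "name" => "Alice" }, { "name" => "Bob" }], logger)
puts out.string
```

Only the summary line reaches the log here, so the per-contact messages cost almost nothing while the level stays at INFO.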