High CPU usage when generating PDFs in Rails with the Wicked_PDF gem
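I am trying to generate PDF files with Rails, but when I do, I notice my system's CPU starts to max out. Initially it climbs from ~2.5% to ~65%-80% and stays there for a while, then finally maxes out just before the PDF is displayed in the iframe on my page. Here are some of the messages I received while monitoring my system's resource usage: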
Warning or critical alerts (lasts 9 entries)
2017-06-09 14:58:07 (0:00:04) - CRITICAL on CPU_SYSTEM (100.0)
2017-06-09 14:58:04 (0:00:13) - CRITICAL on CPU_USER (Min:72.8 Mean:83.3 Max:93.7)
2017-06-09 14:47:39 (0:00:06) - CRITICAL on CPU_USER (93.0)
2017-06-09 14:47:29 (0:00:04) - WARNING on CPU_SYSTEM (74.7)
2017-06-09 14:36:48 (0:00:04) - CRITICAL on CPU_SYSTEM (100.0)
2017-06-09 14:36:45 (0:00:10) - CRITICAL on CPU_IOWAIT (Min:78.6 Mean:85.7 Max:97.4)
2017-06-09 14:18:06 (0:00:04) - CRITICAL on CPU_SYSTEM (94.3)
2017-06-09 14:18:06 (0:00:07) - CRITICAL on CPU_USER (91.0)
2017-06-09 15:01:14 2017-06-09 14:17:44 (0:00:04) - WARNING on CPU_SYSTEM (73.8)
The gems I have installed for PDF generation are wicked_pdf (1.0.6) and wkhtmltopdf-binary-edge (0.12.4.0). The code for each step is as follows:
controllers/concerns/pdf_player_reports.rb
def director_report_pdf
  @players = Player.where(id: params["player_ids"])
  respond_to do |format|
    format.html
    format.pdf do
      render pdf: "#{params['pdf_title']}",
             template: 'players/director_summary_report.pdf.erb',
             layout: 'print',
             show_as_html: params.key?('debug'),
             window_status: 'Loading...',
             disable_internal_links: true,
             disable_external_links: true,
             dpi: 75,
             disable_javascript: true,
             margin: { top: 7, bottom: 7, left: 6, right: 0 },
             encoding: 'utf8'
    end
  end
end
players/director_summary_report.pdf.erb
<div class="document" style="margin-top: -63px;">
  <% @players.each do |player| %>
    <% reports = player.reports.order(created_at: :desc) %>
    <%# With the explicit partial: key, locals have to be passed through locals: %>
    <% if player.is_college_player? %>
      <%= render partial: 'college_director_report.html.erb', locals: { player: player } %>
    <% else %>
      <%= render partial: 'pro_director_report.html.erb', locals: { player: player } %>
    <% end %>
    <%= "<div class='page-break'></div>".html_safe %>
  <% end %>
</div>
college_director_report.html.erb
<%= wicked_pdf_stylesheet_link_tag "application", media: "all" %>
<%= wicked_pdf_javascript_include_tag "application" %>
<% provide(:title, "#{player.football_name}") %>
<% self.formats = [:html, :pdf, :css, :coffee, :scss] %>
<style>
  thead { display: table-row-group; page-break-inside: avoid }
  tfoot { display: table-row-group; }
  /*thead:before, thead:after { display: none; }*/
  table { page-break-inside: avoid; }
  tr { page-break-inside: avoid; }
  .page-break {
    display:block; clear:both; page-break-after:always;
  }
  .keep-together { page-break-before: always !important; }
  .table-striped>tbody>tr:nth-child(odd)>td,
  tr.found{
    background-color:#e2e0e0 !important;
  }
</style>
<div class="row">
  <div class="col-xs-6">
    <span>DIRECTOR SUMMARY</span>
  </div>
  <div class="col-xs-6 text-right">
    <%= "#{player.full_name} / #{player.school.short_name}".upcase %>
    <h1><%= "#{player.full_name(true)} (#{player.school.code})".upcase %></h1>
  </div>
</div>
<div class="row">
  <div class="col-xs-12">
    <%= render 'directors_report_player_header', player: player %>
    <%= render 'directors_report_workouts', player: player %>
    <%= render 'directors_report_grades', player: player %>
    <%= render 'legacy_directors_report_contacts', player: player %>
  </div>
</div>
directors_report_player_header.html.erb
<table class="table table-condensed table-bordered">
  <thead>
    <tr>
      <th>Name</th>
      <th>School</th>
      <th>#</th>
      <th>Position</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><%= player.full_name(true) %></td>
      <td><%= player.school.short_name %></td>
      <td><%= player.jersey %></td>
      <td><%= player.position.abbreviation %></td>
    </tr>
  </tbody>
</table>
Update
I ran a sample PDF generation using only the following content, and the CPU% still ended up maxing out:
<table class="table table-condensed">
  <thead>
    <th>Number</th>
  </thead>
  <tbody>
    <% (1..60000).each do |number| %>
      <tr>
        <td><%= number %></td>
      </tr>
    <% end %>
  </tbody>
</table>
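To get a feel for how much markup that loop hands to wkhtmltopdf, the same table can be rendered with plain ERB outside Rails. This is only a rough sketch: `TEMPLATE` and `render_table` are names made up for illustration, and it measures the size of the generated HTML, not the cost of the PDF conversion itself.

```ruby
require "erb"

# Same shape as the sample table above, parameterised by row count.
TEMPLATE = <<~HTML
  <table class="table table-condensed">
    <thead><tr><th>Number</th></tr></thead>
    <tbody>
      <% (1..rows).each do |number| %>
      <tr><td><%= number %></td></tr>
      <% end %>
    </tbody>
  </table>
HTML

# Render the sample table for a given row count using plain ERB.
def render_table(rows)
  ERB.new(TEMPLATE).result_with_hash(rows: rows)
end

html = render_table(60_000)
puts "rows:  #{html.scan('<tr><td>').size}"
puts "bytes: #{html.bytesize}"
```

Every one of those rows has to be laid out and paginated by the converter, so trying smaller row counts is a quick way to see how conversion time scales on a given machine.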
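Putting this in a controller seems ill-advised: the moment you deploy it, each request will take a long time to generate the PDF and will block other incoming requests for other pages.
You should split this into two concerns: one step that generates the HTML, which could be this controller, and a background task that converts that HTML to PDF.
In your controller, trigger a job using DelayedJob or something similar, then render a page that polls until the job has completed.
Then in your background job you are dealing only with the task of rendering the HTML to a PDF, rather than doing it inside a web request. Something along these lines: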
class RendersReportPdf
  def self.call(player_ids)
    # Render the report HTML outside of a web request
    html = ReportsController.render :director_report_pdf, assigns: { players: Player.where(id: player_ids) }
    pdf = WickedPdf.new.pdf_from_string(html)
    # Passing [basename, extension] gives the temp file a real .pdf suffix
    temp = Tempfile.new([Time.now.to_i.to_s, '.pdf'])
    temp.binmode
    temp.write(pdf)
    temp.close
    temp.path
    # Probably upload this to S3 or similar at this point
    # Notify the user that it's now available somehow
  end
end
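If you do this, you take the work of running WickedPDF out of your controller action, and you also make sure your site stays responsive while long-running jobs are in progress.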
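The enqueue-then-poll flow can be sketched in plain Ruby. This is only an illustration under stated assumptions: a `Thread` stands in for a DelayedJob or ActiveJob worker, `JOB_STATUS` is a hypothetical in-memory status store (a real app would more likely use a database row or a cache), and `slow_render` is a placeholder for the WickedPdf call.

```ruby
require "securerandom"

# Hypothetical in-memory status store; a real app would persist this
# somewhere both the web process and the worker can reach.
JOB_STATUS = {}
JOB_LOCK = Mutex.new

# Placeholder for the expensive HTML-to-PDF conversion.
def slow_render(player_ids)
  sleep 0.05
  "PDF for players #{player_ids.join(', ')}"
end

# "Controller" side: start the work and return a job id right away.
def enqueue_report(player_ids)
  job_id = SecureRandom.hex(8)
  JOB_LOCK.synchronize { JOB_STATUS[job_id] = { state: :pending } }
  Thread.new do
    result = slow_render(player_ids)
    JOB_LOCK.synchronize { JOB_STATUS[job_id] = { state: :done, result: result } }
  end
  job_id
end

# "Polling endpoint" side: report the current state without blocking.
def report_status(job_id)
  JOB_LOCK.synchronize { JOB_STATUS.fetch(job_id, { state: :unknown }).dup }
end

job_id = enqueue_report([1, 2, 3])
sleep 0.01 until report_status(job_id)[:state] == :done
puts report_status(job_id)[:result]
```

The point of the split is that `enqueue_report` returns immediately, so the request that triggers the report never waits on the conversion; the page only asks `report_status` until the state flips to `:done`.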
I wanted to post my solution for future visitors. It is based on @stef's answer. Thanks, stef!
controllers/concerns/players_controller.rb
def generate_report_pdf
  players = print_settings(params)
  pdf_title = "#{params['pdf_title']} - #{Time.now.strftime("%c")}"
  GeneratePdfJob.perform_later(players.pluck(:id), pdf_title, current_user.code, params["format"])
end
app/jobs/generate_pdf_job.rb
def perform(*args)
  player_ids = args[0]
  pdf_title = args[1]
  user_code = args[2]
  report_type = args[3]
  generate_pdf_document(player_ids, pdf_title, user_code, report_type)
end

def generate_pdf_document(ids, pdf_title, user_code, report_type)
  # select the proper template by the report type specified
  case report_type
  when "Labels"
    html = ApplicationController.new.render_to_string(
      template: 'players/board_labels.pdf.erb',
      locals: { player_ids: ids },
      margin: { top: 6, bottom: 0, left: 32, right: 32 }
    )
  when "Reports"
    # ...
  end
  # hand the rendered HTML to save_to_pdf (defined below) to write the file
  save_to_pdf(html, pdf_title, user_code)
end

def save_to_pdf(html, pdf_title, user_code)
  pdf = WickedPdf.new.pdf_from_string(
    html,
    pdf: "#{pdf_title}",
    layout: 'print',
    disable_internal_links: true,
    disable_external_links: true,
    disable_javascript: true,
    encoding: 'utf-8'
  )
  pdf_name = "#{pdf_title}.pdf"
  pdf_dir = Rails.root.join('public', 'uploads', 'reports', user_code.to_s)
  pdf_path = pdf_dir.join(pdf_name)
  # create the folder if it doesn't exist (mkdir_p is a no-op when it does)
  FileUtils.mkdir_p(pdf_dir)
  # write the PDF bytes; 'wb' already opens the file in binary mode
  File.open(pdf_path, 'wb') do |file|
    file << pdf
  end
end
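With this in place, I use an AJAX call to keep checking the user's directory for new files and update a partial that lists the files in that directory. The only thing I don't like is that I now need a table listing the user's files. I would rather deliver the file straight to the client's browser for download, but I haven't figured out how to get that working yet.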
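The directory check behind that AJAX call could look roughly like the following. This is a sketch, not the code from my app: `new_reports` is a hypothetical helper, and the temporary directory only stands in for the per-user folder under public/uploads/reports.

```ruby
require "fileutils"
require "tmpdir"

# Return the names of PDF files in `dir` modified at or after `since`,
# newest first.
def new_reports(dir, since)
  Dir.glob(File.join(dir, "*.pdf"))
     .select { |path| File.mtime(path) >= since }
     .sort_by { |path| File.mtime(path) }
     .reverse
     .map { |path| File.basename(path) }
end

Dir.mktmpdir do |dir|
  checked_at = Time.now - 60
  # One stale report, one fresh report, and one unrelated file.
  FileUtils.touch(File.join(dir, "old.pdf"), mtime: checked_at - 3600)
  FileUtils.touch(File.join(dir, "fresh.pdf"))
  FileUtils.touch(File.join(dir, "notes.txt"))
  p new_reports(dir, checked_at)
end
```

A polling endpoint would pass the time of the client's last check as `since` and return the resulting list as JSON, so the page only needs to redraw when something new shows up.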