High CPU usage when generating PDFs in Rails with the Wicked_PDF gem

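I'm trying to generate a PDF file with Rails, but when I do, my system's CPU starts to max out. It first climbs from ~2.5% to ~65%-80% and stays there for a steady period, then maxes out right before the PDF is displayed in an iframe on my page. Here are some of the messages I get while monitoring resource usage on my system: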

Warning or critical alerts (lasts 9 entries)
                          2017-06-09 14:58:07 (0:00:04) - CRITICAL on CPU_SYSTEM (100.0)
                          2017-06-09 14:58:04 (0:00:13) - CRITICAL on CPU_USER (Min:72.8 Mean:83.3 Max:93.7)
                          2017-06-09 14:47:39 (0:00:06) - CRITICAL on CPU_USER (93.0)
                          2017-06-09 14:47:29 (0:00:04) - WARNING on CPU_SYSTEM (74.7)
                          2017-06-09 14:36:48 (0:00:04) - CRITICAL on CPU_SYSTEM (100.0)
                          2017-06-09 14:36:45 (0:00:10) - CRITICAL on CPU_IOWAIT (Min:78.6 Mean:85.7 Max:97.4)
                          2017-06-09 14:18:06 (0:00:04) - CRITICAL on CPU_SYSTEM (94.3)
                          2017-06-09 14:18:06 (0:00:07) - CRITICAL on CPU_USER (91.0)
2017-06-09 15:01:14       2017-06-09 14:17:44 (0:00:04) - WARNING on CPU_SYSTEM (73.8)

The gems I have installed for PDF generation are wicked_pdf (1.0.6) and wkhtmltopdf-binary-edge (0.12.4.0). The code for each part of the process is as follows:

controllers/concerns/pdf_player_reports.rb

def director_report_pdf
  @players = Player.where(id: params["player_ids"])

  respond_to do |format|
    format.html
    format.pdf do
      render pdf: "#{params['pdf_title']}",
             template: 'players/director_summary_report.pdf.erb',
             layout: 'print',
             show_as_html: params.key?('debug'),
             window_status: 'Loading...',
             disable_internal_links: true,
             disable_external_links: true,
             dpi: 75,
             disable_javascript: true,
             margin: { top: 7, bottom: 7, left: 6, right: 0 },
             encoding: 'utf8'
    end
  end
end

players/director_summary_report.pdf.erb

<div class="document" style="margin-top: -63px;">
  <% @players.each do |player| %>
     <% reports = player.reports.order(created_at: :desc) %>
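     <%# with the partial: form, local variables have to be passed through locals: %>
     <% if player.is_college_player? %>
       <%= render partial: 'college_director_report.html.erb', locals: { player: player } %>
     <% else %>
       <%= render partial: 'pro_director_report.html.erb', locals: { player: player } %>
     <% end %>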
     <%= "<div class='page-break'></div>".html_safe %>
  <% end %>
</div>

college_director_report.html.erb

<%= wicked_pdf_stylesheet_link_tag "application", media: "all" %>
<%= wicked_pdf_javascript_include_tag "application" %>
<% provide(:title, "#{player.football_name}") %>
<% self.formats = [:html, :pdf, :css, :coffee, :scss] %>

<style>
    thead { display: table-row-group; page-break-inside: avoid }
    tfoot { display: table-row-group; }
    /*thead:before, thead:after { display: none; }*/
    table { page-break-inside: avoid; }
    tr { page-break-inside: avoid; }
    .page-break {
        display:block; clear:both; page-break-after:always;
    }
    .keep-together { page-break-before: always !important; }
    .table-striped>tbody>tr:nth-child(odd)>td,
    tr.found{
        background-color:#e2e0e0 !important;
    }
</style>

<div class="row">
    <div class="col-xs-6">
        <span>DIRECTOR SUMMARY</span>
    </div>
    <div class="col-xs-6 text-right">
        <%= "#{player.full_name} / #{player.school.short_name}".upcase %>
        <h1><%= "#{player.full_name(true)} (#{player.school.code})".upcase %></h1>
    </div>
</div>

<div class="row">
  <div class="col-xs-12">
    <%= render 'directors_report_player_header', player: player %>
    <%= render 'directors_report_workouts', player: player %>
    <%= render 'directors_report_grades', player: player %>
    <%= render 'legacy_directors_report_contacts', player: player %>
  </div>
</div>

directors_report_player_header.html.erb

<table class="table table-condensed table-bordered">
    <thead>
        <tr>
            <th>Name</th>
            <th>School</th>
            <th>#</th>
            <th>Position</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td><%= player.full_name(true) %></td>
            <td><%= player.school.short_name %></td>
            <td><%= player.jersey %></td>
            <td><%= player.position.abbreviation %></td>
        </tr>
    </tbody>
</table>

Update

I ran a sample PDF generation with the following content, and the CPU% ended up maxing out in the same way as shown above...

 <table class="table table-condensed">
    <thead>
      <th>Number</th>
    </thead>
    <tbody>
      <% (1..60000).each do |number| %>
        <tr>
          <td><%= number %></td>
        </tr>
      <% end %>
    </tbody>
  </table>

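Putting this in a controller seems ill-advised: the moment you deploy it, the request will take a long time to generate and will block other incoming requests for other pages.

You should split this into two concerns: one step that generates the HTML, which could be this controller, and then a background task that converts that HTML to PDF.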
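One way to see how the cost divides between those two concerns is to time the HTML stage on its own. The following is only a rough sketch in plain Ruby, using the standard library's ERB rather than the Rails view layer; the `render_rows_html` helper and the `TEMPLATE` constant are made up for illustration, and the template simply mirrors the 60,000-row sample table from the question.

```ruby
require 'erb'

# Same shape as the 60,000-row sample table from the question.
TEMPLATE = <<~'TEMPLATE'
  <table class="table table-condensed">
    <thead>
      <tr><th>Number</th></tr>
    </thead>
    <tbody>
      <% (1..row_count).each do |number| %>
        <tr><td><%= number %></td></tr>
      <% end %>
    </tbody>
  </table>
TEMPLATE

# Stage 1 only: build the HTML string. No PDF conversion happens here.
# (Hypothetical helper, not part of wicked_pdf or Rails.)
def render_rows_html(row_count)
  ERB.new(TEMPLATE).result_with_hash(row_count: row_count)
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
html = render_rows_html(60_000)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts format('HTML stage: %.3fs for %d bytes', elapsed, html.bytesize)
```

If this stage turns out to be quick, the remaining time is being spent in the HTML-to-PDF conversion, which is the part worth moving out of the web request.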

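In your controller, trigger a job using DelayedJob or similar, then render a page that polls for the completed job.

Then in your background job, you're dealing only with the task of rendering HTML to PDF, and not doing it inside a web request. Something along these lines: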

class RendersReportPdf
  def self.call(player_ids)
    html = ReportsController.render :director_report_pdf, assigns: { players: Player.where(id: player_ids) }
    pdf = WickedPdf.new.pdf_from_string(html)
    # Tempfile takes [basename, extension] so the file really ends in .pdf
    temp = Tempfile.new([Time.now.to_i.to_s, '.pdf'])
    temp.binmode # the PDF is binary data
    temp.write(pdf)
    temp.close
    # Probably upload this to S3 or similar at this point
    # Notify the user that it's now available somehow
    temp.path
  end
end

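If you do it this way, you can rule out running WickedPDF inside your controller action as the problem, and you also make sure your site isn't held up by long-running requests.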
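The "trigger a job, then poll for it" flow can be sketched without Rails at all. The `ReportJobs` class below is a hypothetical in-memory stand-in for a real backend such as DelayedJob, and the `/tmp/report.pdf` path and the `sleep` are placeholders for the actual conversion; a real app would keep the status in the database or a cache so that every web process can see it.

```ruby
require 'securerandom'

# Hypothetical in-memory stand-in for a job backend such as DelayedJob.
class ReportJobs
  def initialize
    @status = {}
    @mutex = Mutex.new
  end

  # Returns a job id immediately; the slow work runs off the caller's thread.
  def enqueue(&work)
    id = SecureRandom.hex(8)
    set(id, state: :pending)
    Thread.new do
      begin
        set(id, state: :done, result: work.call)
      rescue StandardError => e
        set(id, state: :failed, error: e.message)
      end
    end
    id
  end

  # What a polling endpoint would return for a given job id.
  def status(id)
    @mutex.synchronize { @status.fetch(id, { state: :unknown }) }
  end

  private

  def set(id, attrs)
    @mutex.synchronize { @status[id] = attrs }
  end
end

jobs = ReportJobs.new
# Placeholder work: pretend the PDF conversion takes a moment.
job_id = jobs.enqueue { sleep 0.2; '/tmp/report.pdf' }
# Stand-in for the polling page: keep asking until the job leaves :pending.
sleep 0.05 while jobs.status(job_id)[:state] == :pending
puts jobs.status(job_id)
```

The point of the design is that `enqueue` returns right away, so the web request that triggered the report finishes quickly and only the cheap `status` check is repeated.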

I wanted to post my solution for future visitors. It's based on @stef's solution, so thank you stef!

controllers/concerns/players_controller.rb

  def generate_report_pdf
    players = print_settings(params)
    pdf_title = "#{params['pdf_title']} - #{Time.now.strftime("%c")}"
    GeneratePdfJob.perform_later(players.pluck(:id), pdf_title, current_user.code, params["format"])
  end

app/jobs/generate_pdf_job.rb

  def perform(player_ids, pdf_title, user_code, report_type)
    generate_pdf_document(player_ids, pdf_title, user_code, report_type)
  end

  def generate_pdf_document(ids, pdf_title, user_code, report_type)
    # select the proper template by the report type specified
    html =
      case report_type
      when "Labels"
        ApplicationController.new.render_to_string(
          template: 'players/board_labels.pdf.erb',
          locals: { player_ids: ids },
          margin: { top: 6, bottom: 0, left: 32, right: 32 }
        )
      when "Reports"
        # ...
      end

    # write the rendered HTML out as a PDF (see save_to_pdf below)
    save_to_pdf(html, pdf_title, user_code)
  end

  def save_to_pdf(html, pdf_title, user_code)

    pdf = WickedPdf.new.pdf_from_string(
                          html,
                          pdf: "#{pdf_title}",
                       layout: 'print',
       disable_internal_links: true,
       disable_external_links: true,
           disable_javascript: true,
                     encoding: 'utf-8'
      )

    pdf_name = "#{pdf_title}.pdf"
    pdf_dir = Rails.root.join('public','uploads','reports',"#{user_code}")
    pdf_path = Rails.root.join(pdf_dir,pdf_name)

    # create the folder if it doesn't exist
    FileUtils.mkdir_p(pdf_dir) unless File.directory?(pdf_dir)

    # create a new file
    File.open(pdf_path,'wb') do |file|
      file.binmode
      file << pdf.force_encoding("UTF-8")
    end

  end

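With this in place, I use an ajax call to keep checking for new files in the user's directory and to update a partial that lists the files in that directory. The only thing I don't like is that I now have to keep a table listing the user's files. I would rather deliver the file to the client's browser as a download, but I haven't figured out how to get that working.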
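For reference, the directory check behind that ajax call could look roughly like the sketch below. It is plain Ruby and only an assumption about how such a lookup might be written: `report_files` is a hypothetical helper name, and `reports_root` stands in for the `public/uploads/reports` directory used in `save_to_pdf`.

```ruby
require 'tmpdir'
require 'fileutils'

# Lists the PDFs in a user's report directory, newest first.
# Hypothetical helper: reports_root/user_code mirrors the
# public/uploads/reports/<user_code> layout used by save_to_pdf.
def report_files(reports_root, user_code)
  dir = File.join(reports_root, user_code.to_s)
  return [] unless File.directory?(dir)

  Dir.glob(File.join(dir, '*.pdf'))
     .sort_by { |path| File.mtime(path) }
     .reverse
     .map { |path| { name: File.basename(path), bytes: File.size(path) } }
end

# Usage with a throwaway directory standing in for public/uploads/reports.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'abc'))
  File.binwrite(File.join(root, 'abc', 'labels.pdf'), '%PDF-1.4')
  p report_files(root, 'abc')
  p report_files(root, 'nobody')
end
```

The polling endpoint would return this list, and the page would re-render the file list whenever a new entry shows up.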