SpamAssassin rules explanation

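I have run into a small problem while using SpamAssassin: I cannot find the documentation for its rules.

For example, for the rule MIME_HTML_MOSTLY I have this link: https://wiki.apache.org/spamassassin/Rules/MIME_HTML_MOSTLY, but the documentation is apparently no longer available there, and I have not found a new link.

Can you help me find the new wiki link?

Thanks in advance.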

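Not all rules are documented on the SpamAssassin wiki; there are too many rules for that to be feasible. You can get automated efficacy data for MIME_HTML_MOSTLY from the SpamAssassin Rule QA system, but not the rule's definition.

The current definition of the rule (discounting translations), from rules/20_body_tests.cf, is: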

# … line 139 (quite likely to change)
body MIME_HTML_MOSTLY       eval:check_mime_multipart_ratio('0.00','0.01')
describe MIME_HTML_MOSTLY   Multipart message mostly text/html MIME
# … rules/50_scores.cf line 616 (also quite likely to change)
score MIME_HTML_MOSTLY 0.1

This is an eval rule, so you have to look at the Perl code to see exactly what it does.

In lib/Mail/SpamAssassin/Plugin/MIMEEval.pm you will find:

# … line 214
sub check_mime_multipart_ratio {
  my ($self, $pms, undef, $min, $max) = @_;

  $self->_check_attachments($pms) unless exists $pms->{mime_checked_attachments};
  return 0 unless exists $pms->{mime_multipart_ratio};
  return ($pms->{mime_multipart_ratio} >= $min &&
      $pms->{mime_multipart_ratio} < $max);
}

# … line 491
    if (defined($text) && defined($html) && $html > 0) {
      $pms->{mime_multipart_ratio} = ($text / $html);
    }

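This means the rule fires when the ratio of the length of the text MIME part to the length of the HTML MIME part is greater than or equal to zero and less than 0.01 (1%). If no ratio was computed, for instance because the HTML part is empty, the rule does not fire.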
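To make the arithmetic concrete, here is a minimal Python re-statement of the two Perl fragments above. It is only a sketch, not SpamAssassin code: the helper names `multipart_ratio` and `mime_html_mostly` are hypothetical, and the part lengths are assumed to be handed in directly rather than extracted from a real message.

```python
from typing import Optional


def multipart_ratio(text_len: Optional[int], html_len: Optional[int]) -> Optional[float]:
    """Mirror of the Perl fragment: the ratio exists only if both
    lengths are defined and the HTML length is greater than zero."""
    if text_len is None or html_len is None or html_len <= 0:
        return None
    return text_len / html_len


def check_mime_multipart_ratio(ratio: Optional[float], min_ratio: float, max_ratio: float) -> bool:
    """Mirror of the eval function: false when no ratio was computed,
    otherwise true when min_ratio <= ratio < max_ratio."""
    if ratio is None:
        return False
    return min_ratio <= ratio < max_ratio


def mime_html_mostly(text_len: Optional[int], html_len: Optional[int]) -> bool:
    # Hypothetical wrapper using the bounds from the rule definition:
    # check_mime_multipart_ratio('0.00', '0.01')
    return check_mime_multipart_ratio(multipart_ratio(text_len, html_len), 0.00, 0.01)


if __name__ == "__main__":
    # 50 characters of text against 10000 of HTML: ratio 0.005, rule fires.
    print(mime_html_mostly(50, 10000))
    # 100 against 10000: ratio is exactly at the upper bound, which is excluded.
    print(mime_html_mostly(100, 10000))
    # Empty HTML part: no ratio is computed, so the rule cannot fire.
    print(mime_html_mostly(500, 0))
```

Note that an empty text part next to a non-empty HTML part gives a ratio of zero, which still falls inside the range, so such a message would match.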

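(The line numbers are from the current trunk repository, not from a release. The code is unlikely to change much, but the line numbers may, especially in the .cf files.)

This is what SpamAssassin support answered me: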

The wiki was mostly migrated to the ASF Confluence instance recently and is now at https://cwiki.apache.org/confluence/display/SPAMASSASSIN/. The old rules descriptions (which had not been maintained since v3.3) were not migrated, as they were largely outdated where they were not redundant.

I don't have a definitive reference for the decision to stop maintaining rule descriptions on the wiki, so there may be a more correct explanation out there in the heads of the people who were on the PMC at the time. However, my view is that this was the right decision because of how the default rules are managed. Rules can shift in and out of the update channel based on the automated QA process, and there is a continuous trickle of new rules, rule changes, and rule deletions coming from the development team that get integrated (or not) via RuleQA. There was never a functional process for maintaining the wiki pages for rules properly in conjunction with that continuous change process, and the descriptions were mostly not much more illuminating than the 'describe' lines in the rules files.