Log4Net / LogEntries brought down our production site

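Something really scary happened to us this morning. Our QA team (who, luckily, start work before the US business day begins) reported that our production website had suddenly gone down. We weren't doing anything unusual at the time; the failure was sudden and hit all of our environments at once. I logged on to the production web server (IIS) and found that the application pool had stopped. I restarted it, and it immediately crashed again. I checked the Windows Event Viewer and found the following errors in the log: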

Source: .NET Runtime

Application: w3wp.exe

Framework Version: v4.0.30319

Description: The process was terminated due to an unhandled exception.

Exception Info: System.Security.Authentication.AuthenticationException
Stack:
   at System.Net.Security.SslState.ForceAuthentication(Boolean, Byte[], System.Net.AsyncProtocolRequest)
   at System.Net.Security.SslState.ProcessAuthentication(System.Net.LazyAsyncResult)
   at System.Net.Security.SslStream.AuthenticateAsClient(System.String)
   at LogentriesCore.AsyncLoggerBase.Run()
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.ThreadHelper.ThreadStart()

Source: ASP.NET 4.0.30319.0

An unhandled exception occurred and the process was terminated.

Application ID: /LM/W3SVC/1/ROOT

Process ID: 10156

Exception: System.Security.Authentication.AuthenticationException

Message: The remote certificate is invalid according to the validation procedure.

StackTrace: at System.Net.Security.SslState.StartSendAuthResetSignal(ProtocolToken message, AsyncProtocolRequest asyncRequest, Exception exception) at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest) 
at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.ForceAuthentication(Boolean receiveFirst, Byte[] buffer, AsyncProtocolRequest asyncRequest) at System.Net.Security.SslState.ProcessAuthentication(LazyAsyncResult lazyResult) at System.Net.Security.SslStream.AuthenticateAsClient(String targetHost) at LogentriesCore.AsyncLoggerBase.Run() at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.ThreadHelper.ThreadStart()

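This stack trace contains none of our own code; the only part that meant anything to me was the Logentries bit. So I immediately opened the web.config file and commented out the following line in the Log4Net configuration: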

<appender-ref ref="LeAppender" />

where LeAppender had previously been defined as

<appender name="LeAppender" type="log4net.Appender.LogentriesAppender, LogentriesLog4net">
  <UseHttp value="false" />
  <UseSsl value="true" />
  <layout type="log4net.Layout.PatternLayout">
    <conversionPattern value="%date %-5level %logger [host=%P{log4net:HostName}:appdomain=%appdomain:referenceid=%property{referenceid}] - %message%newline" />
  </layout>
</appender>

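I restarted the website and everything came back up, with our customers none the wiser. We were very lucky that all of this happened and was resolved before business hours (but it was a close call!).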

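This completely freaked me out. As I see it, Log4Net should be a completely robust framework that cannot possibly do any damage. It looks to me as though Logentries' certificate expired (or something similar), and the immediate result was that our production website crashed!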

Is there something wrong with our Log4Net/Logentries setup?

Initialization (I can't see what could be wrong with this):

XmlConfigurator.Configure();

The logger itself is an ILog, assigned like this:

_log = LogManager.GetLogger("Logger");

And logging is done like this:

_log.Logger.Log(typeof(MyLogger), level, message, ex);

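Nothing untoward here. It had been working fine for over a year. Also, as I mentioned earlier, the stack trace has no references to our own code, so this is probably not due to any bad code that we wrote.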

Any idea what went wrong, how to fix it, and how to make sure an outage like this cannot bring down our system again?

"On the 14th of September new SSL certificates were issued for api.logentries.com. Due to constraints these certificates do not match older certificates that were previously provided for api.logentries.com. We advise all Users who are currently embedding old api.logentries.com certificates to switch to use your systems cert pool. For help on changing your setup please contact support@logentries.com"

This message should have been displayed the first time you logged in to Logentries today. That is why your problem started today.
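To illustrate what the notice means by embedding a certificate versus using the system cert pool, here is a rough sketch of the two validation styles with SslStream. I have not checked how the old Logentries package actually validated the certificate, so treat this as an assumption-laden illustration only; CertChecks, Pinned and SystemPool are hypothetical names, not part of any Logentries or log4net API.

```csharp
using System;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper for illustration; not Logentries code.
public static class CertChecks
{
    // "Embedded certificate" style: accept the server only if its certificate
    // thumbprint equals a value shipped with the client. This breaks as soon
    // as the server is issued a different certificate.
    public static RemoteCertificateValidationCallback Pinned(string expectedThumbprint)
    {
        return (sender, certificate, chain, errors) =>
            certificate != null &&
            string.Equals(new X509Certificate2(certificate).Thumbprint,
                          expectedThumbprint,
                          StringComparison.OrdinalIgnoreCase);
    }

    // "System cert pool" style: accept whatever the operating system's
    // trust store has already validated without errors.
    public static bool SystemPool(object sender, X509Certificate certificate,
                                  X509Chain chain, SslPolicyErrors errors)
    {
        return errors == SslPolicyErrors.None;
    }
}
```

Either callback can be passed as the third argument of the SslStream(Stream, bool, RemoteCertificateValidationCallback) constructor. If the callback rejects the certificate, AuthenticateAsClient throws an AuthenticationException, which is the exception type in your event log.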

If you upgrade both your log4net package and your Logentries package to the latest versions, your problem will be solved.
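As for why a logging problem could take down the whole application pool: your stack trace ends in System.Threading.ThreadHelper.ThreadStart(), so LogentriesCore.AsyncLoggerBase.Run() was running on its own thread, and since .NET 2.0 an exception that escapes any thread terminates the process by default. The sketch below shows the general defensive pattern a background worker can use to contain such a failure. It is not the Logentries implementation; SafeWorker and its parameters are hypothetical names, and the exception is simulated.

```csharp
using System;
using System.Security.Authentication;
using System.Threading;

// Hypothetical helper for illustration; not part of log4net or Logentries.
public static class SafeWorker
{
    // Runs 'work' on a background thread and hands any exception to 'onError'
    // instead of letting it escape the thread.
    public static Thread Start(Action work, Action<Exception> onError)
    {
        var thread = new Thread(() =>
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                // An exception escaping this lambda would terminate the whole
                // process (w3wp.exe in the question), so it is contained here.
                onError(ex);
            }
        });
        thread.IsBackground = true;
        thread.Start();
        return thread;
    }
}

public static class Program
{
    public static void Main()
    {
        Exception captured = null;

        // Simulate the TLS handshake failure seen in the event log.
        var worker = SafeWorker.Start(
            () => { throw new AuthenticationException("simulated certificate failure"); },
            ex => captured = ex);

        worker.Join();
        Console.WriteLine(captured != null
            ? "contained: " + captured.Message
            : "no error");
    }
}
```

You cannot apply this to the library's internal thread from your own code, which is why upgrading the packages is the practical fix; the sketch only shows the mechanism behind the crash.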