What is bulkheading in computer science?

I am reading about dispatchers in Akka, and I learned that they are used for bulkheading. What is bulkheading in computer science?

Solution: Dedicated dispatcher for blocking operations

One of the most efficient methods of isolating the blocking behaviour such that it does not impact the rest of the system is to prepare and use a dedicated dispatcher for all those blocking operations. This technique is often referred to as “bulk-heading” or simply “isolating blocking”.
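To make the idea concrete outside of Akka, here is a minimal sketch using only java.util.concurrent. It is not Akka's dispatcher API: the class name BulkheadSketch, the pool sizes, and the latch that stands in for a slow call are all assumptions made for illustration. One pool is reserved for blocking work, and a second pool serves everything else, so saturating the first does not delay the second.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: two separate thread pools act as bulkheads between workloads.
class BulkheadSketch {

    // Returns true if a quick task finishes while every blocking thread is stuck.
    static boolean fastWorkSurvivesBlocking() {
        ExecutorService blockingPool = Executors.newFixedThreadPool(2); // reserved for blocking calls
        ExecutorService defaultPool = Executors.newFixedThreadPool(2);  // everything else
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Saturate the blocking pool with more stuck tasks than it has threads.
            for (int i = 0; i < 4; i++) {
                blockingPool.submit(() -> {
                    release.await(); // stands in for a slow database or network call
                    return null;
                });
            }
            // Non-blocking work has its own threads, so it is not queued behind the stuck tasks.
            Future<Integer> quick = defaultPool.submit(() -> 1 + 1);
            return quick.get(1, TimeUnit.SECONDS) == 2;
        } catch (Exception e) {
            return false; // timeout or failure: the quick task was starved
        } finally {
            release.countDown(); // let the stuck tasks finish
            blockingPool.shutdown();
            defaultPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fast work completed: " + fastWorkSurvivesBlocking());
    }
}
```

If both kinds of task shared a single two-thread pool, the quick task would sit in the queue behind the stuck ones until they were released. In Akka the same separation would be expressed by giving the blocking actors their own dispatcher rather than by creating executors by hand.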

To quote Jonas Bonér from his keynote address of April 2016:

Isolation of failure—being able to contain and manage failure without having it cascade—is a pattern sometimes referred to as Bulkheading.

Bulkheading has been used in the ship construction industry for centuries as a way to divide the ship into isolated watertight compartments, so that if a few compartments are filled up with water, the leak does not spread and the ship can continue to function and reach its destination.

Resilience—the ability to heal from failure—depends on compartmentalization and containment of failure, and can only be achieved by breaking free from the strong coupling of synchronous communication.


In an Akka system, bulkheading is typically achieved through dispatcher tuning, as Jamie Allen describes in a blog post. Here is an excerpt:

One of the biggest questions I encounter among users of Akka is how to use dispatchers to create failure zones and prevent failure in one part of the application from affecting another. This is sometimes called the Bulkhead Pattern....

The key to separating actors into failure zones is to identify their risk profile. Is a task particularly dangerous, such as network IO? Is it a task that requires blocking, such as database access? In those cases, you want to isolate those actors and their threads from those doing work that is less dangerous. If something happens to a thread that results in it completely dying and not being available from the pool, isolation is your only protection so that unrelated actors aren’t affected by the diminishment of resources.

You also may want to identify areas of heavy computation through profiling, and break those tasks out using tools such as Routers (no shared mailboxes and thus no work-stealing) and BalancingDispatcher (one mailbox for all “routees”, and therefore work-stealing in nature). For those tasks that you assign to Routers, you might also want them to operate on their own dispatcher so that the intense computation tasks do not starve other actors waiting for a thread to perform their work.

The Akka documentation also describes the use of dispatchers to manage blocking.


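Besides tuning dispatchers, you can also use circuit breakers in Akka to achieve bulkheading. A circuit breaker is a configurable mechanism that prevents cascading failures. The documentation gives the following example: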

As an example, we have a web application interacting with a remote third party web service. Let’s say the third party has oversold their capacity and their database melts down under load. Assume that the database fails in such a way that it takes a very long time to hand back an error to the third party web service. This in turn makes calls fail after a long period of time. Back to our web application, the users have noticed that their form submissions take much longer seeming to hang. Well the users do what they know to do which is use the refresh button, adding more requests to their already running requests. This eventually causes the failure of the web application due to resource exhaustion. This will affect all users, even those who are not using functionality dependent on this third party web service.

Introducing circuit breakers on the web service call would cause the requests to begin to fail-fast, letting the user know that something is wrong and that they need not refresh their request. This also confines the failure behavior to only those users that are using functionality dependent on the third party, other users are no longer affected as there is no resource exhaustion. Circuit breakers can also allow savvy developers to mark portions of the site that use the functionality unavailable, or perhaps show some cached content as appropriate while the breaker is open.
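The fail-fast behaviour described above can be sketched in a few lines of plain Java. This is a simplified, single-threaded illustration and not Akka's circuit breaker: the class name CircuitBreakerSketch, its parameters, and the fallback argument are assumptions, and it leaves out the per-call timeout that a production breaker would normally enforce.

```java
import java.util.function.Supplier;

// Hypothetical, single-threaded sketch of a circuit breaker; not Akka's implementation.
class CircuitBreakerSketch {
    private final int maxFailures;         // consecutive failures that open the breaker
    private final long resetTimeoutMillis; // how long the breaker stays open
    private int failures = 0;
    private long openedAt = -1;            // -1 means the breaker is closed

    CircuitBreakerSketch(int maxFailures, long resetTimeoutMillis) {
        this.maxFailures = maxFailures;
        this.resetTimeoutMillis = resetTimeoutMillis;
    }

    // Open means calls are rejected without touching the remote service.
    boolean isOpen() {
        return openedAt >= 0 && System.currentTimeMillis() - openedAt < resetTimeoutMillis;
    }

    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // fail fast, e.g. serve cached content
        }
        try {
            // Either the breaker is closed, or the reset timeout has elapsed
            // and this is a trial call.
            T result = remoteCall.get();
            failures = 0;
            openedAt = -1; // success closes the breaker
            return result;
        } catch (RuntimeException e) {
            failures++;
            if (failures >= maxFailures) {
                openedAt = System.currentTimeMillis(); // open (or re-open) the breaker
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        CircuitBreakerSketch breaker = new CircuitBreakerSketch(2, 60_000);
        for (int i = 0; i < 5; i++) {
            String page = breaker.call(
                () -> { throw new IllegalStateException("third party is down"); },
                () -> "cached content");
            System.out.println(page + " (breaker open: " + breaker.isOpen() + ")");
        }
    }
}
```

Once the failure threshold is reached, later calls return the fallback immediately instead of waiting on the failing service, which is what keeps users of unrelated functionality from being affected by resource exhaustion.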