How dangerous is forkProcess? How can I use it safely?

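I want to play tricks with forkProcess: I want to clone my Haskell process and then let the two clones talk to each other (maybe using Cloud Haskell, so that they can even send closures).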

But I wonder how well that plays with the GHC runtime. Does anybody have experience with this?

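The documentation for forkProcess says that no other threads are copied, so I assume that all data used by those other threads will be garbage collected in the fork, which sounds good. But that means that finalizers will run in both clones, which may or may not be the right thing to do…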

I assume I can't just use it without worrying; but are there rules I can follow to make sure its use is safe?

But that means that finalizers will run in both clones, which may or may not be the right thing to do…

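Finalizers are rarely used in Haskell, and even where they are used, I would expect them to only have in-process effects. For example, a finalizer calls hClose on garbage-collected Handles, in case you forget to do so yourself. This is easy to demonstrate: the following program fails with openFile: resource exhausted (Too many open files), but if you uncomment the pure (), the Handles get garbage collected and the program completes successfully.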

import Control.Concurrent
import Control.Monad
import System.IO
import System.Mem

main :: IO ()
main = do
  rs <- replicateM 1000 $ do
    threadDelay 1000  -- not sure why this is needed; maybe to give control back
                      -- to the OS, so it can recycle the file descriptors?
    performGC
    openFile "input" ReadMode
    --pure ()
  print rs  -- force all the Handles to still be alive by this point

File descriptors are owned by the process and are copied by forkProcess, so it makes sense for each clone to close its own copies.
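To make that concrete, here is a minimal sketch (not taken from any library) in which the forked child closes its copy of a Handle while the parent keeps reading from its own copy. The helper name parentStillReads and the file name fd-demo.txt are made up for this example.

```haskell
import System.IO
import System.Posix.Process (forkProcess, getProcessStatus)

-- | Hypothetical helper: open a file, let a forked child close its copy
-- of the Handle, then read the first line from the parent's copy.
parentStillReads :: FilePath -> IO String
parentStillReads path = do
  h <- openFile path ReadMode
  pid <- forkProcess (hClose h)         -- closes only the child's descriptor
  _ <- getProcessStatus True False pid  -- block until the child has exited
  line <- hGetLine h                    -- the parent's descriptor is still open
  hClose h
  pure line

main :: IO ()
main = do
  writeFile "fd-demo.txt" "hello\n"     -- hypothetical scratch file
  parentStillReads "fd-demo.txt" >>= putStrLn
```

The parent still prints hello even though the child has already closed the Handle, because each process closes only its own descriptor.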

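The problematic case would be a finalizer that cleans up a resource owned by the system, e.g. one that deletes a file. But I would hope that no library relies on finalizers to delete such resources because, as the documentation explains, finalizers are not guaranteed to run. So it's better to use something like bracket to clean up resources (although the cleanup is still not guaranteed, e.g. if bracket is used from a thread).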
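As a rough illustration of the bracket approach, the sketch below deletes a scratch file deterministically instead of leaving the deletion to a finalizer. The helper withScratchFile and the file name scratch.tmp are invented for this example; doesFileExist and removeFile come from the directory package.

```haskell
import Control.Exception (bracket)
import System.Directory (doesFileExist, removeFile)
import System.IO

-- | Hypothetical helper: create a scratch file, run the action on it,
-- then close and delete the file, even if the action throws.
withScratchFile :: FilePath -> (Handle -> IO a) -> IO a
withScratchFile path action =
  bracket
    (openFile path WriteMode)             -- acquire
    (\h -> hClose h >> removeFile path)   -- release: runs when the action ends
    action

main :: IO ()
main = do
  withScratchFile "scratch.tmp" $ \h -> hPutStrLn h "temporary data"
  stillThere <- doesFileExist "scratch.tmp"
  print stillThere  -- prints False: the file was removed by the release step
```

The cleanup happens at a known point in the program rather than whenever the garbage collector gets around to it.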

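What the documentation for forkProcess warns about is not finalizers, but rather the fact that the other threads appear to end abruptly inside the forked process. This is especially problematic if those threads hold a lock. Normally, two threads can use modifyMVar_ to ensure that only one thread at a time is running a critical section, and as long as each thread only holds the lock for a finite amount of time, the other thread can simply wait for the MVar to become available. However, if you call forkProcess while one thread is in the middle of a modifyMVar_, that thread will not continue in the cloned process, so the cloned process can't simply call modifyMVar_, or it might get stuck forever waiting for a nonexistent thread to release the lock. Here is a program which demonstrates the problem.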

import Control.Concurrent
import Control.Monad
import System.Posix.Process

-- >>> main
-- (69216,"forkIO thread",0)
-- (69216,"main thread",1)
-- (69216,"forkIO thread",2)
-- (69216,"main thread",3)
-- (69216,"forkIO thread",4)
-- (69216,"main thread",5)
-- calling forkProcess
-- forkProcess main thread waiting for MVar...
-- (69216,"forkIO thread",6)
-- (69216,"original main thread",7)
-- (69216,"forkIO thread",8)
-- (69216,"original main thread",9)
-- (69216,"forkIO thread",10)
-- (69216,"original main thread",11)
main :: IO ()
main = do
  mvar <- newMVar (0 :: Int)
  _ <- forkIO $ replicateM_ 6 $ do
    modifyMVar_ mvar $ \i -> do
      threadDelay 100000
      processID <- getProcessID
      print (processID, "forkIO thread", i)
      pure (i+1)
  threadDelay 50000
  replicateM_ 3 $ do
    modifyMVar_ mvar $ \i -> do
      threadDelay 100000
      processID <- getProcessID
      print (processID, "main thread", i)
      pure (i+1)
  putStrLn "calling forkProcess"
  _ <- forkProcess $ do
    threadDelay 25000
    replicateM_ 3 $ do
      putStrLn "forkProcess main thread waiting for MVar..."
      modifyMVar_ mvar $ \i -> do
        threadDelay 100000
        processID <- getProcessID
        print (processID, "forkProcess main thread", i)
        pure (i+1)
  replicateM_ 3 $ do
    modifyMVar_ mvar $ \i -> do
      threadDelay 100000
      processID <- getProcessID
      print (processID, "original main thread", i)
      pure (i+1)
  threadDelay 100000

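As the output shows, the forkProcess main thread gets stuck waiting forever for the MVar, and never prints the forkProcess main thread line. If you move the threadDelay outside of the modifyMVar_ critical sections, the forkIO thread is much less likely to be in the middle of a critical section when forkProcess is called, so you will see an output which looks like this: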

(69369,"forkIO thread",0)
(69369,"main thread",1)
(69369,"forkIO thread",2)
(69369,"main thread",3)
(69369,"forkIO thread",4)
(69369,"main thread",5)
calling forkProcess
(69369,"forkIO thread",6)
(69369,"original main thread",7)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",6)
(69369,"forkIO thread",8)
(69369,"original main thread",9)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",7)
(69369,"forkIO thread",10)
(69369,"original main thread",11)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",8)

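After the forkProcess call, there are now two MVars which both hold the value 6 (both processes print 6 next), so in the original process, the original main thread and the forkIO thread are both incrementing one MVar, while in the other process, the forkProcess main thread is incrementing the other.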
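One possible rule that follows from this is to hold the lock yourself while forking, so that no other thread can be in the middle of the critical section at that moment. The sketch below is only an illustration of that idea: forkProcessHolding is a made-up helper, not part of the unix package. It uses withMVar to take the MVar in the parent, and the child puts the value back into its own (empty) copy before doing anything else.

```haskell
import Control.Concurrent
import Control.Monad (replicateM_)
import System.Posix.Process (forkProcess, getProcessStatus)
import System.Posix.Types (ProcessID)

-- | Hypothetical helper: fork while the calling thread holds the MVar.
-- In the parent, withMVar puts the value back after forkProcess returns.
-- In the child, the copied MVar is empty and its holder no longer exists,
-- so the child refills it itself before running the given action.
forkProcessHolding :: MVar a -> IO () -> IO ProcessID
forkProcessHolding mvar child =
  withMVar mvar $ \v -> forkProcess (putMVar mvar v >> child)

main :: IO ()
main = do
  mvar <- newMVar (0 :: Int)
  -- A background thread which spends most of its time inside the critical section.
  _ <- forkIO $ replicateM_ 5 $ modifyMVar_ mvar $ \i -> do
    threadDelay 100000
    pure (i + 1)
  threadDelay 50000
  pid <- forkProcessHolding mvar $
    modifyMVar_ mvar $ \i -> do
      print ("child sees", i)  -- no longer blocks forever
      pure (i + 1)
  _ <- getProcessStatus True False pid
  putStrLn "child finished"
```

This only helps for locks you know about and can take before forking. Locks hidden inside libraries can still be held by a thread which does not exist in the child, so forking before starting any other threads remains the more cautious option.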