Are non-concurrent collections safe inside concurrent collections?

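I'm looking to start implementing some concurrent functionality in a project I'm working on. I recently discovered the System.Collections.Concurrent namespace, which I plan to leverage.

The object I use to track the overall state of the operation is essentially a dictionary containing some nested custom objects. My thought is that as long as the highest-level collection is configured to be concurrent/thread safe, it doesn't matter whether the nested collections are, since the data will be locked by the higher-level collection.

Is this the correct assumption?

For example, would something like the following be safe to use in PowerShell?

[System.Collections.Concurrent.ConcurrentDictionary[[String], [MyCustomClass]]]::new()

Also, I have some custom classes that extend HashSet to avoid duplicates. Since System.Collections.Concurrent doesn't have a HashSet class, what is the recommended way to get similar functionality, but concurrently?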

My thought is that as long as the highest level collection is configured to be concurrent/thread safe, it doesn't matter if the nested collections are, since the data will be locked by the higher level collection.

Is this the correct assumption?

No, that is not a safe assumption.

Let's say you create a concurrent dictionary that holds a bunch of regular hashtables:

using namespace System.Collections.Concurrent

# Create thread-safe dictionary
$rootDict = [ConcurrentDictionary[string,hashtable]]::new()
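  • $rootDict itself is now thread-safe - multiple threads cannot concurrently modify an entry such as 'A' by overwriting its reference to a hashtable
  • Any inner hashtable we add to $rootDict is not thread-safe - it is still just a regular hashtable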
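
To make the second point concrete, here is a minimal single-threaded sketch (the key 'A' and the variable names are only illustrative). It shows that every caller asking the concurrent dictionary for the same key receives the very same plain hashtable instance, so the outer dictionary protects only the lookup, not what callers later do with the shared object:

```powershell
$rootDict = [System.Collections.Concurrent.ConcurrentDictionary[string,hashtable]]::new()

# Two lookups for the same key, as two different threads might perform them
$first  = $rootDict.GetOrAdd('A', @{})
$second = $rootDict.GetOrAdd('A', @{})

# Both callers now hold a reference to the same object ...
$sameInstance = [object]::ReferenceEquals($first, $second)

# ... and that object is an ordinary, unsynchronized hashtable
$isPlainHashtable = $first -is [hashtable]

"Same instance: $sameInstance; plain hashtable: $isPlainHashtable"
```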

In PowerShell 7, this can be observed when working with such a data structure using ForEach-Object -Parallel:

using namespace System.Collections.Concurrent

# Create thread-safe dictionary
$rootDict = [ConcurrentDictionary[string,hashtable]]::new()

1..100 |ForEach-Object -Parallel {
  # We need a reference to our safe top-level dictionary
  $dict = $using:rootDict

  # ... and we need a key
  $rootKey = $_ % 2 -eq 0 ? 'even' : 'odd'

  # Thread-safe acquisition of inner hashtable
  $innerDict = $dict.GetOrAdd($rootKey, {param($key) return @{}})

  # Add a bit of jitter for realism
  Start-Sleep -Milliseconds (Get-Random -Minimum 50 -Maximum 250)

  # Update inner hashtable entry
  $innerDict['Counter'] += 1
} -ThrottleLimit 10

# Are these really the results we're expecting...? 
$rootDict['odd','even']

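If the inner hashtable entries were safe to update concurrently, you would expect both counters to end up at 50, but on my laptop I get results like this: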

Name                           Value
----                           -----
Counter                        46
Counter                        43

We can see that several updates to the inner 'Counter' entries were lost along the way, presumably due to concurrent updates.
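
One plausible way such updates get lost: `+=` on a hashtable entry is a read followed by a separate write, not a single atomic step. The sketch below replays one possible interleaving of two workers sequentially, in a single thread; the variable names are hypothetical and the real scheduling in the experiment above is of course not deterministic:

```powershell
$inner = @{ Counter = 0 }

# Both workers read the current value before either has written back
$readByWorker1 = $inner['Counter']
$readByWorker2 = $inner['Counter']

# Each worker then writes "the value I read, plus one"
$inner['Counter'] = $readByWorker1 + 1
$inner['Counter'] = $readByWorker2 + 1

# Two increments were performed, but only one is reflected
$inner['Counter']   # 1
```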


To test this hypothesis, let's repeat the experiment, but use another concurrent dictionary type in place of the hashtable:

using namespace System.Collections.Concurrent

# Create thread-safe dictionary with a thread-safe item type
$rootDict = [ConcurrentDictionary[string,ConcurrentDictionary[string,int]]]::new()

1..100 |ForEach-Object -Parallel {
  # We need a reference to our safe top-level dictionary
  $dict = $using:rootDict

  # ... and we need a key
  $rootKey = $_ % 2 -eq 0 ? 'even' : 'odd'

  # Thread-safe acquisition of inner dictionary
  $innerDict = $dict.GetOrAdd($rootKey, {param($key) return [System.Collections.Concurrent.ConcurrentDictionary[string,int]]::new()})

  # Add a bit of jitter for realism
  Start-Sleep -Milliseconds (Get-Random -Minimum 50 -Maximum 250)

  # Thread-safe update of inner dictionary
  [void]$innerDict.AddOrUpdate('Counter', {param($key) return 1}, {param($key,$value) return $value + 1})
} -ThrottleLimit 10

# These should be the exact results we're expecting! 
$rootDict['odd','even']

Now I get:

Key     Value
---     -----
Counter    50
Counter    50
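
For reference, here is a small single-threaded sketch of how AddOrUpdate behaves. The first delegate supplies the value when the key is missing, and the second computes the new value from the current one. The delegates are cast explicitly here (an optional choice, to keep overload resolution unambiguous), and `$addFactory` and `$updateFactory` are illustrative names:

```powershell
$counters = [System.Collections.Concurrent.ConcurrentDictionary[string,int]]::new()

# Called when 'Counter' does not exist yet
$addFactory = [Func[string,int]] { param($key) 1 }

# Called with the current value when 'Counter' already exists
$updateFactory = [Func[string,int,int]] { param($key, $value) $value + 1 }

foreach ($i in 1..3) {
    [void]$counters.AddOrUpdate('Counter', $addFactory, $updateFactory)
}

$counters['Counter']   # 3
```

Under contention the dictionary may invoke the update delegate more than once for a single call, so it is best kept free of side effects.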

I have some custom classes that extend HashSet to avoid duplicates. Since System.Collections.Concurrent doesn't have a HashSet class, what is the recommended way to get similar functionality, but concurrently?

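Rather than explicitly inheriting from HashSet, I'd strongly recommend wrapping a HashSet and then guarding every method you want to expose to the user with a ReaderWriterLockSlim - this way you can achieve thread safety without unnecessarily sacrificing read-access performance.

Here, again using [int] as the sample data type: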

using namespace System.Collections.Generic
using namespace System.Threading

class ConcurrentSet
{
    hidden [ReaderWriterLockSlim]
    $_lock

    hidden [HashSet[int]]
    $_set

    ConcurrentSet()
    {
        $this._set = [HashSet[int]]::new()
        $this._lock = [System.Threading.ReaderWriterLockSlim]::new()
    }

    [bool]
    Add([int]$item)
    {
        # Any method that modifies the set should be guarded
        # by a WriteLock - guaranteeing exclusive update access
        $this._lock.EnterWriteLock()
        try{
            return $this._set.Add($item)
        }
        finally{
            $this._lock.ExitWriteLock()
        }
    }

    [bool]
    IsSubsetOf([IEnumerable[int]]$other)
    {
        # For the read-only methods a read-lock will suffice
        $this._lock.EnterReadLock()
        try{
            return $this._set.IsSubsetOf($other)
        }
        finally{
            $this._lock.ExitReadLock()
        }
    }

    # Repeat appropriate lock pattern for all [HashSet] methods you want to expose
}
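
As a rough illustration of "repeating the lock pattern", the sketch below uses a separate, hypothetical class name (`LockedIntSet`) so it can run on its own, and adds `Remove` and `Contains` alongside `Add`. Which members to expose is a design decision for your own wrapper. This is just one way it could look:

```powershell
class LockedIntSet {
    hidden [System.Threading.ReaderWriterLockSlim] $_lock = [System.Threading.ReaderWriterLockSlim]::new()
    hidden [System.Collections.Generic.HashSet[int]] $_set = [System.Collections.Generic.HashSet[int]]::new()

    # Mutating members take the write lock (exclusive access)
    [bool] Add([int]$item) {
        $this._lock.EnterWriteLock()
        try { return $this._set.Add($item) }
        finally { $this._lock.ExitWriteLock() }
    }

    [bool] Remove([int]$item) {
        $this._lock.EnterWriteLock()
        try { return $this._set.Remove($item) }
        finally { $this._lock.ExitWriteLock() }
    }

    # Read-only members take the read lock (shared access)
    [bool] Contains([int]$item) {
        $this._lock.EnterReadLock()
        try { return $this._set.Contains($item) }
        finally { $this._lock.ExitReadLock() }
    }
}

$set = [LockedIntSet]::new()
$firstAdd   = $set.Add(42)       # True: the item was new
$secondAdd  = $set.Add(42)       # False: duplicates are rejected, as with HashSet
$hasItem    = $set.Contains(42)  # True
$removed    = $set.Remove(42)    # True
$hasItemNow = $set.Contains(42)  # False
```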

You can make the wrapper more flexible by wrapping a HashSet<object> and controlling its behavior with a custom equality comparer.