How would you make abstracted nested dictionary/hashtable objects readable (style)?

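I'm parsing a database table that was exported to CSV, in which fields are embedded in what is essentially a memo field. The database also keeps version history, and the CSV contains every version. The basic structure of the data is Index (sequential record number), Reference (a specific foreign key), Sequence (the order of records for a given reference), and Data (the memo field holding the data to parse).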

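You can think of the "Data" field as text documents limited to 80 characters wide and 40 characters deep, sequenced in the order they would print. Every record entry is assigned an ascending index. For reference, $myParser is a [Microsoft.VisualBasic.FileIO.TextFieldParser], so ReadFields() returns one row of fields as an array/list.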
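For anyone who wants to reproduce the setup, here is a minimal sketch of how $myParser might be created and drained. The temp file name, the sample rows, and the $rows list are made up for illustration, and the Add-Type line assumes the assembly is not already loaded in your session.

```powershell
# Assumption: loads the assembly that defines TextFieldParser if it is not loaded yet
Add-Type -AssemblyName Microsoft.VisualBasic

# Hypothetical sample file standing in for the real export
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'comments-sample.csv'
Set-Content -Path $csvPath -Value @(
    '1,1,100,x,"first line"'
    '2,2,100,x,"second, with a comma"'
)

$myParser = [Microsoft.VisualBasic.FileIO.TextFieldParser]::new($csvPath)
$myParser.TextFieldType = [Microsoft.VisualBasic.FileIO.FieldType]::Delimited
$myParser.SetDelimiters(',')
$myParser.HasFieldsEnclosedInQuotes = $true

# Collect every row; each ReadFields() call returns one line as a string array
$rows = [System.Collections.Generic.List[string[]]]::new()
try
{
    while(!$myParser.EndOfData)
    {
        $rows.Add($myParser.ReadFields())
    }
}
finally
{
    $myParser.Close()
}
```

Looping on EndOfData is one way to express the control that the while($true) in the routine below glosses over.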

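My ultimate question is, how can this be formatted to be more intuitive to the reader? The code below is PowerShell, but I'm also interested in answers relating to C#, since this is a fairly language-agnostic question, although I think get/set would trivialize it to some extent.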

Consider the following code (an insert/update routine in a 2-tier nested dictionary/hash):

enum cmtField
{
    Index = 0
    Sequence = 1
    Reference = 2
    Data = 4
}

$myRecords = [System.Collections.Generic.Dictionary[int,System.Collections.Generic.Dictionary[int,string]]]::new() #this could be a hash table, but is more verbose this way
While($true) #there's actually control here, but this provides a simple loop assuming infinite data
{
    $myFields = $myParser.ReadFields() #read a line from the csvfile and return an array/list of fields for that line

    if(!$myRecords.ContainsKey($myFields[[cmtField]::Reference])) #if the reference of the current record is new
    {
        $myRecords.Add($myFields[[cmtField]::Reference],[System.Collections.Generic.Dictionary[int,CommentRecord]]::new()) #create tier 1 reference index
        $myRecords[$myFields[[cmtField]::Reference]].add($myFields[[cmtField]::Sequence],$myFields[[cmtField]::Data]) #create tier 2 sequence reference and data
    }
    else #if the reference already exists in the dictionary
    {
        if(!$myRecords[$myFields[[cmtField]::Reference]].ContainsKey($myFields[[cmtField]::Sequence])) #if the sequence ID of the current record is new
        {
            $myRecords[$myFields[[cmtField]::Reference]].Add($myFields[[cmtField]::Sequence],$myFields[[cmtField]::Data]) #add record at [reference][sequence]
        }
        else #if the sequence already exists for this reference
        {
            if($myRecords[$myFields[[cmtField]::Reference]][$myFields[[cmtField]::Sequence]].Index -lt $myFields[[cmtField]::Index]) #if the index of the currently read field is higher than the store index, it must be newer
            {
                $myRecords[$myFields[[cmtField]::Reference]][$myFields[[cmtField]::Sequence]] = $myFields[[cmtField]::Data] #replace with new data
            }
            #else discard currently read data (do nothing)
        }
    }
}

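Frankly, trying to make this readable has both given me a headache and made my eyes bleed a little. The deeper the dictionary goes, the messier it gets. I'm stuck between bracket soup and no self-documentation.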

My ultimate question is, how can this be formatted to be more intuitive to the reader?

That... ultimately depends on who "the reader" is. Is it your boss? Your colleagues? Me? Would you use this code sample to teach someone to program?

In terms of making it less "messy", there are a couple of steps you can take right away.

The first thing I would change to make your code more readable is to add a using namespace directive at the top of the file:

using namespace System.Collections.Generic

Now you can create the nested dictionary with:

[Dictionary[int,Dictionary[int,string]]]::new()

... as opposed to:

[System.Collections.Generic.Dictionary[int,System.Collections.Generic.Dictionary[int,string]]]::new()

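The next thing I'd cut down on is repetitive index access patterns like $myFields[[cmtField]::Reference]. You never modify $myFields after the initial assignment at the top of the loop, so there's no need to delay resolving it: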

while($true)
{
    $myFields = $myParser.ReadFields()

    $Reference = $myFields[[cmtField]::Reference]
    $Data      = $myFields[[cmtField]::Data]
    $Sequence  = $myFields[[cmtField]::Sequence]
    $Index     = $myFields[[cmtField]::Index]

    if(!$myRecords.ContainsKey($Reference)) #if the reference of the current record is new
    {
        $myRecords.Add($Reference,[Dictionary[int,CommentRecord]]::new()) #create tier 1 reference index
        $myRecords[$Reference].Add($Sequence,$Data) #create tier 2 sequence reference and data
    }
    else 
    {
        # ...

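Finally, you can simplify the code tremendously by abandoning the nested if/else statements and instead breaking it down into a succession of steps that each run in order. You end up with something like this: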

using namespace System.Collections.Generic

enum cmtField
{
    Index = 0
    Sequence = 1
    Reference = 2
    Data = 4
}

$myRecords = [Dictionary[int,Dictionary[int,CommentRecord]]]::new() 
while($true) 
{
    $myFields = $myParser.ReadFields()

    $Reference = $myFields[[cmtField]::Reference]
    $Data = $myFields[[cmtField]::Data]
    $Sequence = $myFields[[cmtField]::Sequence]
    $Index = $myFields[[cmtField]::Index]

    # Step 1 - ensure tier 1 dictionary is present
    if(!$myRecords.ContainsKey($Reference))
    {
        $myRecords.Add($Reference,[Dictionary[int,CommentRecord]]::new())
    }
    
    # (now we only need to resolve `$myRecords[$Reference]` once)
    $record = $myRecords[$Reference]

    # step 2 - ensure sequence entry exists
    if(!$record.ContainsKey($Sequence))
    {
        $record.Add($Sequence, $Data)
    }

    # step 3 - handle superseding comment records
    if($record[$Sequence].Index -lt $Index) 
    {
        $record[$Sequence] = $Data 
    }
}

I personally find this much easier on the eyes (and the mind) than the original if/else approach.
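One caveat: the snippets above mix string and CommentRecord as the inner value type, and CommentRecord is never defined in the post. Below is a self-contained sketch of the same step-by-step idea under the assumption that CommentRecord is a small class holding Index and Data. The class shape, the Add-CommentRecord helper, and the sample rows are all hypothetical, not the asker's actual code.

```powershell
using namespace System.Collections.Generic

# Hypothetical value type: the post never defines CommentRecord,
# so this shape (Index + Data) is an assumption.
class CommentRecord
{
    [int]    $Index
    [string] $Data

    CommentRecord([int] $index, [string] $data)
    {
        $this.Index = $index
        $this.Data  = $data
    }
}

enum cmtField
{
    Index = 0
    Sequence = 1
    Reference = 2
    Data = 4
}

# Hypothetical helper: insert or update one parsed row in the nested dictionary
function Add-CommentRecord
{
    param(
        [Dictionary[int,Dictionary[int,CommentRecord]]] $Records,
        [string[]] $Fields
    )

    $Reference = [int]$Fields[[cmtField]::Reference]
    $Sequence  = [int]$Fields[[cmtField]::Sequence]
    $Index     = [int]$Fields[[cmtField]::Index]
    $Data      = $Fields[[cmtField]::Data]

    # Step 1 - ensure tier 1 dictionary is present
    if(!$Records.ContainsKey($Reference))
    {
        $Records.Add($Reference, [Dictionary[int,CommentRecord]]::new())
    }
    $record = $Records[$Reference]

    # Step 2 - store the row if the sequence is new or the row has a higher index
    if(!$record.ContainsKey($Sequence) -or $record[$Sequence].Index -lt $Index)
    {
        $record[$Sequence] = [CommentRecord]::new($Index, $Data)
    }
}

$myRecords = [Dictionary[int,Dictionary[int,CommentRecord]]]::new()

# Made-up rows in field order: Index, Sequence, Reference, (unused), Data
$sampleRows = @(
    @('1','1','100','','first draft'),
    @('2','2','100','','second line'),
    @('3','1','100','','revised first line')
)
foreach($row in $sampleRows)
{
    Add-CommentRecord -Records $myRecords -Fields $row
}

$myRecords[100][1].Data   # 'revised first line' (index 3 supersedes index 1)
```

Storing the index next to the data is what makes the "is this row newer?" comparison possible. With a plain string as the inner value there is no stored index to compare against.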