Backslash conflict in Words and linebreak in Pyparsing

Question

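I'm having trouble with a grammar that allows '\' in argument names (e.g. net\<8\>). However, '\' is also used as a line continuation (see Ex2). Ex1 works fine, but there is a conflict between the linebreak and identifier expressions.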

Ex1: working (netlist.sp)

subckt INVERTER A Z gnd gnds vdd vdds
M1 (Z A vdd vdds) pmos w=0.4 l=0.1
M2 (Z A gnd gnds) nmos w=0.2 l=0.1
ends INVERTER

I1 (net1 net2 0 gnds! vdd! vdds!) INVERTER

subckt INVERTER_2 A Z gnd gnds vdd vdds
M1 (Z A vdd vdds) pmos w=0.4 l=0.1
M2 (Z A gnd gnds) nmos w=0.2 l=0.1
ends INVERTER_2

I2 (net\<8\> net2 0 gnds! vdd! vdds!) INVERTER_2

I3 (net1 net2 0 gnds! vdd! vdds!) INVERTER_2

Ex2: not working (netlist2.sp)

subckt INVERTER A Z gnd gnds vdd vdds
M1 (Z A vdd vdds) pmos w=0.4 l=0.1
M2 (Z A gnd gnds) nmos w=0.2 l=0.1
ends INVERTER

I1 (net1 net2 0 gnds! vdd! vdds!) INVERTER

subckt INVERTER_2 A Z gnd gnds \
                  vdd vdds
M1 (Z A vdd vdds) pmos w=0.4 l=0.1
M2 (Z A gnd gnds) nmos w=0.2 l=0.1
ends INVERTER_2

I2 (net\<8\> net2 0 gnds! vdd! vdds!) INVERTER_2

I3 (net1 net2 0 gnds! vdd! vdds!) INVERTER_2

Code

import pyparsing as pp
import json

EOL = pp.LineEnd().suppress() # end of line
linebreak = pp.Suppress(pp.Keyword('\\') + pp.LineEnd())
identifier = pp.Word(pp.alphanums + '_!<>\\')
number = pp.pyparsing_common.number
net = identifier
instref = identifier
instname = identifier
subcktname = identifier
subcktname_end = pp.Keyword("ends").suppress()
comment = pp.Suppress("//" + pp.SkipTo(pp.LineEnd()))
expression = pp.Word(pp.alphanums + '._*+-/()')

input_file = open('netlist.sp', 'r')
file_string = input_file.read()
input_file.close()

for t, s, e in parse_netlist().scanString(file_string):
    print(json.dumps(t.asDict()['netlist'], indent=2))

def parse_netlist():
        pp.ParserElement.setDefaultWhitespaceChars(' \t')

        nets = (pp.Optional(pp.Suppress('('))
                + pp.OneOrMore(net('net') | linebreak)
                + pp.Optional(pp.Suppress(')'))
               )

        inst_param_value = expression('expression')

        inst_parameter = pp.Dict(pp.Group(identifier('param_name')
                                  + pp.Suppress("=")
                                  + inst_param_value('param_value')
                                 ))

        parameters = pp.Group(pp.OneOrMore(inst_parameter | linebreak)
                             ).setResultsName('parameters')

        instance = pp.Dict(pp.Group(instname('inst_name')
                            + nets('nets')
                            + instref('reference')
                            + pp.Optional(parameters)
                            + EOL
                           )).setResultsName('instance', listAllMatches=True)

        subckt_core = pp.Group(pp.ZeroOrMore(instance | EOL | comment)
                              ).setResultsName('subckt_core', listAllMatches=True)

        subckt = pp.Group(pp.Keyword("subckt").suppress()
                          + subcktname('subckt_name')
                          + nets('nets')
                          + EOL
                          + subckt_core
                          + subcktname_end
                          + pp.matchPreviousExpr(subcktname).suppress()
                          + EOL
                         ).setResultsName('subcircuit', listAllMatches=True)

        netlist = pp.OneOrMore(subckt
                               | instance
                               | comment('comment')
                               | EOL
                              ).setResultsName('netlist') + pp.StringEnd()

        return netlist

Output of Ex1

[
  {
    "subckt_name": "INVERTER",
    "net": "vdds",
    "nets": [
      "A",
      "Z",
      "gnd",
      "gnds",
      "vdd",
      "vdds"
    ],
    "subckt_core": [
      {
        "instance": [
          {
            "M1": {
              "inst_name": "M1",
              "net": "vdds",
              "nets": [
                "Z",
                "A",
                "vdd",
                "vdds"
              ],
              "reference": "pmos",
              "parameters": {
                "w": "0.4",
                "l": "0.1"
              }
            }
          },
          {
            "M2": {
              "inst_name": "M2",
              "net": "gnds",
              "nets": [
                "Z",
                "A",
                "gnd",
                "gnds"
              ],
              "reference": "nmos",
              "parameters": {
                "w": "0.2",
                "l": "0.1"
              }
            }
          }
        ]
      }
    ]
  },
  {
    "I1": {
      "inst_name": "I1",
      "net": "vdds!",
      "nets": [
        "net1",
        "net2",
        "0",
        "gnds!",
        "vdd!",
        "vdds!"
      ],
      "reference": "INVERTER",
      "parameters": []
    }
  },
  {
    "subckt_name": "INVERTER_2",
    "net": "vdds",
    "nets": [
      "A",
      "Z",
      "gnd",
      "gnds",
      "vdd",
      "vdds"
    ],
    "subckt_core": [
      {
        "instance": [
          {
            "M1": {
              "inst_name": "M1",
              "net": "vdds",
              "nets": [
                "Z",
                "A",
                "vdd",
                "vdds"
              ],
              "reference": "pmos",
              "parameters": {
                "w": "0.4",
                "l": "0.1"
              }
            }
          },
          {
            "M2": {
              "inst_name": "M2",
              "net": "gnds",
              "nets": [
                "Z",
                "A",
                "gnd",
                "gnds"
              ],
              "reference": "nmos",
              "parameters": {
                "w": "0.2",
                "l": "0.1"
              }
            }
          }
        ]
      }
    ]
  },
  {
    "I2": {
      "inst_name": "I2",
      "net": "vdds!",
      "nets": [
        "net\<8\>",
        "net2",
        "0",
        "gnds!",
        "vdd!",
        "vdds!"
      ],
      "reference": "INVERTER_2",
      "parameters": []
    }
  },
  {
    "I3": {
      "inst_name": "I3",
      "net": "vdds!",
      "nets": [
        "net1",
        "net2",
        "0",
        "gnds!",
        "vdd!",
        "vdds!"
      ],
      "reference": "INVERTER_2",
      "parameters": []
    }
  }
]

Output of Ex2

[
  {
    "I2": {
      "inst_name": "I2",
      "net": "vdds!",
      "nets": [
        "INST_IN\<8\>",
        "net2",
        "0",
        "gnds!",
        "vdd!",
        "vdds!"
      ],
      "reference": "INVERTER2",
      "parameters": []
    }
  },
  {
    "I3": {
      "inst_name": "I3",
      "net": "vdds!",
      "nets": [
        "net1",
        "net2",
        "0",
        "gnds!",
        "vdd!",
        "vdds!"
      ],
      "reference": "INVERTER3",
      "parameters": []
    }
  }
]

Grammar:

Format of a subcircuit definition:

subckt SubcircuitName [(] node1 ... nodeN [)]
[ parameters name1=value1 ... [nameN=valueN]]
.
.
.
instance, model, ic, or nodeset statements—or
further subcircuit definitions
.
.
.
ends [SubcircuitName]

Format of an instance statement:

name [(]node1 ... nodeN[)] master [[param1=value1] ...[paramN=valueN]]

Answer 1

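Word is the greediest and most aggressive of all the repetition types in pyparsing. So your two expressions:

linebreak = pp.Suppress(pp.Keyword('\\') + pp.LineEnd())
identifier = pp.Word(pp.alphanums + '_!<>\\')

are going to conflict. Once identifier starts scanning matching characters, it does not look ahead to the next expression to see whether it should stop.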
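To see the conflict in isolation, here is a minimal sketch. It assumes a cut-down nets expression (identifier | linebreak, in the same order as in the question) and a made-up two-line header; greedy_identifier is just an illustrative name for the question's original identifier.

```python
import pyparsing as pp

# Newlines are significant in this grammar, so only skip spaces and tabs.
pp.ParserElement.setDefaultWhitespaceChars(' \t')

linebreak = pp.Suppress(pp.Keyword('\\') + pp.LineEnd())
# The original identifier: the backslash is just another Word character.
greedy_identifier = pp.Word(pp.alphanums + '_!<>\\')

# Cut-down version of the nets expression (an assumption for this sketch).
nets = pp.OneOrMore(greedy_identifier | linebreak)

text = "gnd gnds \\\n    vdd vdds"
result = nets.parseString(text).asList()
print(result)
```

The continuation backslash comes back as a token of its own and parsing stops at the newline, so vdd and vdds are never read as nets.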

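To distinguish a '\' inside an identifier from one that is a continuation, you are off to a good start with linebreak. Next, we need to remove the '\' from the characters in your identifier Word: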

identifier = pp.Word(pp.alphanums + '_!<>')

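To add '\' back into identifiers, we need to be more specific. Not just any '\' will do; we only want those '\' characters that are not linebreaks (that is, those that are not at the end of a line). We can do that with a negative lookahead. Before accepting a backslash, first make sure it is not a linebreak backslash:

backslash_that_is_not_a_linebreak = ~linebreak + '\\'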
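A quick way to check the lookahead on its own is sketched below. The accepts helper is hypothetical (it is not part of pyparsing); it only reports whether the guarded backslash matches at the start of a string.

```python
import pyparsing as pp

pp.ParserElement.setDefaultWhitespaceChars(' \t')

linebreak = pp.Suppress(pp.Keyword('\\') + pp.LineEnd())
# ~linebreak is a negative lookahead: it consumes nothing and fails
# if a linebreak could be matched at the current position.
backslash_that_is_not_a_linebreak = ~linebreak + '\\'

def accepts(text):
    """Hypothetical helper: True if the guarded backslash matches at the start of text."""
    try:
        backslash_that_is_not_a_linebreak.parseString(text)
        return True
    except pp.ParseException:
        return False

inside_name = accepts("\\<8")       # backslash followed by more name characters
continuation = accepts("\\\nvdd")   # backslash followed by a newline
print(inside_name, continuation)
```

The first call should be accepted and the second rejected, which is the distinction the identifier needs.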

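Now an identifier will be a collection of one or more items, each of which is either an identifier word as you defined above or a backslash that is not a linebreak.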

identifier_word = pp.Word(pp.alphanums + '_!<>')
identifier = pp.OneOrMore(identifier_word | backslash_that_is_not_a_linebreak)

This gets us close, but if you use this identifier to parse "net\<8\>", you'll get:

['net', '\\', '<8', '\\', '>']

If you wrap your identifier in a pyparsing Combine, then everything should work fine:

identifier = pp.Combine(pp.OneOrMore(identifier_word | backslash_that_is_not_a_linebreak))

print(identifier.parseString(r"net\<8\>"))

Gives:

['net\\<8\\>']

EDIT: In summary, here are the mods needed for this change:

backslash_that_is_not_a_linebreak = ~linebreak + '\\'
identifier_word = pp.Word(pp.alphanums + '_!<>')
identifier = pp.Combine(pp.OneOrMore(identifier_word | backslash_that_is_not_a_linebreak))
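Putting those three lines together with linebreak, a minimal end-to-end sketch might look like this. The nets expression is a cut-down stand-in for the one in the question, and header is made-up input modelled on the continued subckt line from Ex2.

```python
import pyparsing as pp

# Set before building any expressions so that newlines are not skipped.
pp.ParserElement.setDefaultWhitespaceChars(' \t')

linebreak = pp.Suppress(pp.Keyword('\\') + pp.LineEnd())
backslash_that_is_not_a_linebreak = ~linebreak + '\\'
identifier_word = pp.Word(pp.alphanums + '_!<>')
identifier = pp.Combine(
    pp.OneOrMore(identifier_word | backslash_that_is_not_a_linebreak)
)

# Cut-down nets expression (an assumption for this sketch).
nets = pp.OneOrMore(identifier | linebreak)

header = "A Z gnd gnds \\\n                  vdd vdds"
header_nets = nets.parseString(header).asList()
escaped_net = identifier.parseString(r"net\<8\>").asList()

print(header_nets)
print(escaped_net)
```

The continuation is swallowed by linebreak, so all six nets are collected, while the escaped net name stays in one piece.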

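EDIT 2: These lines, declared in your method parse_netlist, need to be at the top of the module, right after importing pyparsing. Otherwise, all of your expressions (such as linebreak) will use the default whitespace characters, which include \n.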

ws = ' \t'
pp.ParserElement.setDefaultWhitespaceChars(ws)

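Without them, the expression for nets reads past the end of line on the first line of the subckt and includes "M1" as another net, instead of as the identifier of the first instance in subckt_core.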
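The ordering issue can be reproduced with a small sketch. The names early_word and late_word are illustrative only, and the first call restores pyparsing's stock default so the comparison is repeatable.

```python
import pyparsing as pp

# pyparsing's stock default whitespace, set explicitly for repeatability.
pp.ParserElement.setDefaultWhitespaceChars(' \n\t\r')

# Built while '\n' still counts as skippable whitespace.
early_word = pp.Word(pp.alphanums)

pp.ParserElement.setDefaultWhitespaceChars(' \t')

# Built after the change: newlines are no longer skipped.
late_word = pp.Word(pp.alphanums)

text = "vdd vdds\nM1"
early_tokens = pp.OneOrMore(early_word).parseString(text).asList()
late_tokens = pp.OneOrMore(late_word).parseString(text).asList()

print(early_tokens)  # runs across the newline and picks up M1
print(late_tokens)   # stops at the end of the first line
```

An expression keeps the whitespace characters that were in effect when it was created, which is why the call has to come before any of the grammar is defined.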

Not sure why you broke up your parser this way; it is best to keep all the bits together.
