How to write a textX grammar rule to detect standard datatypes without modifying them?

Question

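I want to write a textX grammar rule that can consist of either another defined rule or any of the standard data types (Int, Float, String, etc.).

This is for a simple textX DSL that should make it possible to write (and eventually translate) conditions, which can contain other grammar rules (such as predefined functions) or any of the standard predefined data types (String/Int/Float/Bool/ID).

So I would actually like to be able to write something like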

condition insert input data 5 equal 10 BEGIN
    ...
END

This represents an ordinary IF. insert input data 5 is a rule that will later be translated into a normal function call, insertInputData(5). The grammar I use for this:

Model: commands*=Command;
Command: Function | Branch;
Function: Func_InsertInputData | Func_InsertOutputData;
Func_InsertInputData: 'insert input data' index=INT;
Func_InsertOutputData: 'insert output data' index=INT;
Branch: 'condition' condition=Condition 'BEGIN'
    commands*=Command
'END';
Condition: Cond_Equal | Cond_And | Cond_False;
Cond_Equal: op1=Operand 'equal' op2=Operand;
Cond_And: op1=Operand 'and' op2=Operand;
Cond_False: op1=Operand 'is false';
Operand: Function | OR_ANY_OTHER_KIND_OF_DATA;

In the interpreter, I try to read the code like this:

def translateCommands(cmds):
    commands = []
    for cmd in cmds:
        commands.append(translateCommand(cmd))
    return commands

def translateCommand(cmd):
    print(cmd)
    print(cmd.__class__)
    if cmd.__class__.__name__ == 'int' or cmd.__class__.__name__ == 'float':
        return str(cmd)
    elif cmd.__class__.__name__ == 'str':
        return '\'' + cmd + '\''
    elif(cmd.__class__.__name__ == 'Branch'):
        s = ''
        if(cmd.condition.__class__.__name__ ==  'Cond_Equal'):
            s = 'if ' + translateCommand(cmd.condition.op1) + '==' + translateCommand(cmd.condition.op2) + ':'
        if(cmd.condition.__class__.__name__ == 'Cond_And'):
            s = 'if ' + translateCommand(cmd.condition.op1) + ' and ' + translateCommand(cmd.condition.op2) + ':'
        # ...
        commandsInBlock = translateCommands(cmd.commands)
        for command in commandsInBlock:
            s += '\n    '+command
        return s

if insertInputData(5)==10.0:

textx.exceptions.TextXSyntaxError: None:13:43: error: Expected 'BEGIN' at position (13, 43) => 't equal 10*.0 BEGIN  '.

The result I would like to see is

if insertInputData(5)==10:

or

if insertInputData(5)==10.0:

and

condition insert input data 5 equal 10.0 BEGIN
    ...
END

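But textX always seems to try to convert the value it finds at that position into the type suggested by the Operand rule, which is wrong in this case. How do I have to modify my rules so that each data type is detected properly, without anything being modified?

Edit 1

Igor Dejanović described the problem precisely, and I followed the approach he gave.

Grammar (relevant parts):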

Command: Function | Branch | MyNumber;
// ...
Operand: Function | MyNumber | BOOL | ID | STRING;
MyNumber: STRICTFLOAT | INT;
STRICTFLOAT: /[+-]?(((\d+\.(\d*)?|\.\d+)([eE][+-]?\d+)?)|((\d+)([eE][+-]?\d+)))(?<=[\w\.])(?![\w\.])/;

Code:

from textx import metamodel_from_str

mm = metamodel_from_str(grammar)
mm.register_obj_processors({'STRICTFLOAT': lambda x: float(x)})

dsl_code = '''
10
10.5
'''
model = mm.model_from_str(dsl_code)
commands = iterateThroughCommands(model.commands)

This results in

10
<class 'int'>

'10.5'
<class 'str'>

So something is still missing to make the object processor work...

Answer 1

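The problem is that every valid integer can also be interpreted as a FLOAT. If you order the rules as FLOAT | INT | ..., you always get a float, because the FLOAT rule matches first. If you order them as INT | FLOAT | ..., then for a float the parser consumes only the part of the number up to the . and parsing cannot continue.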
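The ordering effect described above can be reproduced outside textX. The following is only a minimal sketch using Python's re module: the two patterns are simplified stand-ins (assumptions, not textX's actual FLOAT and INT definitions), and ordered_choice is a hypothetical helper that mimics how an ordered choice takes the first alternative that matches.

```python
import re

# Simplified stand-ins for the two matchers (assumptions, not textX's actual
# patterns). Like the FLOAT rule described above, this one accepts integers.
FLOAT = (re.compile(r'[+-]?\d+(\.\d*)?([eE][+-]?\d+)?'), float)
INT = (re.compile(r'[+-]?\d+'), int)


def ordered_choice(text, alternatives):
    """Return (value, unconsumed_rest) for the first alternative that matches."""
    for pattern, convert in alternatives:
        match = pattern.match(text)
        if match:
            return convert(match.group()), text[match.end():]
    raise ValueError(f'no alternative matches {text!r}')


# FLOAT | INT: the integer literal is taken by FLOAT and becomes a float.
float_first = ordered_choice('10 BEGIN', [FLOAT, INT])
# INT | FLOAT: INT stops at the dot, so '.0 BEGIN' remains where the grammar
# expects 'BEGIN', which corresponds to the syntax error in the question.
int_first = ordered_choice('10.0 BEGIN', [INT, FLOAT])

print(float_first)  # (10.0, ' BEGIN')
print(int_first)    # (10, '.0 BEGIN')
```

Neither ordering gives the desired result, which is why a float rule that refuses plain integers is needed.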

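This is solved in the development version of textX (see CHANGELOG.md) by introducing a STRICTFLOAT rule that never matches integers; the built-in NUMBER rule was changed to try STRICTFLOAT first and then INT.

The next release will be 2.0.0, which I hope to publish in the coming weeks. In the meantime, you can install directly from GitHub or modify your grammar to contain something like this: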

MyNumber: STRICTFLOAT | INT;
STRICTFLOAT: /[+-]?(((\d+\.(\d*)?|\.\d+)([eE][+-]?\d+)?)|((\d+)([eE][+-]?\d+)))(?<=[\w\.])(?![\w\.])/;   // or the float format you prefer

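and register an object processor for the STRICTFLOAT type that converts the match to a Python float. After upgrading to textX 2.0.0, you should replace the references to MyNumber in your grammar with NUMBER.

More information can be found in the reported issue.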
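To see what the strict pattern buys, here is a minimal sketch that emulates MyNumber: STRICTFLOAT | INT with plain regular expressions. It does not use textX itself: the STRICTFLOAT pattern is copied from the grammar above, while the INT pattern and the parse_number helper are hypothetical stand-ins, with the float() and int() calls playing the role of the object processor.

```python
import re

# Strict float pattern copied from the grammar above: it requires a decimal
# point or an exponent, so a plain integer such as '10' never matches it.
STRICTFLOAT = re.compile(
    r'[+-]?(((\d+\.(\d*)?|\.\d+)([eE][+-]?\d+)?)|((\d+)([eE][+-]?\d+)))'
    r'(?<=[\w\.])(?![\w\.])'
)
# Hypothetical stand-in for INT (an assumption, not textX's definition).
INT = re.compile(r'[+-]?\d+(?![\w\.])')


def parse_number(token):
    """Emulate MyNumber: STRICTFLOAT | INT, converting to the matching type."""
    if STRICTFLOAT.fullmatch(token):
        return float(token)  # what the STRICTFLOAT object processor would do
    if INT.fullmatch(token):
        return int(token)
    raise ValueError(f'not a number: {token!r}')


for token in ('10', '10.5', '1e3', '-3'):
    value = parse_number(token)
    print(token, type(value).__name__, value)
```

Because the strict alternative is tried first and rejects plain integers, 10 stays an int and 10.5 becomes a float, which is the behaviour the question asks for.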

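Edit 1:

The proposed solution currently does not work because of the bug reported here.

Edit 2:

The bug has been fixed in the development version. Until 2.0.0 is released, you have to run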

pip install https://github.com/textX/textX/archive/master.zip

If you do not want to change the default types, you do not need a workaround at all.

