Tree for symbolic regression stops early when a unary operator is used

I am working on a symbolic regression algorithm, and I currently have this node class:

class Node:
    def __init__(self):
        self.left = None
        self.right = None
        self.arity = 0
        self.left_col = 0
        self.right_col = 0
        self.operator_ = 0

To build the tree to print, we can use this test set:

2,4,0,7,77,2,
2,1,1,48,37,7,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
2,3,2,49,58,2,
-1,-1,-1,-1,-1,-1,
2,2,3,101,59,2,
2,1,4,62,19,5,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,

My program loads the file and creates the nodes from the test-set data, then prints the tree, as you can see:

class Program:
    def __init__(self):
        self.tree = None

    def load_tree(self, p, file):
        line = file.readline()
        x = line.split(',')
        x.pop(-1)
        x = [int(i) for i in x]
        print(x)
        if (x[0] == -1):
            return None
        else:
            p.arity = x[0]
            p.left_col = x[3]
            p.right_col = x[4]
            p.operator_ = x[5]
            p.left = Node()
            p.right = Node()
            p.left = self.load_tree(p.left, file)
            p.right = self.load_tree(p.right, file)
        return p

    def load_program(self, PROGNAME):
        file = open(str(PROGNAME) + ".csv", "r")
        if self.tree == None:
            self.tree = Node()
        self.load_tree(self.tree, file)
        file.close()

    def print_tree(self, tree, ncol):
        result = ""
        operators = {0: "not", 1: "shift", 2: "+", 3: "-", 4: "*",
                     5: "/", 6: ">", 7: "<", 8: "=="}
        if tree == None:
            result += "v" + str(ncol)
        elif tree.arity == 1:
            result += operators[tree.operator_] + \
                "(" + self.print_tree(tree.left, tree.left_col) + ")"
        elif tree.arity == 2:
            result += "(" + self.print_tree(tree.left, tree.left_col) + \
                operators[tree.operator_] + \
                self.print_tree(tree.right, tree.right_col) + ")"
        return result

    def print_program(self):
        y = "y = " + self.print_tree(self.tree, 0)
        print(y)
prog = Program()

# load_program
prog.load_program(0)

# print_program
prog.print_program()

The output formula is:

y = ((v48<v37)+(v49+((v62/v19)+v59)))

But now, if we use this data set, which has a unary operator in the third row that is not -1:

2,4,0,7,77,2,
2,1,1,48,37,7,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
1,3,2,49,58,0,
-1,-1,-1,-1,-1,-1,
2,2,3,101,59,2,
2,1,4,62,19,5,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,

the output stops early. This is what gets printed:

y = ((v48<v37)+not(v49))

Why does this happen? How can I fix it so that the whole expression is processed?

Thanks in advance!

Edit:

I have changed load_tree by adding if (x[0] == 1):, so it now looks like this, but I still get the same output:

def load_tree(self, p, file):
    line = file.readline()
    x = line.split(',')
    x.pop(-1)
    x = [int(i) for i in x]
    print(x)
    if (x[0] == -1):
        return None
    if (x[0] == 1):
        p.arity = x[0]
        p.left_col = x[3]
        p.operator_ = x[5]
        p.left = Node()
        p.left = self.load_tree(p.left, file)
        
    elif (x[0] == 2):
        p.arity = x[0]
        p.left_col = x[3]
        p.right_col = x[4]
        p.operator_ = x[5]
        p.left = Node()
        p.right = Node()
        p.left = self.load_tree(p.left, file)
        p.right = self.load_tree(p.right, file)
    return p

y = ((v48<v37)+not(v49))

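The code in load_tree ignores the arity and expects both a left and a right child, even for the "not" operator. The right child of "not" is the remaining "+" operation, which in turn has the "/" operation as its left child.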
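To see this, here is a minimal sketch that repeats the logic of the question's first load_tree and inspects the node loaded for "not". It uses standalone functions and an in-memory string instead of the .csv file; the names DATA, OPERATORS and not_node are only illustrative and are not part of the original program.

```python
import io

class Node:
    def __init__(self):
        self.left = None
        self.right = None
        self.arity = 0
        self.left_col = 0
        self.right_col = 0
        self.operator_ = 0

OPERATORS = {0: "not", 1: "shift", 2: "+", 3: "-", 4: "*",
             5: "/", 6: ">", 7: "<", 8: "=="}

# The second data set from the question (unary "not" in the fifth record).
DATA = """2,4,0,7,77,2,
2,1,1,48,37,7,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
1,3,2,49,58,0,
-1,-1,-1,-1,-1,-1,
2,2,3,101,59,2,
2,1,4,62,19,5,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
"""

def load_tree(lines):
    # Same behaviour as the first load_tree in the question:
    # every non -1 record reads TWO children, whatever its arity is.
    x = [int(i) for i in next(lines).split(',')[:-1]]
    if x[0] == -1:
        return None
    p = Node()
    p.arity, p.left_col, p.right_col, p.operator_ = x[0], x[3], x[4], x[5]
    p.left = load_tree(lines)
    p.right = load_tree(lines)
    return p

root = load_tree(iter(io.StringIO(DATA)))
not_node = root.right  # the unary node

print(OPERATORS[not_node.operator_], "arity:", not_node.arity)
print("left child:", not_node.left)
print("right child operator:", OPERATORS[not_node.right.operator_])
print("right child's left operator:", OPERATORS[not_node.right.left.operator_])
```

The rest of the file is in the tree, attached as the right child of the unary node.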

print_tree, on the other hand, ignores the right child of unary operations.
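One way to make the two methods agree is to read only as many children as the arity says, and then check whether any records were left unread. The sketch below assumes that a unary node is meant to have a left child only; the standalone functions and the names DATA and unread are illustrative, not part of the original program.

```python
import io

class Node:
    def __init__(self):
        self.left = None
        self.right = None
        self.arity = 0
        self.left_col = 0
        self.right_col = 0
        self.operator_ = 0

OPERATORS = {0: "not", 1: "shift", 2: "+", 3: "-", 4: "*",
             5: "/", 6: ">", 7: "<", 8: "=="}

# The second data set from the question, unchanged.
DATA = """2,4,0,7,77,2,
2,1,1,48,37,7,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
1,3,2,49,58,0,
-1,-1,-1,-1,-1,-1,
2,2,3,101,59,2,
2,1,4,62,19,5,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
"""

def load_tree(lines):
    # Read one record, then one child record per operand (assumed format).
    x = [int(i) for i in next(lines).split(',')[:-1]]
    if x[0] == -1:
        return None
    p = Node()
    p.arity, p.left_col, p.right_col, p.operator_ = x[0], x[3], x[4], x[5]
    p.left = load_tree(lines)
    if p.arity == 2:
        p.right = load_tree(lines)
    return p

def print_tree(tree, ncol):
    if tree is None:
        return "v" + str(ncol)
    if tree.arity == 1:
        return OPERATORS[tree.operator_] + "(" + \
            print_tree(tree.left, tree.left_col) + ")"
    # Anything else is treated as binary; the data only uses arity 1 and 2.
    return "(" + print_tree(tree.left, tree.left_col) + \
        OPERATORS[tree.operator_] + \
        print_tree(tree.right, tree.right_col) + ")"

lines = iter(io.StringIO(DATA))
root = load_tree(lines)
unread = sum(1 for _ in lines)  # records that never became part of the tree

formula = "y = " + print_tree(root, 0)
print(formula)
print("unread records:", unread)
```

With this data set the formula is still the short one, and the last five records are never read. That suggests the file layout and the loader disagree about how many child records follow a unary node, so whichever convention is intended, the file has to be written with the same one.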