Symbolic regression tree stops early when a unary operator is used
I am working on a symbolic regression algorithm, and I currently have this node class:
class Node:
    def __init__(self):
        self.left = None
        self.right = None
        self.arity = 0
        self.left_col = 0
        self.right_col = 0
        self.operator_ = 0
For the tree to be printed, we can use this test set:
2,4,0,7,77,2,
2,1,1,48,37,7,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
2,3,2,49,58,2,
-1,-1,-1,-1,-1,-1,
2,2,3,101,59,2,
2,1,4,62,19,5,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
My program loads the file and creates nodes from the test-set data, then prints the tree, as you can see:
class Program:
    def __init__(self):
        self.tree = None

    def load_tree(self, p, file):
        line = file.readline()
        x = line.split(',')
        x.pop(-1)
        x = [int(i) for i in x]
        print(x)
        if (x[0] == -1):
            return None
        else:
            p.arity = x[0]
            p.left_col = x[3]
            p.right_col = x[4]
            p.operator_ = x[5]
            p.left = Node()
            p.right = Node()
            p.left = self.load_tree(p.left, file)
            p.right = self.load_tree(p.right, file)
            return p

    def load_program(self, PROGNAME):
        file = open(str(PROGNAME) + ".csv", "r")
        if self.tree == None:
            self.tree = Node()
        self.load_tree(self.tree, file)
        file.close()

    def print_tree(self, tree, ncol):
        result = ""
        operators = {0: "not", 1: "shift", 2: "+", 3: "-", 4: "*",
                     5: "/", 6: ">", 7: "<", 8: "=="}
        if tree == None:
            result += "v" + str(ncol)
        elif tree.arity == 1:
            result += operators[tree.operator_] + \
                "(" + self.print_tree(tree.left, tree.left_col) + ")"
        elif tree.arity == 2:
            result += "(" + self.print_tree(tree.left, tree.left_col) + \
                operators[tree.operator_] + \
                self.print_tree(tree.right, tree.right_col) + ")"
        return result

    def print_program(self):
        y = "y = " + self.print_tree(self.tree, 0)
        print(y)


prog = Program()
# load_program
prog.load_program(0)
# print_program
prog.print_program()
The printed formula is:
y = ((v48<v37)+(v49+((v62/v19)+v59)))
But now suppose we use this dataset, which has a unary operator in the third row (not counting the -1 rows):
2,4,0,7,77,2,
2,1,1,48,37,7,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
1,3,2,49,58,0,
-1,-1,-1,-1,-1,-1,
2,2,3,101,59,2,
2,1,4,62,19,5,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
This time the output stops early. Here is what is printed:
y = ((v48<v37)+not(v49))
Why does this happen, and how can I fix it so that the whole expression is printed?
Thanks in advance!
Edit:
I changed load_tree by adding if (x[0] == 1): so that it looks like this, but I get the same output:
def load_tree(self, p, file):
    line = file.readline()
    x = line.split(',')
    x.pop(-1)
    x = [int(i) for i in x]
    print(x)
    if (x[0] == -1):
        return None
    if (x[0] == 1):
        p.arity = x[0]
        p.left_col = x[3]
        p.operator_ = x[5]
        p.left = Node()
        p.left = self.load_tree(p.left, file)
    elif (x[0] == 2):
        p.arity = x[0]
        p.left_col = x[3]
        p.right_col = x[4]
        p.operator_ = x[5]
        p.left = Node()
        p.right = Node()
        p.left = self.load_tree(p.left, file)
        p.right = self.load_tree(p.right, file)
    return p
y = ((v48<v37)+not(v49))
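The code in load_tree ignores arity: it expects both a left and a right child even for the "not" operator. The right child of "not" is the remaining "+" operation, which in turn has the "/" operation as its left child.
print_tree, on the other hand, ignores the right child of unary operations.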
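To make that mismatch visible, here is a minimal diagnostic sketch. It is not the asker's program: the DATA string, the io.StringIO stand-in for the 0.csv file, the Node constructor arguments, and the dump helper are all assumptions introduced for illustration. The loader reads two children for every node, as the original load_tree does, and dump walks both children regardless of arity, so the subtree hanging off the right side of "not" shows up.

```python
import io

# The second dataset from the question, held in memory instead of 0.csv
# (an assumption made so the sketch is self-contained).
DATA = """\
2,4,0,7,77,2,
2,1,1,48,37,7,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
1,3,2,49,58,0,
-1,-1,-1,-1,-1,-1,
2,2,3,101,59,2,
2,1,4,62,19,5,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,
"""

OPERATORS = {0: "not", 1: "shift", 2: "+", 3: "-", 4: "*",
             5: "/", 6: ">", 7: "<", 8: "=="}


class Node:
    # Hypothetical constructor; the question's Node takes no arguments.
    def __init__(self, arity, left_col, right_col, operator_):
        self.arity = arity
        self.left_col = left_col
        self.right_col = right_col
        self.operator_ = operator_
        self.left = None
        self.right = None


def load_tree(file):
    """Read one row and, like the original loader, always read two children."""
    x = [int(i) for i in file.readline().split(",")[:-1]]
    if x[0] == -1:
        return None
    node = Node(x[0], x[3], x[4], x[5])
    node.left = load_tree(file)
    node.right = load_tree(file)
    return node


def dump(node, col, depth=0):
    """Return one line per node or leaf, ignoring arity so nothing is hidden."""
    pad = "  " * depth
    if node is None:
        return [pad + "v" + str(col)]
    lines = [pad + OPERATORS[node.operator_] + " (arity " + str(node.arity) + ")"]
    lines += dump(node.left, node.left_col, depth + 1)
    lines += dump(node.right, node.right_col, depth + 1)
    return lines


tree = load_tree(io.StringIO(DATA))
print("\n".join(dump(tree, 0)))
```

In the dump, the "+" node and the "/" node beneath it appear as the second child of "not". That is the part print_tree never visits when arity is 1. Whether that subtree should be printed, attached elsewhere, or left out of the file depends on how the data format is meant to encode unary nodes, which the question does not specify.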