How to remove AST nodes with pycparser?
Let's start by considering this snippet:
import sys
from pycparser import c_parser, c_ast, c_generator

text = r"""
void main() {
    foo(1,3);
    foo1(4);
    x = 1;
    foo2(4,
        10,
        3);
    foo3(
        "xxx"
    );
}
"""


class FuncCallVisitor(c_ast.NodeVisitor):
    def visit_FuncCall(self, node):
        print('%s called at %s' % (node.name.name, node.name.coord))
        if node.args:
            self.visit(node.args)


class RemoveFuncCalls(c_generator.CGenerator):
    def visit_FuncCall(self, n):
        # fref = self._parenthesize_unless_simple(n.name)
        # return fref + '(' + self.visit(n.args) + ')'
        return ""


if __name__ == '__main__':
    parser = c_parser.CParser()
    ast = parser.parse(text)
    v = FuncCallVisitor()
    v.visit(ast)
    print('-' * 80)
    ast.show(showcoord=True)
    generator = RemoveFuncCalls()
    print('-' * 80)
    print(generator.visit(ast))
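The output of the above (the generated code after the last separator) will be:
void main()
{
  ;
  ;
  x = 1;
  ;
  ;
}
But I'd like it to become this instead:
void main()
{
  x = 1;
}
So my question is: what is the canonical/idiomatic way to delete nodes/subtrees from the AST with pycparser?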
It seems that c_generator.CGenerator calls the _generate_stmt method for scope-like structures, which appends ';\n' (preceded by indentation) to the result of visit for each statement, even if that result is an empty string.
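To see where the stray semicolons come from, here is a minimal sketch that calls both methods on a single statement node. The class name BlankFuncCalls and the one-line C snippet are made up for illustration, and _generate_stmt is a private method, so its behavior may differ between pycparser versions.

```python
from pycparser import c_parser, c_generator


class BlankFuncCalls(c_generator.CGenerator):
    """Hypothetical generator that serializes every call as an empty string."""

    def visit_FuncCall(self, n):
        return ''


ast = c_parser.CParser().parse("void main() { foo(1, 3); x = 1; }")
# First statement in the body of main(): the foo(1, 3) call.
call = ast.ext[0].body.block_items[0]

gen = BlankFuncCalls()
as_expression = gen.visit(call)          # what visit_FuncCall returns
as_statement = gen._generate_stmt(call)  # what a Compound body emits for it

print(repr(as_expression))
print(repr(as_statement))
```

The first value is empty, but the statement form still carries the terminator, which is why blanking visit_FuncCall alone leaves lines containing only ";".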
To remove the function calls, we can override _generate_stmt like this:
class RemoveFuncCalls(c_generator.CGenerator):
    def _generate_stmt(self, n, add_indent=False):
        if isinstance(n, c_ast.FuncCall):
            return ''
        else:
            return super()._generate_stmt(n, add_indent)
With that, we get
void main()
{
  x = 1;
}
which looks like what you want.
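Now consider a case like
if (bar(42, "something"))
    return;
If we need it to become
if ()
    return;
then we also need to add to RemoveFuncCalls
    def visit_FuncCall(self, n):
        return ''
as in the OP, because the RemoveFuncCalls.visit_If method does not call _generate_stmt to serialize the cond field.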
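Putting the two overrides together, a self-contained sketch might look like the following. The C source is an illustrative input, not taken from the question, and the result is deliberately the same (not valid C) "if ()" form described above.

```python
from pycparser import c_parser, c_ast, c_generator


class RemoveFuncCalls(c_generator.CGenerator):
    def _generate_stmt(self, n, add_indent=False):
        # Drop calls that stand alone as statements, terminator included.
        if isinstance(n, c_ast.FuncCall):
            return ''
        return super()._generate_stmt(n, add_indent)

    def visit_FuncCall(self, n):
        # Blank out calls nested inside other constructs, e.g. an if condition.
        return ''


source = r"""
void main() {
    foo(1, 3);
    if (bar(42, "something"))
        return;
}
"""

ast = c_parser.CParser().parse(source)
generated = RemoveFuncCalls().visit(ast)
print(generated)
```

Both overrides are needed here: the first handles call statements, the second handles calls that are sub-expressions of another node.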
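Going further
I don't know what the "canonical/idiomatic way to delete nodes/subtrees from the AST with pycparser" is, but I do know one from the stdlib ast module: the ast.NodeTransformer class (which, for some reason, has no counterpart in pycparser).
It lets us modify the AST itself instead of overriding private methods to tamper with how the AST is serialized to str: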
from pycparser import c_ast


class NodeTransformer(c_ast.NodeVisitor):
    def generic_visit(self, node):
        for field, old_value in iter_fields(node):
            if isinstance(old_value, list):
                new_values = []
                for value in old_value:
                    if isinstance(value, c_ast.Node):
                        value = self.visit(value)
                        if value is None:
                            continue
                        elif not isinstance(value, c_ast.Node):
                            new_values.extend(value)
                            continue
                    new_values.append(value)
                old_value[:] = new_values
            elif isinstance(old_value, c_ast.Node):
                new_node = self.visit(old_value)
                setattr(node, field, new_node)
        return node


def iter_fields(node):
    # this doesn't look pretty because `pycparser` decided to have structure
    # for AST node classes different from stdlib ones
    index = 0
    children = node.children()
    while index < len(children):
        name, child = children[index]
        try:
            bracket_index = name.index('[')
        except ValueError:
            yield name, child
            index += 1
        else:
            name = name[:bracket_index]
            child = getattr(node, name)
            index += len(child)
            yield name, child
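For reference, this is roughly how the stdlib ast.NodeTransformer that the class above is modelled on behaves on Python source. It is only a sketch for comparison: the class name ExprCallRemover and the sample input are made up, and ast.unparse needs Python 3.9 or newer.

```python
import ast


class ExprCallRemover(ast.NodeTransformer):
    """Hypothetical transformer that drops statements consisting of a bare call."""

    def visit_Expr(self, node):
        # Returning None from a visit_* method removes the node from its parent.
        if isinstance(node.value, ast.Call):
            return None
        return self.generic_visit(node)


source = "foo(1, 3)\nx = 1\nfoo3('xxx')\n"
tree = ExprCallRemover().visit(ast.parse(source))
ast.fix_missing_locations(tree)
result = ast.unparse(tree)
print(result)
```

The sketch removes the enclosing Expr statement rather than the Call itself, because in the stdlib version a single-valued field whose visitor returns None is deleted from its parent, which would leave a statement without a value.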
For our case, the NodeTransformer defined above can simply be subclassed:
class FuncCallsRemover(NodeTransformer):
    def visit_FuncCall(self, node):
        return None
and used like this:
...
ast = parser.parse(text)
v = FuncCallsRemover()
ast = v.visit(ast)  # note that `NodeTransformer` returns modified AST instead of `None`
After that, we can use an unmodified c_generator.CGenerator instance and get the same result.
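If only calls that stand alone as statements need to go, a narrower alternative is to filter the statement list of each compound block with a plain c_ast.NodeVisitor. This is not part of the approach above, just a sketch under that assumption; the class name StatementCallRemover is made up, and calls used as sub-expressions (or as the unbraced body of an if) are left untouched.

```python
from pycparser import c_parser, c_ast, c_generator


class StatementCallRemover(c_ast.NodeVisitor):
    """Hypothetical visitor that edits Compound.block_items in place."""

    def visit_Compound(self, node):
        if node.block_items:
            node.block_items = [
                stmt for stmt in node.block_items
                if not isinstance(stmt, c_ast.FuncCall)
            ]
        # Keep descending so nested blocks are filtered as well.
        self.generic_visit(node)


source = r"""
void main() {
    foo(1, 3);
    x = 1;
    foo3("xxx");
}
"""

ast = c_parser.CParser().parse(source)
StatementCallRemover().visit(ast)
generated = c_generator.CGenerator().visit(ast)
print(generated)
```

Because the tree itself is modified, the stock c_generator.CGenerator can be used for output without any overrides.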