How do I get to the bottom of an AST Expression

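I'm new to the AST (this is the first time I'm writing a plugin). Expressions can be quite complex in real life. I want to know, for instance, how to resolve the left-hand and right-hand sides of an assignment.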

class Visitor extends ASTVisitor
{
    @Override
    public boolean visit(Assignment node)
    {
        //here, how do I get the final name that each side of the assignment resolves to?
    }
}

I also have another doubt: how do I get the instance used to invoke a method?

public boolean visit(MethodInvocation node)
{
    //how do I get to know the object used to invoke this method?
    //like, for example, MyClass is a class, and it has a field called myField
    //the type of myField has a method called myMethod.
    //how do I find myField? or for that matter some myLocalVariable used in the same way.
}

Suppose the following assignment:

SomeType.someStaticMethod(params).someInstanceMethod(moreParams).someField =
     [another expression with arbitrary complexity]

How do I get to someField from the Assignment node?

Also, which property of a MethodInvocation gives the instance used to invoke the method?

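EDIT 1: Judging by the answer I received, my question was apparently unclear. I don't want to solve this particular expression. Given any assignment, I want to be able to find out the name that is being assigned to, and the name (if it is not an rvalue) that is assigned to the first.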

So, for instance, the parameters of the method invocations could be field accesses or previously declared local variables.

SomeType.someStaticMethod(instance.field).someInstanceMethod(type.staticField, localVariable, localField).Field.destinationField

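So, here is a hopefully objective question: given any assignment statement with both left and right sides of arbitrary complexity, how do I get the ultimate field/variable that gets assigned to, and the ultimate field/variable (if any) that is assigned to it?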

EDIT 2: To be more specific, what I want to achieve is immutability, via an annotation @Const:

import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/**
* When Applied to a method, ensures the method doesn't change in any
* way the state of the object used to invoke it, i.e., all the fields
* of the object must remain the same, and no field may be returned,
* unless the field itself is marked as {@code @Const} or the field is
* a primitive non-array type. A method annotated with {@code @Const}
* can only invoke other {@code @Const} methods of its class, can only
* use the class's fields to invoke {@code @Const} methods of the fields'
* classes and can only pass fields as parameters to methods that
* annotate that formal parameter as {@code @Const}.
*
* When applied to a formal parameter, ensures the method will not
* modify the value referenced by the formal parameter. A formal   
* parameter annotated as {@code @Const} will not be aliased inside the
* body of the method. The method is not allowed to invoke another 
* method and pass the annotated parameter, save if the other method 
* also annotates the formal parameter as {@code @Const}. The method is 
* not allowed to use the parameter to invoke any of its type's methods,
* unless the method being invoked is also annotated as {@code @Const}
* 
* When applied to a field, ensures the field cannot be aliased and that
* no code can alter the state of that field, either from inside the   
* class that owns the field or from outside it. Any constructor in any
* derived class is allowed to set the value of the field and invoke any
* methods using it. As for methods, only those annotated as
* {@code @Const} may be invoked using the field. The field may only be
* passed as a parameter to a method if the method annotates the 
* corresponding formal parameter as {@code @Const}
* 
* When applied to a local variable, ensures neither the block where the
* variable is declared nor any nested block will alter the value of that
* local variable. The local variable may be defined only once, at any
* point where it is in scope and cannot be aliased. Only methods
* annotated as {@code @Const} may be invoked using this variable, and
* the variable may only be passed as a parameter to another method if
* said method annotates its corresponding formal parameter as
* {@code @Const}
*
*/
@Retention(RetentionPolicy.SOURCE)
@Target({ElementType.METHOD, ElementType.PARAMETER, ElementType.FIELD,
ElementType.LOCAL_VARIABLE})
@Inherited
public @interface Const
{

}

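To achieve this, the first thing I have to do is catch the cases where the left side of an assignment is marked as @Const (easy enough). I must also detect when the right side of an expression is a field marked as @Const, in which case it can only be assigned at the definition of a @Const variable of the same type.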

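The problem is that I'm having a really hard time finding the ultimate field on the right side of the expression, which I need in order to avoid aliasing the field and rendering the @Const annotation useless.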

You can visit node.getLeftHandSide() further. I think a good example can be found in the Sharpen (Java-to-C# translator) code:

https://github.com/mono/sharpen/blob/master/src/main/sharpen/core/CSharpBuilder.java#L2848

Here is a simple sample project: https://github.com/revaultch/jdt-sample

With a JUnit test here: https://github.com/revaultch/jdt-sample/blob/master/src/test/java/ch/revault/jdt/test/VisitorTest.java

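Visitors are a really great tool, but the proper solution to a specific problem is not always to have a single visitor patiently wait for its visit methods to be called... The question you are asking is an example of such a situation.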

Let's rephrase what you are trying to do:

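  1. You want to identify every assignment (that is, leftSide = rightSide).

  2. For each assignment, you want to determine the nature of the left-hand side (that is, whether it is a local variable or a field access). If it is indeed a field access, you want to build a "path" corresponding to that field (that is, the source object, followed by a series of method calls or field accesses, and ending with a field access).

  3. For each assignment, you want to determine a similar "path" corresponding to the right-hand side.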

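I think you have already resolved point 1: you simply create a class that extends org.eclipse.jdt.core.dom.ASTVisitor and override the #visit(Assignment) method there. Finally, wherever appropriate, you instantiate your visitor class and have it visit an AST tree, starting at whichever node matches your needs (most likely an instance of CompilationUnit, TypeDeclaration or MethodDeclaration).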

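Then what? The #visit(Assignment) method indeed receives an Assignment node. Directly on that object, you can obtain both the left-hand side and the right-hand side expressions (assignment.getLeftHandSide() and assignment.getRightHandSide()). As you mentioned, both are Expressions, which can turn out to be pretty complicated, so how can we extract a clean, linear "path" out of these subtrees? A visitor is certainly the best way to do so, but here is the catch: it should be done using distinct visitors, rather than having your first visitor (the one catching Assignments) continue its descent into either side's expression. It is technically possible to do it all with a single visitor, but that would involve significant state management inside that visitor. I am in any case pretty convinced that the complexity of such management would be so high that such an implementation would actually be less efficient than the distinct-visitors approach.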

So we could imagine something like this:

class MyAssignmentListVisitor extends ASTVisitor {
    @Override
    public boolean visit(Assignment assignment) {
        // Linearize each side with its own, fresh visitor instance
        FieldAccessLinearizationVisitor leftHandSideVisitor = new FieldAccessLinearizationVisitor();
        assignment.getLeftHandSide().accept(leftHandSideVisitor);
        LinearFieldAccess leftHandSidePath = leftHandSideVisitor.asLinearFieldAccess();

        FieldAccessLinearizationVisitor rightHandSideVisitor = new FieldAccessLinearizationVisitor();
        assignment.getRightHandSide().accept(rightHandSideVisitor);
        LinearFieldAccess rightHandSidePath = rightHandSideVisitor.asLinearFieldAccess();

        processAssignment(leftHandSidePath, rightHandSidePath);

        // Keep descending: an assignment may contain other assignments
        return true;
    }
}

class FieldAccessLinearizationVisitor extends ASTVisitor {

    List<?> significantFieldAccessParts = [...];

    // ... various visit methods expecting concrete subtypes of Expression ...

    @Override
    public boolean visit(Assignment assignment) {
        // Found an assignment inside an assignment; ignore its
        // left hand side, as it does not affect the "path" for
        // the assignment currently being investigated

        assignment.getRightHandSide().accept(this);

        return false;
    }
}

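Note that in this code MyAssignmentListVisitor.visit(Assignment) returns true, to indicate that the children of the assignment should be inspected recursively. This may sound unnecessary at first, but the Java language does support several constructs in which an assignment may contain other assignments; consider for example the following extreme case: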

(varA = someObject).someField = varB = (varC = new SomeClass(varD = "string").someField);

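For the same reason, only the right-hand side of an assignment is visited during the linearization of an expression, given that the "resulting value" of an assignment is its right-hand side. The left-hand side is, in this case, a mere side effect that can safely be ignored.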
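To make the nested-assignment case concrete, the extreme statement can be run as plain Java. This is only an illustrative sketch: SomeClass here is a hypothetical stand-in with a single String field, and the variable types are assumptions chosen so that the statement compiles.

```java
// Hypothetical stand-in for SomeClass from the extreme case above.
class SomeClass {
    String someField;

    SomeClass(String value) {
        this.someField = value;
    }
}

class NestedAssignmentDemo {
    static String[] run() {
        SomeClass someObject = new SomeClass("initial");
        SomeClass varA;
        String varB;
        String varC;
        String varD;

        // Same shape as the extreme case: each nested assignment is an
        // expression whose value is the value that was just assigned.
        (varA = someObject).someField = varB = (varC = new SomeClass(varD = "string").someField);

        return new String[] { varA.someField, varB, varC, varD };
    }

    public static void main(String[] args) {
        // prints "string,string,string,string"
        System.out.println(String.join(",", run()));
    }
}
```

Every variable ends up holding "string", which is why a linearizing visitor that meets a nested assignment only needs to follow its right-hand side to find the value that flows outward.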

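I will not go any further in prototyping how paths are actually modelled, given that I do not know what information your specific situation requires. It may also be more appropriate for you to create distinct visitor classes for left-hand side and right-hand side expressions, for example to better handle the fact that the right-hand side might actually involve multiple variables/fields/method invocations combined through binary operators. That will have to be your decision.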

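There is still a major concern about visitor traversal of the AST tree to be discussed: by relying on the default node traversal order, you lose the opportunity to get information about the relationship between nodes. For example, given the expression this.someMethod(this.fieldA).fieldB, you will see something similar to the following sequence: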

FieldAccess      => corresponding to the whole expression
MethodInvocation => corresponding to this.someMethod(this.fieldA)
ThisExpression
SimpleName ("someMethod")
FieldAccess      => corresponding to this.fieldA
ThisExpression
SimpleName ("fieldA")
SimpleName ("fieldB")

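There is no way you can actually deduce the linearized expression from this sequence of events. Instead, you will want to explicitly intercept every node, and explicitly recurse into a node's children only where appropriate, and in the appropriate order. For example, we could do the following: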

    @Override
    public boolean visit(FieldAccess fieldAccess) {
        // FieldAccess :: <expression>.<name>

        // First descend on the "subject" of the field access
        fieldAccess.getExpression().accept(this);

        // Then append the name of the accessed field itself
        this.path.append(fieldAccess.getName().getIdentifier());

        return false;
    }

    @Override
    public boolean visit(MethodInvocation methodInvocation) {
        // MethodInvocation :: [<expression>.][<typeArguments>]<methodName>(<arguments>)

        // First descend on the "subject" of the method invocation
        // (it is null for an unqualified call such as someMethod())
        if (methodInvocation.getExpression() != null) {
            methodInvocation.getExpression().accept(this);
        }

        // Then append the name of the invoked method itself
        this.path.append(methodInvocation.getName().getIdentifier() + "()");

        return false;
    }

    @Override
    public boolean visit(ThisExpression thisExpression) {
        // ThisExpression :: [<qualifier>.] this

        // I will ignore the qualifier part for now, it will be up
        // to you to determine if it is pertinent
        this.path.append("this");

        return false;
    }

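Given the preceding example, these methods will collect the following sequence in path: this, someMethod(), fieldB. This is, I believe, pretty close to what you are looking for. Should you want to collect all field access/method invocation sequences (for example, you would like your visitor to return both this, someMethod(), fieldB and this, fieldA), you could rewrite the visit(MethodInvocation) method roughly like this: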

    @Override
    public boolean visit(MethodInvocation methodInvocation) {
        // MethodInvocation :: [<expression>.][<typeArguments>]<methodName>(<arguments>)

        // First descend on the "subject" of the method invocation
        // (it is null for an unqualified call such as someMethod())
        if (methodInvocation.getExpression() != null) {
            methodInvocation.getExpression().accept(this);
        }

        // Then append the name of the invoked method itself
        this.path.append(methodInvocation.getName().getIdentifier() + "()");

        // Now deal with method arguments, each within its own, distinct access chain.
        // MethodInvocation.arguments() returns a raw List of Expression nodes.
        for (Object arg : methodInvocation.arguments()) {
            LinearPath originalPath = this.path;
            this.path = new LinearPath();

            ((Expression) arg).accept(this);

            this.collectedPaths.append(this.path);
            this.path = originalPath;
        }

        return false;
    }
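The explicit-recursion idea can be tried without Eclipse on the classpath. The following is a minimal sketch that assumes a toy expression model: Expr, ThisExpr, FieldAccessExpr, MethodCallExpr and PathLinearizer are hypothetical names invented for this illustration, not JDT classes. It descends into the subject first, appends the member name, and gives each method argument its own path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy stand-ins for JDT nodes; these are NOT the org.eclipse.jdt.core.dom classes.
interface Expr {}

final class ThisExpr implements Expr {}

final class FieldAccessExpr implements Expr {
    final Expr subject;
    final String name;

    FieldAccessExpr(Expr subject, String name) {
        this.subject = subject;
        this.name = name;
    }
}

final class MethodCallExpr implements Expr {
    final Expr subject;          // null for an unqualified call
    final String name;
    final List<Expr> arguments;

    MethodCallExpr(Expr subject, String name, List<Expr> arguments) {
        this.subject = subject;
        this.name = name;
        this.arguments = arguments;
    }
}

final class PathLinearizer {
    // Every finished path; argument paths are completed (and stored) first.
    final List<List<String>> collectedPaths = new ArrayList<>();

    List<String> linearize(Expr root) {
        List<String> path = new ArrayList<>();
        visit(root, path);
        collectedPaths.add(path);
        return path;
    }

    private void visit(Expr node, List<String> path) {
        if (node instanceof ThisExpr) {
            path.add("this");
        } else if (node instanceof FieldAccessExpr) {
            FieldAccessExpr access = (FieldAccessExpr) node;
            visit(access.subject, path);      // subject first
            path.add(access.name);            // then the field name
        } else if (node instanceof MethodCallExpr) {
            MethodCallExpr call = (MethodCallExpr) node;
            if (call.subject != null) {
                visit(call.subject, path);
            }
            path.add(call.name + "()");
            for (Expr argument : call.arguments) {
                linearize(argument);          // each argument gets a distinct path
            }
        } else {
            throw new IllegalArgumentException("Unsupported node: " + node);
        }
    }
}

class LinearizerDemo {
    // Builds this.someMethod(this.fieldA).fieldB
    static Expr sample() {
        Expr argument = new FieldAccessExpr(new ThisExpr(), "fieldA");
        Expr call = new MethodCallExpr(new ThisExpr(), "someMethod", Arrays.asList(argument));
        return new FieldAccessExpr(call, "fieldB");
    }

    public static void main(String[] args) {
        PathLinearizer linearizer = new PathLinearizer();
        System.out.println(linearizer.linearize(sample()));   // [this, someMethod(), fieldB]
        System.out.println(linearizer.collectedPaths);        // [[this, fieldA], [this, someMethod(), fieldB]]
    }
}
```

Passing the current path as a parameter, instead of swapping a field back and forth, is a design choice of this sketch; it avoids having to restore state after each argument.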

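Finally, if you are interested in knowing the type of the value at each step in the path, you will have to look at the binding objects associated with each node, for example: methodInvocation.resolveMethodBinding().getDeclaringClass(). Note however that binding resolution must have been explicitly requested when the AST tree was built (for example by calling ASTParser.setResolveBindings(true) before creating the AST); otherwise the resolve methods return null.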

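There are many more language constructs that the above code does not handle correctly; still, I believe you should be able to work out those remaining cases by yourself. Should you need a reference implementation to look at, check out the class org.eclipse.jdt.internal.core.dom.rewrite.ASTRewriteFlattener, which basically reconstructs Java source code from an existing AST tree; although this specific visitor is much larger than most other ASTVisitors, it is comparatively easy to understand.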

UPDATE IN RESPONSE TO OP'S EDIT #2

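Here is an updated starting point following your most recent edit. There are still many cases left to handle, but those are more specific to your problem. Note also that although I used many instanceof checks (because that is easier for me at the moment, given that I am writing the code in a simple text editor and have no code completion for the ASTNode constants), you may opt instead for a switch statement on node.getNodeType(), which will usually be more efficient.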

class ConstCheckVisitor extends ASTVisitor {

    @Override
    public boolean visit(MethodInvocation methodInvocation) {    
        if (isConst(methodInvocation.getExpression())) {
            // A @Const subject may only be used to invoke @Const methods
            if (!isConst(methodInvocation.resolveMethodBinding().getMethodDeclaration()))
                reportInvokingNonConstMethodOnConstSubject(methodInvocation);
        }

        return true;
    }

    @Override
    public boolean visit(Assignment assignment) {
        if (isConst(assignment.getLeftHandSide())) {
            if ( /* assignment to @Const value is not acceptable in the current situation */ )
                reportAssignmentToConst(assignment.getLeftHandSide());

            // FIXME: I assume here that aliasing a @Const value to
            //        another @Const value is acceptable. Is that right?

        } else if (isImplicitConst(assignment.getLeftHandSide())) {
            reportAssignmentToImplicitConst(assignment.getLeftHandSide());        

        } else if (isConst(assignment.getRightHandSide())) {
            reportAliasing(assignment.getRightHandSide());
        }

        return true;
    }

    private boolean isConst(Expression expression) {
        if (expression instanceof FieldAccess)
            return (isConst(((FieldAccess) expression).resolveFieldBinding()));

        if (expression instanceof SuperFieldAccess)
            return isConst(((SuperFieldAccess) expression).resolveFieldBinding());

        if (expression instanceof Name)
            return isConst(((Name) expression).resolveBinding());

        if (expression instanceof ArrayAccess)
            return isConst(((ArrayAccess) expression).getArray());

        if (expression instanceof Assignment)
            return isConst(((Assignment) expression).getRightHandSide());

        return false;
    }

    private boolean isImplicitConst(Expression expression) {
        // Check if field is actually accessed through a @Const chain
        if (expression instanceof FieldAccess)
            return isConst(((FieldAccess) expression).getExpression()) ||
                   isImplicitConst(((FieldAccess) expression).getExpression());

        // FIXME: Not sure about the effect of MethodInvocation, assuming
        //        that its subject is const or implicitly const

        return false;
    }

    private boolean isConst(IBinding binding) {
        if ((binding instanceof IVariableBinding) || (binding instanceof IMethodBinding))
            return containsConstAnnotation(binding.getAnnotations());

        return false;
    }
}
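The "implicitly const" rule is easier to check in isolation. The sketch below assumes a toy model in which a boolean flag stands in for "the resolved binding carries @Const"; Node, VariableRef, FieldRef and ConstRules are hypothetical names for this illustration only, not JDT types.

```java
// Toy expression model; hypothetical stand-ins for JDT's Name and FieldAccess nodes.
abstract class Node {
    // Stands in for "the resolved binding carries the @Const annotation".
    final boolean annotatedConst;

    Node(boolean annotatedConst) {
        this.annotatedConst = annotatedConst;
    }
}

final class VariableRef extends Node {
    final String name;

    VariableRef(String name, boolean annotatedConst) {
        super(annotatedConst);
        this.name = name;
    }
}

final class FieldRef extends Node {
    final Node subject;
    final String name;

    FieldRef(Node subject, String name, boolean annotatedConst) {
        super(annotatedConst);
        this.subject = subject;
        this.name = name;
    }
}

final class ConstRules {
    static boolean isConst(Node node) {
        return node.annotatedConst;
    }

    // A field is implicitly const when anything earlier in its access
    // chain is const, directly or through a longer chain.
    static boolean isImplicitConst(Node node) {
        if (node instanceof FieldRef) {
            Node subject = ((FieldRef) node).subject;
            return isConst(subject) || isImplicitConst(subject);
        }
        return false;
    }
}

class ConstRulesDemo {
    public static void main(String[] args) {
        Node constRoot = new VariableRef("config", true);
        // config.inner.value, where only config is annotated
        Node nested = new FieldRef(new FieldRef(constRoot, "inner", false), "value", false);
        // other.value, with no annotation anywhere
        Node plain = new FieldRef(new VariableRef("other", false), "value", false);

        System.out.println(ConstRules.isImplicitConst(nested)); // true
        System.out.println(ConstRules.isImplicitConst(plain));  // false
    }
}
```

In the real visitor the flag would come from the annotations on the resolved binding, so this only demonstrates the recursion along the access chain.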

Hope this helps.

First, a quote from an answer I posted:

You will have to work with bindings. To have bindings available, that means resolveBinding() not returning null, possibly additional steps I have posted are necessary.

The following visitor should help you do what you want:

class AssignmentVisitor extends ASTVisitor {

    public boolean visit(Assignment node) {
        ensureConstAnnotationNotViolated(node);
        return super.visit(node);
    }

    private void ensureConstAnnotationNotViolated(Assignment node) {
        Expression leftHandSide = node.getLeftHandSide();
        if (leftHandSide.getNodeType() == ASTNode.FIELD_ACCESS) {
            FieldAccess fieldAccess = (FieldAccess) leftHandSide;
            // access field IVariableBinding
            fieldAccess.resolveFieldBinding();
            // access IAnnotationBindings e.g. your @Const
            fieldAccess.resolveFieldBinding().getAnnotations();
            // access field ITypeBinding
            fieldAccess.getExpression().resolveTypeBinding();
        } else {
            // TODO: check possible other cases
        }

    }
}