What is the point of accept() method in Visitor pattern?


Design Patterns Problem Overview


There is a lot of talk about decoupling the algorithms from the classes. But one thing is left unexplained.

The usual examples use the visitor like this:

abstract class Expr {
  // Each concrete subclass (Num, Sum, Prod) implements this as: return visitor.visit(this);
  public abstract <T> T accept(Visitor<T> visitor);
}

class ExprVisitor extends Visitor<Integer> {
  public Integer visit(Num num) {
    return num.value;
  }

  public Integer visit(Sum sum) {
    return sum.getLeft().accept(this) + sum.getRight().accept(this);
  }

  public Integer visit(Prod prod) {
    return prod.getLeft().accept(this) * prod.getRight().accept(this);
  }
}

Instead of calling visit(element) directly, the visitor asks the element to call the visitor's visit method. This contradicts the stated idea that the classes should be unaware of visitors.

PS1 Please explain in your own words or point to an exact explanation, because the two responses I got refer to something general and vague.

PS2 My guess: Since getLeft() returns the basic Expression, calling visit(getLeft()) would result in visit(Expression), whereas getLeft() calling visit(this) will result in another, more appropriate, visit invocation. So, accept() performs the type conversion (aka casting).

PS3 Scala's Pattern Matching = Visitor Pattern on Steroid shows how much simpler the Visitor pattern is without the accept method. Wikipedia adds to this statement by linking a paper showing "that accept() methods are unnecessary when reflection is available; introduces term 'Walkabout' for the technique."

Design Patterns Solutions


Solution 1 - Design Patterns

The visitor pattern's visit/accept constructs are a necessary evil due to C-like languages' (C#, Java, etc.) semantics. The goal of the visitor pattern is to use double-dispatch to route your call as you'd expect from reading the code.

Normally when the visitor pattern is used, an object hierarchy is involved where all the nodes are derived from a base Node type, referred to henceforth as Node. Instinctively, we'd write it like this:

Node root = GetTreeRoot();
new MyVisitor().visit(root);

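Herein lies the problem. Suppose our MyVisitor class were defined like this:

class MyVisitor implements IVisitor {
  public void visit(CarNode node) { /* handle cars */ }
  public void visit(TrainNode node) { /* handle trains */ }
  public void visit(PlaneNode node) { /* handle planes */ }
  public void visit(Node node) { /* fallback for any other Node */ }
}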

At runtime, regardless of the actual type of root, our call would go into the overload visit(Node node). This would be true for all variables declared of type Node. Why is this? Because Java and other C-like languages only consider the static type of the parameter, that is, the type the variable is declared as, when deciding which overload to call. Java doesn't take the extra step to ask, for every method call, at runtime, "Okay, what is the dynamic type of root? Oh, I see. It's a TrainNode. Let's see if there's any method in MyVisitor which accepts a parameter of type TrainNode...". The compiler determines at compile time which method will be called. (If Java did inspect the arguments' dynamic types on every call, each call would pay for that lookup and performance would suffer considerably.)
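The static-type behavior described above can be seen in a minimal sketch. It assumes a cut-down hierarchy with only Node and TrainNode; the wrapper class OverloadDemo and the String return values are illustrative additions, not part of the original example.

```java
class OverloadDemo {
    // Hypothetical, cut-down hierarchy used only for illustration.
    static class Node { }
    static class TrainNode extends Node { }

    static class MyVisitor {
        // Overloads are chosen at compile time from the argument's static type.
        String visit(Node node)      { return "visit(Node)"; }
        String visit(TrainNode node) { return "visit(TrainNode)"; }
    }

    static String viaBaseReference() {
        Node root = new TrainNode();         // static type Node, dynamic type TrainNode
        return new MyVisitor().visit(root);  // compiler binds this call to visit(Node)
    }

    static String viaExactReference() {
        TrainNode train = new TrainNode();   // static type TrainNode
        return new MyVisitor().visit(train); // compiler binds this call to visit(TrainNode)
    }

    public static void main(String[] args) {
        System.out.println(viaBaseReference());
        System.out.println(viaExactReference());
    }
}
```

Both variables hold a TrainNode object, yet only the one declared as TrainNode reaches the specific overload.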

Java does give us one tool for taking into account the runtime (i.e. dynamic) type of an object when a method is called -- virtual method dispatch. When we call a virtual method, the call actually goes through a table in memory that consists of function pointers. Each type has a table. If a particular method is overridden by a class, that class's function table entry will contain the address of the overriding function. If the class doesn't override a method, the entry will contain a pointer to the base class's implementation. This still incurs a performance overhead (each method call will basically be dereferencing two pointers: one pointing to the type's function table, and another pointing to the function itself), but it's still faster than having to inspect parameter types.
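As a small illustration of virtual dispatch on the receiver, here is a sketch in which the method that runs depends on the object's dynamic type. The describe() method and the class VirtualDispatchDemo are assumptions made for this example only.

```java
class VirtualDispatchDemo {
    static class Node {
        String describe() { return "generic node"; }
    }

    static class TrainNode extends Node {
        @Override
        String describe() { return "train"; }  // overrides the base implementation
    }

    static class PlaneNode extends Node { }     // no override: inherits Node.describe()

    // The parameter's static type is Node, but the call is resolved
    // at runtime against the object's actual class.
    static String describe(Node n) {
        return n.describe();
    }

    public static void main(String[] args) {
        System.out.println(describe(new TrainNode()));
        System.out.println(describe(new PlaneNode()));
    }
}
```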

The goal of the visitor pattern is to accomplish double-dispatch -- not only is the type of the call target considered (MyVisitor, via virtual methods), but also the type of the parameter (what type of Node are we looking at?). The Visitor pattern allows us to do this through the visit/accept combination.

By changing our line to this:

root.accept(new MyVisitor());

We can get what we want: via virtual method dispatch, we enter the correct accept() call as implemented by the subclass -- in our example with TrainNode, we'll enter TrainNode's implementation of accept():

class TrainNode extends Node implements IVisitable {
  void accept(IVisitor v) {
    v.visit(this);
  }
}

What does the compiler know at this point, inside the scope of TrainNode's accept? It knows that the static type of this is TrainNode. This is an important additional shred of information that the compiler was not aware of in our caller's scope: there, all it knew about root was that it was a Node. Now the compiler knows that this (root) is not just a Node, but actually a TrainNode. In consequence, the one line found inside accept(), v.visit(this), means something else entirely. The compiler will now look for an overload of visit() that takes a TrainNode. If it can't find one, it'll compile the call to an overload that takes a Node. If neither exists, you'll get a compilation error (unless you have an overload that takes Object). Execution will thus enter what we had intended all along: MyVisitor's implementation of visit(TrainNode node). No casts were needed, and, most importantly, no reflection was needed. Thus, the overhead of this mechanism is rather low: it only consists of pointer dereferences and nothing else.
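Putting the two dispatch steps together, the following is a minimal runnable sketch. It assumes that accept() and visit() return a String so the chosen overload is observable (the original methods return void), and the wrapper class DoubleDispatchDemo is hypothetical.

```java
class DoubleDispatchDemo {
    interface IVisitor {
        String visit(TrainNode node);
        String visit(CarNode node);
    }

    static abstract class Node {
        // First dispatch: virtual call on the node's dynamic type.
        abstract String accept(IVisitor v);
    }

    static class TrainNode extends Node {
        @Override
        String accept(IVisitor v) {
            return v.visit(this);  // here the static type of 'this' is TrainNode
        }
    }

    static class CarNode extends Node {
        @Override
        String accept(IVisitor v) {
            return v.visit(this);  // here the static type of 'this' is CarNode
        }
    }

    static class MyVisitor implements IVisitor {
        @Override public String visit(TrainNode node) { return "visit(TrainNode)"; }
        @Override public String visit(CarNode node)   { return "visit(CarNode)"; }
    }

    // The caller only ever sees the static type Node.
    static String dispatch(Node root) {
        return root.accept(new MyVisitor());
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new TrainNode()));
        System.out.println(dispatch(new CarNode()));
    }
}
```

The body of accept() is textually identical in both subclasses, but it has to be repeated in each one because the overload is chosen from the static type of this at the place where the line is compiled.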

You're right in your question -- we can use a cast and get the correct behavior. However, often, we don't even know what concrete type a given Node is. Take the case of the following hierarchy:

abstract class Node { ... }
abstract class BinaryNode extends Node { Node left, right; }
class AdditionNode extends BinaryNode { }
class MultiplicationNode extends BinaryNode { }
class LiteralNode extends Node { int value; }

Suppose we were writing a simple compiler which parses a source file and produces an object hierarchy that conforms to the specification above. An interpreter for that hierarchy, implemented as a Visitor, could look like this:

class Interpreter implements IVisitor<Integer> {
  // Java generics cannot take primitive type arguments, so Integer is used instead of int.
  public Integer visit(AdditionNode n) {
    int left = n.left.accept(this);
    int right = n.right.accept(this);
    return left + right;
  }
  public Integer visit(MultiplicationNode n) {
    int left = n.left.accept(this);
    int right = n.right.accept(this);
    return left * right;
  }
  public Integer visit(LiteralNode n) {
    return n.value;
  }
}

Casting wouldn't get us very far, since we don't know the types of left or right in the visit() methods. Our parser would most likely also just return an object of type Node which pointed at the root of the hierarchy as well, so we can't cast that safely either. So our simple interpreter can look like:

Node program = parse(args[0]);
int result = program.accept(new Interpreter());
System.out.println("Output: " + result);
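For reference, here is one way the pieces above could fit together into something that compiles and runs. This is a sketch under assumptions: the generic IVisitor<T> interface, the constructors, and the wrapper class InterpreterDemo are filled in for illustration, and the tree is built by hand because parse() is not defined in the original.

```java
class InterpreterDemo {
    interface IVisitor<T> {
        T visit(AdditionNode n);
        T visit(MultiplicationNode n);
        T visit(LiteralNode n);
    }

    static abstract class Node {
        abstract <T> T accept(IVisitor<T> v);
    }

    static abstract class BinaryNode extends Node {
        final Node left, right;
        BinaryNode(Node left, Node right) { this.left = left; this.right = right; }
    }

    static class AdditionNode extends BinaryNode {
        AdditionNode(Node l, Node r) { super(l, r); }
        @Override <T> T accept(IVisitor<T> v) { return v.visit(this); }
    }

    static class MultiplicationNode extends BinaryNode {
        MultiplicationNode(Node l, Node r) { super(l, r); }
        @Override <T> T accept(IVisitor<T> v) { return v.visit(this); }
    }

    static class LiteralNode extends Node {
        final int value;
        LiteralNode(int value) { this.value = value; }
        @Override <T> T accept(IVisitor<T> v) { return v.visit(this); }
    }

    static class Interpreter implements IVisitor<Integer> {
        // left and right are plain Nodes; accept() routes each one to the right overload.
        @Override public Integer visit(AdditionNode n) {
            return n.left.accept(this) + n.right.accept(this);
        }
        @Override public Integer visit(MultiplicationNode n) {
            return n.left.accept(this) * n.right.accept(this);
        }
        @Override public Integer visit(LiteralNode n) {
            return n.value;
        }
    }

    // Hand-built tree for (2 + 3) * 4, standing in for the output of a parser.
    static int evaluateSample() {
        Node program = new MultiplicationNode(
                new AdditionNode(new LiteralNode(2), new LiteralNode(3)),
                new LiteralNode(4));
        return program.accept(new Interpreter());
    }

    public static void main(String[] args) {
        System.out.println("Output: " + evaluateSample()); // prints "Output: 20"
    }
}
```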

The visitor pattern allows us to do something very powerful: given an object hierarchy, it allows us to create modular operations that operate over the hierarchy without needing to put the code in the hierarchy's classes themselves. The visitor pattern is used widely, for example, in compiler construction. Given the syntax tree of a particular program, many visitors are written that operate on that tree: type checking, optimizations, and machine code emission are all usually implemented as different visitors. In the case of the optimization visitor, it can even output a new syntax tree given the input tree.

It has its drawbacks, of course: if we add a new type into the hierarchy, we need to also add a visit() method for that new type into the IVisitor interface, and create stub (or full) implementations in all of our visitors. We also need to add the accept() method to the new type, for the reasons described above. If performance doesn't mean that much to you, there are solutions for writing visitors without needing accept(), but they normally involve reflection and thus can incur quite a large overhead.
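To give an idea of what the reflection-based alternative might look like, here is a rough sketch using java.lang.reflect. The class names and the dispatch() helper are hypothetical, and this is only one simple way to do it: the lookup matches the exact runtime class, so a subclass without its own visit() overload falls through to the "unhandled" branch.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class ReflectiveVisitorDemo {
    // Note: the nodes have no accept() method at all.
    static class Node { }
    static class TrainNode extends Node { }
    static class CarNode extends Node { }

    static class ReflectiveVisitor {
        // getMethod() only finds public methods, so the overloads must be public.
        public String visit(TrainNode n) { return "train"; }
        public String visit(CarNode n)   { return "car"; }

        String dispatch(Node node) {
            try {
                // Look up visit(<runtime class of node>) at runtime instead of compile time.
                Method m = getClass().getMethod("visit", node.getClass());
                return (String) m.invoke(this, node);
            } catch (NoSuchMethodException e) {
                return "unhandled " + node.getClass().getSimpleName();
            } catch (IllegalAccessException | InvocationTargetException e) {
                throw new RuntimeException(e);
            }
        }
    }

    public static void main(String[] args) {
        ReflectiveVisitor v = new ReflectiveVisitor();
        Node root = new TrainNode();           // static type Node
        System.out.println(v.dispatch(root));  // resolved via reflection on TrainNode
        System.out.println(v.dispatch(new Node()));
    }
}
```

Every call pays for a method lookup and a reflective invocation, which is the overhead the paragraph above refers to.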

Solution 2 - Design Patterns

Of course that would be silly if that was the only way that Accept is implemented.

But it is not.

For example, visitors are really really useful when dealing with hierarchies, in which case the implementation of a non-terminal node might be something like this:

interface IAcceptVisitor<T> {
  void Accept(IVisit<T> visitor);
}
class HierarchyNode : IAcceptVisitor<HierarchyNode> {
  public void Accept(IVisit<HierarchyNode> visitor) {
    // Apply the visitor to this node, then navigate to the children.
    visitor.Visit(this);
    foreach(var n in this.children)
      n.Accept(visitor);
  }

  private IEnumerable<HierarchyNode> children;
  ....
}

You see? What you describe as stupid is the solution for traversing hierarchies.

Here is a much longer and in depth article that made me understand visitor.

Edit: To clarify: The visitor's Visit method contains logic to be applied to a node. The node's Accept method contains logic on how to navigate to adjacent nodes. The case where you only double dispatch is a special case where there are simply no adjacent nodes to navigate to.
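The same division of labor can be sketched in Java: the visitor holds the per-node logic, and accept() holds the navigation. The name field, the add() helper, and the wrapper class TraversalDemo are assumptions added so the example can run.

```java
import java.util.ArrayList;
import java.util.List;

class TraversalDemo {
    interface IVisit<T> {
        void visit(T node);
    }

    static class HierarchyNode {
        final String name;
        private final List<HierarchyNode> children = new ArrayList<>();

        HierarchyNode(String name) { this.name = name; }

        HierarchyNode add(HierarchyNode child) {
            children.add(child);
            return this;
        }

        // Navigation lives here: visit this node, then recurse into the children.
        void accept(IVisit<HierarchyNode> visitor) {
            visitor.visit(this);
            for (HierarchyNode n : children) {
                n.accept(visitor);
            }
        }
    }

    // The per-node logic lives in the visitor; here it just records names.
    static List<String> collectNames(HierarchyNode root) {
        List<String> names = new ArrayList<>();
        root.accept(node -> names.add(node.name));
        return names;
    }

    static HierarchyNode sampleTree() {
        return new HierarchyNode("root")
                .add(new HierarchyNode("a").add(new HierarchyNode("a1")))
                .add(new HierarchyNode("b"));
    }

    public static void main(String[] args) {
        System.out.println(collectNames(sampleTree())); // prints [root, a, a1, b]
    }
}
```

The visitor never touches the children list, so the traversal order can change inside accept() without any visitor being rewritten.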

Solution 3 - Design Patterns

The purpose of the Visitor pattern is to ensure that objects know when the visitor is finished with them and has departed, so the classes can perform any necessary cleanup afterward. It also allows classes to expose their internals "temporarily" as 'ref' parameters, and know that the internals will no longer be exposed once the visitor is gone. Classes which do neither of these things may not benefit from the visitor pattern, but code which is written to use the visitor pattern will be usable with future classes that may require cleanup after access.

For example, suppose one has a data structure holding many strings that should be updated atomically, but the class holding the data structure doesn't know precisely what types of atomic updates should be performed (e.g. if one thread wants to replace all occurrences of "X", while another thread wants to replace any sequence of digits with a sequence that is numerically one higher, both threads' operations should succeed; if each thread simply read out a string, performed its updates, and wrote it back, the second thread to write back its string would overwrite the first). One way to accomplish this would be to have each thread acquire a lock, perform its operation, and release the lock. Unfortunately, if locks are exposed in that way, the data structure would have no way of preventing someone from acquiring a lock and never releasing it.

The Visitor pattern offers (at least) three approaches to avoid that problem:

  1. It can lock a record, call the supplied function, and then unlock the record; the record could be locked forever if the supplied function falls into an endless loop, but if the supplied function returns or throws an exception, the record will be unlocked (it may be reasonable to mark the record invalid if the function throws an exception; leaving it locked is probably not a good idea). Note that if the called function attempts to acquire other locks, deadlock could result.
  2. On some platforms, it can pass a storage location holding the string as a 'ref' parameter. That function could then copy the string, compute a new string based upon the copied string, attempt to CompareExchange the old string to the new one, and repeat the whole process if the CompareExchange fails.
  3. It can make a copy of the string, call the supplied function on the string, then use CompareExchange itself to attempt to update the original, and repeat the whole process if the CompareExchange fails.

Without the visitor pattern, performing atomic updates would require exposing locks and risking failure if calling software fails to follow a strict locking/unlocking protocol. With the Visitor pattern, atomic updates can be done relatively safely.
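Approaches 1 and 3 can be sketched in Java as follows. This is an illustration under assumptions: the answer speaks of .NET-style CompareExchange and 'ref' parameters, so AtomicReference.compareAndSet is used here as the closest Java analogue, and the class names LockedRecord and CasRecord are invented for the example. Approach 2 is omitted because Java has no 'ref' parameters.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.UnaryOperator;

class AtomicUpdateDemo {
    // Approach 1: the record owns the lock and never hands it out.
    static class LockedRecord {
        private final ReentrantLock lock = new ReentrantLock();
        private String value;

        LockedRecord(String initial) { value = initial; }

        void update(UnaryOperator<String> visitor) {
            lock.lock();
            try {
                value = visitor.apply(value);
            } finally {
                lock.unlock();  // released on normal return and on exception
            }
        }

        String get() {
            lock.lock();
            try { return value; } finally { lock.unlock(); }
        }
    }

    // Approach 3: read a snapshot, compute, then compare-and-set; retry on conflict.
    static class CasRecord {
        private final AtomicReference<String> value;

        CasRecord(String initial) { value = new AtomicReference<>(initial); }

        void update(UnaryOperator<String> visitor) {
            String oldValue, newValue;
            do {
                oldValue = value.get();
                newValue = visitor.apply(oldValue);  // may run more than once
            } while (!value.compareAndSet(oldValue, newValue));
        }

        String get() { return value.get(); }
    }

    public static void main(String[] args) {
        CasRecord r = new CasRecord("aXb");
        r.update(s -> s.replace("X", "Y"));
        System.out.println(r.get()); // prints "aYb"
    }
}
```

In the compare-and-set variant the supplied function can be invoked several times under contention, so it should be free of side effects. Java's AtomicReference.updateAndGet performs the same retry loop internally.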

Solution 4 - Design Patterns

The classes that require modification must all implement the 'accept' method. Clients call this accept method to perform some new action on that family of classes, thereby extending their functionality. Clients are able to use this one accept method to perform a wide range of new actions by passing in a different visitor class for each specific action. A visitor class contains multiple overloaded visit methods defining how to achieve that same specific action for every class within the family. These visit methods get passed an instance on which to work.

Visitors are useful if you are frequently adding, altering or removing functionality on a stable family of classes, because each item of functionality is defined separately in its own visitor class and the classes themselves do not need changing. If the family of classes is not stable then the visitor pattern may be of less use, because many visitors need changing each time a class is added or removed.
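A short sketch of that trade-off, assuming a hypothetical family of two shapes: each new piece of functionality is one new visitor class, while Circle and Square stay untouched. All names here are invented for illustration.

```java
class NewOperationDemo {
    interface ShapeVisitor<T> {
        T visit(Circle c);
        T visit(Square s);
    }

    interface Shape {
        <T> T accept(ShapeVisitor<T> v);
    }

    // The stable family: written once, with a single accept method each.
    static class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override public <T> T accept(ShapeVisitor<T> v) { return v.visit(this); }
    }

    static class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        @Override public <T> T accept(ShapeVisitor<T> v) { return v.visit(this); }
    }

    // First action: compute the area.
    static class AreaVisitor implements ShapeVisitor<Double> {
        @Override public Double visit(Circle c) { return Math.PI * c.radius * c.radius; }
        @Override public Double visit(Square s) { return s.side * s.side; }
    }

    // Second action, added later without editing Circle or Square.
    static class NameVisitor implements ShapeVisitor<String> {
        @Override public String visit(Circle c) { return "circle"; }
        @Override public String visit(Square s) { return "square"; }
    }

    public static void main(String[] args) {
        Shape shape = new Square(2.0);
        System.out.println(shape.accept(new NameVisitor()) + " with area "
                + shape.accept(new AreaVisitor()));
    }
}
```

Adding a Triangle class, by contrast, would force a new visit(Triangle) method into ShapeVisitor and into every existing visitor.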

Solution 5 - Design Patterns

A good example is in source code compilation:

interface CompilingVisitor {
   void build(SourceFile source);
}

Clients can implement a JavaBuilder, RubyBuilder, XMLValidator, etc. and the implementation for collecting and visiting all the source files in a project does not need to change.

This would be a bad pattern if you have separate classes for each source file type:

interface CompilingVisitor {
   void build(JavaSourceFile source);
   void build(RubySourceFile source);
   void build(XMLSourceFile source);
}

It comes down to context and what parts of the system you want to be extensible.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Val | View Question on Stackoverflow
Solution 1 - Design Patterns | atanamir | View Answer on Stackoverflow
Solution 2 - Design Patterns | George Mauer | View Answer on Stackoverflow
Solution 3 - Design Patterns | supercat | View Answer on Stackoverflow
Solution 4 - Design Patterns | andrew pate | View Answer on Stackoverflow
Solution 5 - Design Patterns | Garrett Hall | View Answer on Stackoverflow