Opcode-based equality heuristics: a number of simple tests are easily recognizable at the bytecode level (e.g. ifnull/ifnonnull and ifeq/ifne), and are expected to allow predicting the outcome:
i) Null test for reference types: testing if a reference is equal to null is deemed to usually fail;
ii) Zero test and equality test: testing if a value of a primitive type (boolean, integer, double, etc) is equal to zero is deemed to usually fail, as well as the equality test between two primitive types or two references. 12
Successors-based heuristics: depending on properties of the clos- est successor nodes, we want to determine if the branch will be taken or not.
if-then without an else structures: the then part is assumed to perform some special-case processing, and is therefore predicted not to be taken.
direct successors containing a throw statement: they are of course expected not to be taken, since they raise an exception.
direct successors containing a return statement: they are also intuitively deemed to be treating special cases, such as to terminate a recursive iteration.
We implemented these heuristics and inserted additional probes into the transformed classes, in order to collect statistics on the quality of the predictions throughout the whole SPEC JVM98 benchmark suite.
We decided that the heuristics are to be evaluated in turn, starting with those for which our statistics show the highest probability of success (or hit), respectively the lowest miss rate, as detailed in the following. We chose to consider them as exclusive, i.e. as soon as a heuristic decides that a given successor node of the current branching node shall be taken, the algorithm stops there.
We found that the best sequence of heuristics, which yielded an overall miss rate of 28%, was the following:
1. Loop branches and loop entry branches are well-known to provide a hit with good confidence . In our setting, they had a total miss rate of 23%, which is not quite as good as expected.
12 This aggregation of different cases inside the same heuristic is due to the nature of the JVM instruction set, which has several uses for the ifeq/ifne opcodes. Therefore, a finer analysis would be needed to distinguish between the different cases.