to filter the run traces—leaving only instructions executed by the “Password” module.
To analyze the effectiveness of the ordering (control flow) obfuscation, statistics
on the differences between three different run traces were gathered using a modification
of Levenshtein Distance (LD), a generalization of Hamming Distance, to compute the
edit-distance—the number of assembly instruction insertions, deletions, or substitutions
needed to transform one trace into the other; we've modified LD to consider each
instruction instead of each character in the run traces. Fig. 7.4 illustrates the significant
differences that exist between the traces at the point of the obfuscated trial limitation
check. The randomized control-flow obfuscation causes significant differences in
subsequent executions of the trial limitation check—hopefully creating enough of a
deterrent for a reverse engineer by hampering live and static analysis efforts. Table 7.9
contains the statistical data that was gathered for the analysis.
A C++ implementation of Levenshtein Distance, written for this solution, can be
downloaded from http://reversingproject.info/repository.php?fileID=7_6_1. Note that
computing the edit-distance between two large files of any type can take many hours a
modern PC. For reference, the average size of three traces analyzed in this section is
10MB, and to compute the edit-distance between two of them required an average of ~20
hours of CPU time on an Intel Pentium 1.6GHz Dual-core processor. The LD
implementation employed in this analysis uses a dynamic-programming approach that
requires O(m) space; note that some reference implementations of LD require O(mn)
space since they use a (m + 1) x (n + 1) matrix which is impractical for large files .