17000

16000

15000

14000

nodes

13000

12000

11000

10000

9000

Adapt_nopre Adapt

0

0.2

0.4 0.6 materialized areas

0.8

1

Figure 11: Pre-computed areas vs Expanded Nodes. Depth: 2

8.5

# Road Network Size

In this experiment we compare query processing efficiency for 3 road networks in the United States of different sizes: San Joaquin (sj) with 18,496 nodes and 24,123 edges, San Francisco (sf) with 175,343 nodes and 223,606 edges, and Illinois (il) with 831,524 nodes and 1,048,080 edges. All the maps where partitioned using the top two levels of roads. Average path length was 50% of map diameter. 20% of areas had a single path upgraded.

We can see in Figure 12 that the adaptive algorithm has excellent scalability in terms of road network size. The rea- son is that the number of nodes usually grow much slower than the number of small roads, and thus our algorithm is able to significantly restrict the search space to a manage- able size even in the presence of millions of nodes and edges. This result is encouraging, as it indicates that the algorithm can handle route planning in very large maps quickly, a fac- tor that is essential when we consider that the number of queries that such system needs to handle is in the billions.

450000

400000

A* Hier Adapt

350000

300000

nodes

250000

200000

150000

100000

50000

0

sj

sf graph size

il

Figure 12: Graph size vs Expanded Nodes. Depth: 2

9.

# RELATED WORK

Shortest path computation is an area that has received extensive research attention for almost half a century. There are so many results in this area that we can only highlight a few selected publications to put our work into perspective. A good survey of shortest path methods is found in [22, 11].

804

Exact algorithms for static graphs have used the idea of edge hierarchies to reduce computation time. [19, 18] and more recently [26, 5] have looked at the techniques to decom- pose the graph using hierarchies and pre-computed selected paths. The hierarchy defined by these methods is similar in spirit to ours, but theirs is usually graph theoretic and not based on the physical size of roads. Approximate algorithms have also considered the idea of graph partitioning. [4] uses large roads to partition the graph, but they do it manu- ally, and no driving or speed patterns are considered. Other graph-theoretic algorithms have focused on improving the heuristics used by A∗. [6] is a recent example of work along this direction. It uses the concept of landmarks to improve route cost estimation. These algorithms are a complement to our method, we could directly use them to estimate h(n) and improve performance.

Shortest path computation on dynamic graphs has two in- terpretations. The first is graphs where edges are updated, deleted, and added. Under this interpretation the idea of most methods is to apply a static shortest path algorithm such as Dijsktra’s [7] only to the subset of the graph where fastest paths change after the upgrade. [10, 6] compare the performance of a few of the most popular single source dy- namic shortest path methods. [21] presents an all-pair dy- namic fastest path algorithm. The second interpretation is that edge speed is a function of time. [20] looks at this problem, they define the CapeCod pattern to match time of day to edge speed. Their algorithm is an adaptation of A*: Instead of sorting the priority queue on a scalar cost, they maintain a function of cost based on starting time. The focus of this work is to provide the complete set of fastest paths for a given time interval. This technique can be used by our algorithm in performing path pre-computation at the area level. [2] uses a statistical approach based on clustering to compute rules useful in predicting fastest paths. We are different to this work in that our approach to route finding is search, not prediction.

We make use of different data mining techniques in or- der to derive driving and speed patterns. Frequent driving patterns can be computed using frequent pattern algorithms such as [1, 15], or sequential pattern mining algorithms such as [23]. Speed patterns are computed using an efficient deci- sion tree [25] induction algorithm such as [12]. Speed factors can be clustered using a number of algorithms [17]. And the treatment of time, and other continuous attributes in speed pattern induction can be handled through the methods pre- sented in [24, 8].

Finally, our traffic database, especially when individual vehicle tracking information is available, is similar to path databases studied in the management of RFID data [13, 14]. But these models are based on the concept of bulky object movements and predictable flow patterns, which for the case of cars in a road network are not directly applicable.

10.

# CONCLUSIONS

We developed an adaptive fastest path algorithm, that bases routing decision on driving and speed patterns mined from historical data. This is a radical departure from tradi- tional algorithms that have focused only on speed and Eu- clidean distance considerations. The routes computed by our algorithm are not only fast given a set of driving con- ditions but also reflect observed driving preferences. This is in sharp contrast to existing algorithms that may send a