have ignored edge frequency. But if we consider that all edges of level 3 are infrequent, it should be clear that when we start the search, and when we end the search, the prior- ity queue will contain only nodes that are reached through an infrequent edge and thus we would have to use it. We have also assumed that there are no pre-computed paths; otherwise, the search process is the same, but instead of fol- lowing direct neighbors of the node we also follow indirect neighbors connected through pre-computed paths.

In this example the maximum number of distinct nodes that the algorithm would have to consider is 32 at the start and end areas, 32 at the level 2 areas, 64 at level 1, and 7 nodes along upgraded edges. This is much smaller than the potential nodes examined by A* which in the worst case is the entire network, 784 nodes.

7.2

# Online path re-computation

When we compute a fastest route, the predictor function F is used to estimate driving conditions throughout the en- tire trip, but it is possible that our initial estimate is wrong. For example, if we computed the route using a prediction of good weather for the duration of the trip, but halfway into the trip a snow storm happens, our fastest path may be far from optimal. Another example of changing condi- tions would be an unexpected road closure or an accident. If we plan a complete trip before the starting time using a static model, there is not much that the system can do about these unexpected cases (e.g., when we ask a site such as MapQuest to give us a route before the trip starts). But if we are operating in an online navigation environment, we can do better, as it is possible to recompute the best route as driving conditions change. In this case we can just ap- ply the algorithm presented in the previous section with a starting node changed to the current position, starting time to the current time, and an updated forecast model.

In an online navigation system, having access to an effi- cient route computation algorithm is even more critical than in the static route computation model, and the power of our proposed method would be more evident. Another possible change to our algorithm, when forecast is updated, would be to invalidate our speed model for edges near the vehicle’s current location, as we may know the exact speed for those, and use mined speed patterns only for farther away edges.

8.

# EXPERIMENTAL EVALUATION

In this section we perform a thorough analysis of the adap- tive fastest path finding algorithm proposed in the paper, and compare its performance against a basic A* implemen- tation, and a version of the adaptive of our own algorithm that does not perform area pre-computation. All the exper- iments were conducted on a single core of an AMD Athlon 64 X2 Dual Core processor with 2GB of RAM. The system ran cygwin and gcc 3.4.4.

8.1

# Data Synthesis

In all of our experiments we used real road maps from ar- eas of the United States. We used three maps. San Francisco Bay area (90 by 125 miles) with 175,343 nodes, and 223,606 edges. A map for the entire state of Illinois with 831,524 nodes, and 1,048,080 edges. A smaller map for San Joaquin, CA with 18,496 nodes, and 24,123 edges. We simulated dif- ferent traffic conditions using the Network-based Generator of Moving Objects by Thomas Brinkhoff [3], which is a well

802

known traffic simulator. For each map we ran two simula- tions, one with 10,000 objects used to generate rush hour like speed conditions, and one with 1,000 objects used to sim- ulate non-rush hour speed conditions. In each simulation we defined two object classes, cars with faster speeds, and trucks with slower speeds. We also incorporated a weather factor into the simulation by running the simulation with and without external objects, which in the data generator can be used to slow down certain areas of the road network as if bad weather were occurring. The output of the simu- lation was a list of edge observations of the form edge id, car id, time, weather, speed. The output files were a few hundred megabytes long. Using this information we mine speed patterns for each edge, such patterns involve the di- mensions of time, weather, and vehicle type.

In most of the experiments we compare three methods. The first is an implementation of A∗, we run this algorithm on the entire road network. We maintain a priority queue of paths to expand. At each iteration we select the path with minimal g(n) + h(n), where g(n) is the current cost of the path, and h(n) is the expected cost to the goal. For each path we update g(n) by retrieving the appropriate speed for the edge at the time when it will be taken, and for the type of vehicle for which the route is being planned, and for the forecasted weather. A∗ always finds the fastest possible path, and is thus our baseline for correctness. The second algorithm Hier is our adaptive fastest path algorithm imple- mented without area pre-computation, i.e., no fastest paths have been pre-computed at all, and without considering road upgrades, i.e., small roads through an area that are faster than roads in the area border. The third algorithm, Adapt is the fastest path algorithm proposed in this paper.

8.2

# Query Length

In this set of experiments we vary the distance between the starting node and the ending node of the query. For this experiment we used the San Francisco road network. The road network was partitioned with edges of level 1, and level 2. We artificially upgraded a single random path in 20% of the lowest level areas, and set the speed for edges in upgraded paths higher than the average speed of the border edges of the area. We pre-compute fastest paths in 30% of the lowest level areas. Results are the average of 100 random queries. We varied the average distance between the starting and ending nodes. The longer the distance the larger the search space. Query Length measures the distance of the end points in the query as a percentage of the map diameter.

In Figure 5, we see the number of expanded nodes for the three algorithms. We see that the number of nodes expanded by A∗ grows very rapidly for large paths. This is expected as there are many more possible paths to consider between nodes that are separated by a large number of edges. At the same time we see that Hier and Adapt expand a number of paths that is almost constant. The reason is that both algorithms limit their search to larger roads and only go into individual areas at the start or end of the search. We see that although Adapt has to consider all upgraded edges, it only expands slightly more nodes than Hier which ignores those edges.

Figure 6 presents the travel time for the three methods. We see that A∗ always gives us the fastest path. The path found by Adapt is almost as good as the A∗ path, but at