we believe these factors can be essential in selecting desirable driving routes in addition to being fast.
Table 1: Speed Pattern for a Particular Edge
recent incorporation of speed conditions into major route planning software from Google and Microsoft. Information on speed-affecting factors such as weather and road con- struction are readily available for most areas of the United States; and other factors such as accident data are expected to become available in the near future.
Definition 2.3. A driving pattern is a sequence s of edges e(1), e(2), . . . , e(l) that appears more than min sup times in the path database, and that is a valid path in the road net- work graph G(V, E). We define support(s) as the number of paths that contain the sequence s. We define the length of the sequence, length(s), as the number of edges that it contains.
Driving patterns are edge sequences that are frequently traversed by drivers. A path database is a set of trajectories, one per driving session. Currently, the availability of path databases is quite limited, but there is a trend for the usage of sophisticated tracking mechanisms, such as RFID enabled tags in cars that can be read by toll ways’ tag readers, road sensors, GPS devices, and cameras capable of identifying cars by their license plates. Currently, we do have edge-level traffic information available, and such data can be used to mine frequent driving patterns of length one, which can be quite useful in finding fast roads (at the edge level) that are consistently taken by drivers.
Definition 2.4. An edge forecast model F (edge id, t), re- turns a tuple (d1, d2, . . . , dk) with the expected driving con- ditions for edge edge id at time t.
The forecast model is a way for the route planner to esti- mate driving conditions at different edges in the graph. This is analogous, for example, to looking at the wether predic- tion for the day, before taking an extended trip to plan the route accordingly. An example of the forecast function may be: “At 5 pm [time], for highway 74 between Champaign and Normal [edge], Weather = rain, and Construction = no [conditions]”.
With the above definitions available, we are ready to state
our problem: Problem Statement.
Given a road network G(V, E),
a set of speed patterns S, an edge forecast model F , and a query q ← (s, e, start time), compute a fast route qr between nodes s and e starting from s at time start time, such that qr contains a large number of frequent driving patterns.
We can see from the problem definition that we are inter- ested in finding fast paths, that are aware of the expected driving conditions for the trip, and that give preference to routes that are historically preferred by drivers. This prob- lem definition encompasses factors beyond what traditional research on fastest path computation has considered. And
Figure 1: San Joaquin Road Network
A traffic tracking system can generate information on the speed conditions for different times of day for each road in the network, such information can be represented as a set of traffic observations of the form (edge id, time, speed), where edge id is an edge, time is the time when the observation took place, and speed is the observed speed. A more so- phisticated system, such as the one used to monitor the San Francisco Bay Area traffic conditions 2, can use radio- frequency tags placed in each car to track the paths tra- versed by individual vehicles. These tags can be the same ones used for automated toll collection, the city would just install readers at many non-toll roads. In this case, each traffic observation will be of the form (car id, edge id, time, speed), where car id is a vehicle identifier, and other val- ues are defined as before. Vehicle-level observations can be sorted on car id and t to generate a path database, where each entry is the sequence of edges traversed by a car dur- ing a driving session. We can use either form of data to determine frequent driving patterns. If only edge-level data is available, we can use the number of edge observations as support, but in this case only frequent length-1 patterns can be mined. If we have vehicle-level data, we can mine longer frequent driving patterns.
In addition to speed information at each edge, we can augment the traffic database with the set of driving condi- tions present during each edge observation. Given a set of a v a i l a b l e d r i v i n g f a c t o r s D 1 , . . . , D n , w e c a n a u g m e n t e a c h t r a ffi c o b s e r v a t i o n w i t h t h e t u p l e ( d 1 , . . . , d n ) w h e r e e a c h is a value for driving factor Di. Table 3 presents an example traffic database in path format, where each path stage is of d i