SUN ET AL.: ON-ROAD VEHICLE DETECTION: A REVIEW
5.1.7 Vehicle Lights
Most of the cues discussed above are of little help for nighttime vehicle detection: it would be difficult or impossible to detect shadows, horizontal/vertical edges, or corners in images captured at night. A salient visual feature at night is the vehicle lights. Cucchiara and Piccardi used morphological analysis to detect vehicle light pairs in a narrow inspection area. The morphological operator also took the shape, size, and minimal distance between vehicles into account to generate hypotheses.
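To illustrate this style of processing, the sketch below thresholds bright pixels, groups them into blobs with a flood fill, and pairs blobs of similar size lying on roughly the same image row. The threshold and pairing tolerances are hypothetical and not taken from the cited work, which relies on morphological operators instead.

```python
import numpy as np
from collections import deque

def find_light_pairs(gray, thresh=200, max_dy=5, min_dx=20):
    """Sketch of nighttime light-pair hypothesis generation: threshold
    bright pixels, group them into 4-connected blobs, then pair blobs of
    similar area lying on roughly the same image row."""
    bright = gray > thresh
    seen = np.zeros_like(bright, dtype=bool)
    blobs = []  # (centroid_y, centroid_x, area)
    H, W = bright.shape
    for y in range(H):
        for x in range(W):
            if bright[y, x] and not seen[y, x]:
                q, pts = deque([(y, x)]), []
                seen[y, x] = True
                while q:  # 4-connected flood fill
                    cy, cx = q.popleft()
                    pts.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < H and 0 <= nx < W and bright[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                ys, xs = zip(*pts)
                blobs.append((sum(ys) / len(pts), sum(xs) / len(pts), len(pts)))
    pairs = []
    for i in range(len(blobs)):
        for j in range(i + 1, len(blobs)):
            (y1, x1, a1), (y2, x2, a2) = blobs[i], blobs[j]
            # same row, horizontally separated, comparable size
            if abs(y1 - y2) <= max_dy and abs(x1 - x2) >= min_dx and 0.5 <= a1 / a2 <= 2.0:
                pairs.append((blobs[i], blobs[j]))
    return pairs
```

Each returned pair of blob centroids is a candidate headlight or taillight pair and would still need to be verified by a later stage.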
5.2 Stereo-Vision-Based Methods
There are two types of methods that use stereo information for vehicle detection. One uses the disparity map, while the other uses an antiperspective transformation, Inverse Perspective Mapping (IPM). In both cases, we assume that the camera parameters have already been computed through calibration.
5.2.1 Disparity Map
The difference in position between corresponding pixels in the left and right images is called disparity. The disparities of all the image points form the disparity map. If the parameters of the stereo rig are known, the disparity map can be converted into a 3D map of the viewed scene. Computing the disparity map is very time consuming because the correspondence problem must be solved for every pixel; however, it is possible to do this in real time using a Pentium-class processor or embedded hardware. Once the disparity map is available, all the pixels within a depth of interest, defined by a disparity interval, are accumulated in a disparity histogram. If an obstacle is present within the depth of interest, a peak will occur at the corresponding histogram bin (an idea similar to the Hough transform).
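The histogram step can be sketched as follows; the one-disparity-per-bin layout and the peak criterion (a bin holding a fixed fraction of the in-range pixels) are illustrative assumptions, not details from a specific cited system.

```python
import numpy as np

def disparity_histogram_peaks(disparity_map, d_min, d_max, peak_ratio=0.2):
    """Accumulate disparities within a depth of interest into a histogram
    and report bins whose counts suggest an obstacle (illustrative)."""
    # keep only pixels whose disparity falls in the interval of interest
    d = disparity_map[(disparity_map >= d_min) & (disparity_map <= d_max)]
    # one bin per integer disparity value in the interval
    hist, edges = np.histogram(d, bins=np.arange(d_min, d_max + 2))
    # a "peak" is a bin holding a large fraction of the in-range pixels
    threshold = peak_ratio * d.size
    peaks = [int(edges[i]) for i, c in enumerate(hist) if c > threshold]
    return hist, peaks
```

A frontal obstacle occupies many pixels at nearly the same depth, so its pixels pile up in one bin, much as collinear points vote for one cell in a Hough accumulator.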
In , it was argued that area-based approaches to the correspondence problem were too computationally expensive, while disparity maps from feature-based methods were not dense enough. A local feature extractor (i.e., "structure classification") was proposed to solve the correspondence problem faster. According to this approach, each pixel was classified into one of several categories (e.g., vertical edge pixels, horizontal edge pixels, corner edge pixels, etc.) based on the intensity differences between the pixel and its four direct neighbors. To simplify finding pixel correspondences, the optical axes of the stereo rig were aligned in parallel (i.e., corresponding points lie on the same row in each image). Accordingly, the search for corresponding pixels was reduced to a simple test: whether or not two pixels belong to the same category. Obviously, there are cases where this approach does not yield unique correspondences. To address this problem, they further classified the pixels by their associated disparities into several bins by constructing a disparity histogram. The number of significant peaks in the histogram indicated how many possible objects were present in the images.
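A minimal sketch of the idea follows, using a hypothetical three-way coding (darker/similar/brighter) of the four neighbor differences; the actual categories in the cited work may differ, but the matching step is the same category-equality test along epipolar rows.

```python
import numpy as np

def classify_pixels(img, t=10):
    """Assign each interior pixel a category code derived from the signed
    intensity differences to its four direct neighbors (hypothetical
    3-way coding: 0 = similar, 1 = neighbor brighter, 2 = neighbor darker)."""
    img = img.astype(np.int32)
    H, W = img.shape
    cats = np.zeros((H, W), dtype=np.int32)
    # neighbor offsets: up, down, left, right
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        diff = np.zeros_like(img)
        diff[1:-1, 1:-1] = img[1:-1, 1:-1] - img[1 + dy:H - 1 + dy, 1 + dx:W - 1 + dx]
        sign = np.where(diff > t, 2, np.where(diff < -t, 1, 0))
        cats = cats * 3 + sign  # pack the four ternary digits into one code
    return cats

def match_row(cats_left, cats_right, r, c, d_max=32):
    """Candidate disparities for left pixel (r, c): right-image pixels on
    the same row (rectified cameras) with an equal category code."""
    target = cats_left[r, c]
    return [d for d in range(1, d_max + 1)
            if c - d >= 0 and cats_right[r, c - d] == target]
```

Flat regions all share code 0, which is exactly the non-uniqueness the text mentions; the disparity histogram then disambiguates at the object level rather than per pixel.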
5.2.2 Inverse Perspective Mapping
The term "Inverse Perspective Mapping" does not correspond to an actual inversion of perspective mapping, which is mathematically impossible. Rather, it denotes an inversion under the additional constraint that inversely mapped points lie on the horizontal plane. If we consider a point p in 3D space, perspective mapping implies a line
Fig. 7. Geometry of perspective mapping.
passing through this point and the center of projection N (see Fig. 7). To find the image of the point, we intersect this line with the image plane. IPM is defined by the following procedure: for a point pI0 in the image, we trace the associated ray through N toward the horizontal plane. The intersection of the ray with the horizontal plane is the result of the inverse perspective mapping applied to the image point pI0. If we compose perspective and inverse perspective mapping, the horizontal plane is mapped onto itself, while elevated parts of the scene appear distorted.
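The procedure can be written down for a simple pinhole camera. The sketch below assumes a camera at height h above a flat road, pitched down by angle theta, with focal length f (in pixels) and principal point (cx, cy); these parameters and the frame conventions (X right, Y forward, Z up) are illustrative, not from the text.

```python
import numpy as np

def ipm_point(u, v, f, cx, cy, h, theta):
    """Trace the ray through image pixel (u, v) from the projection center
    and intersect it with the ground plane Z = 0; returns road-plane
    coordinates (X, Y), or None if the ray points at or above the horizon."""
    # ray direction in the camera frame (x right, y down, z along optical axis)
    dx, dy, dz = (u - cx) / f, (v - cy) / f, 1.0
    # vertical component of the ray in the world frame (camera pitched down)
    dZ = -(np.cos(theta) * dy + np.sin(theta) * dz)
    if dZ >= 0:      # ray does not descend toward the road
        return None
    t = h / -dZ      # ray parameter where Z(t) = h + t*dZ reaches 0
    X = t * dx
    Y = t * (np.cos(theta) * dz - np.sin(theta) * dy)
    return X, Y
```

Note that rays at or above the horizon never meet the ground plane, which is why IPM is only defined under the flat-road constraint.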
Assuming a flat road, Zhao and Yuta used stereo vision to predict the image seen from the right camera, given the left image, using IPM. Specifically, they used IPM to transform every point in the left image to world coordinates and reprojected these points back onto the right image plane; the predicted right image was then compared against the actual one. In this way, they were able to find the contours of objects rising above the ground plane. Instead of warping one image onto the other, Bertozzi and Broggi computed the IPM of both the right and left images and took the difference between the two remapped images. Under the flat-road assumption, anything elevated above the road was detected by looking for large clusters of nonzero pixels in the difference image. In the ideal case, the difference image contains two triangles for each obstacle, corresponding to the obstacle's left and right boundaries (see Fig. 8e); except for the pixels on these boundaries, all other pixels are the same in the left and right remapped images. Locating those triangles, however, was very difficult due to the texture, irregular shape, and nonhomogeneous brightness of obstacles. To deal with these issues, they used a polar histogram to detect the triangles: given a point on the road plane, the polar histogram was computed by scanning the difference image and counting the number of over-threshold pixels along every straight line originating from that point. Knoeppel et al. clustered the elevated 3D points based on their distance from the ground plane to generate hypotheses. Each hypothesis was tracked over time and further verified using Kalman filters. This system assumed that the dynamic behavior of the host vehicle was known and stored the path information in a dynamic map. It was able to detect vehicles up to 150 m away under normal daytime weather conditions.
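The polar-histogram step can be sketched as follows. The angular resolution, sampling density, and pixel threshold below are illustrative assumptions; angles at which the histogram peaks correspond to the boundary lines of the triangles in the difference image.

```python
import numpy as np

def polar_histogram(diff, focus, n_angles=180, thresh=30, n_samples=200):
    """Scan the difference image along straight lines radiating from a
    focus point on the road plane, counting over-threshold pixels per
    angle; peaks indicate the triangular residues of obstacle boundaries."""
    H, W = diff.shape
    fy, fx = focus
    hist = np.zeros(n_angles)
    radius = max(H, W)
    for i, ang in enumerate(np.linspace(0, np.pi, n_angles, endpoint=False)):
        # sample points along the ray from the focus at angle `ang`
        r = np.linspace(0, radius, n_samples)
        ys = (fy - r * np.sin(ang)).astype(int)
        xs = (fx + r * np.cos(ang)).astype(int)
        ok = (ys >= 0) & (ys < H) & (xs >= 0) & (xs < W)
        hist[i] = np.count_nonzero(diff[ys[ok], xs[ok]] > thresh)
    return hist
```

A near-vertical obstacle boundary above the focus point lines up with one scan direction, so most of its over-threshold pixels fall into a single angular bin.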