Automatic Vehicles Detection, Classification and Counting Techniques / Survey

Vehicle detection (VD) plays an essential role in Intelligent Transportation Systems (ITS), which have been studied intensively in recent years. The need for intelligent facilities has expanded because the number of vehicles in urban zones is increasing rapidly. Traffic monitoring is an important element of ITS and involves the detection, classification, tracking, and counting of vehicles. One of the key advantages of video-based traffic detection is that it gives traffic supervisors the means to decrease congestion and improve highway planning. Vehicle detection in videos combines real-time image processing with computerized pattern recognition in flexible stages. Real-time processing is critical for maintaining the proper functionality of automated or continuously working systems. VD in road traffic has numerous applications in the transportation engineering field. In this review, different automated VD systems are surveyed, with a focus on systems in which a stationary rectilinear camera is positioned above intersections in the road rather than being mounted on the vehicle. Generally, three steps are used to acquire traffic condition information: background subtraction (BS), vehicle detection, and vehicle counting. First, we illustrate the concept of vehicle detection and discuss background subtraction for isolating only the moving objects. Then, a variety of algorithms and techniques developed to detect vehicles are discussed, together with their advantages and limitations. Finally, some limitations shared between the systems are presented, such as the need to define a region of interest (ROI), the focus on only one aspect of detection, and the variation of accuracy with video quality. Once vehicles can be detected and classified, it becomes possible to further improve traffic flow and to provide a large amount of information that can be valuable for many future applications.


I. Introduction
The importance of efficient vehicle detection (VD) is increasing with the expansion of road networks and the growing number of vehicles. In recent years, video monitoring has become essential for automatic traffic analysis aimed at solving traffic problems [1]. Traffic analysis may include counting the vehicles in a region per unit of time and classifying them. Usually, VD is based either on an infrared sensor or on a video-based solution [2]. Video-based systems offer further benefits, such as a large amount of traffic information, ease of installation, and scalability through image processing techniques, while avoiding disadvantages such as high cost and the need for calibration and periodic maintenance. Accumulating traffic information, for instance the position and motion of vehicles, is essential for transportation maintenance and the planning of motorway networks. The position and motion of vehicles are required for the system to be able to observe the movement on highway and motorway networks [3].
As a result, high-quality traffic monitoring helps to prevent growing congestion levels, together with the associated environmental pollution, high risk of accidents, and time wasted during transportation [4].
The main objective of this survey is to provide a general overview of VD systems and to review the existing methods for each of their processing steps. The remaining sections are structured as follows: Section II presents vehicle detection systems and their applications. Section III reviews the current methods for background-foreground subtraction. Section IV discusses related works on current VD systems. Section V presents a comparison among all the systems discussed in the previous section. Section VI completes this paper with conclusions.

II. Vehicle Detection
In the last few years in particular, many systems have been developed with the aim of achieving robust and effective VD, which requires accurate measurements of vehicle position and motion.
The detection, classification, and counting of vehicles contribute significantly to traffic stream estimation. VD involves labelling the regions that contain vehicles in the video scene [2]. Overall, VD algorithms can be separated into three main phases: image acquisition, generation of vehicle candidates, and verification of these candidates, as shown in Figure-1 [5].

[Figure-1: the three phases of a VD algorithm: image acquisition, generation of candidates, and verification of candidates]
The necessity of VD derives from the need for traffic monitoring, which includes classifying and counting the categories of vehicles passing through an area. This information can be employed in various applications. VD systems offer wide capabilities, precision, and adaptability across diverse detection tasks, for example, speed estimation, vehicle classification and counting, parked vehicle detection, accident detection, and wrong-side vehicle monitoring, in addition to other classic traffic information such as high occupancy vehicle (HOV) lanes, gaps between vehicles, and queue detection. Any of these systems may contain subsystems, such as traffic flow control, a station gate controller, remote facility monitoring, and a central control room [2].

III. Background Subtraction (BS)
BS is one of the important stages of most computer vision systems and is mainly used to detect moving objects in the sequence of video frames. Each frame is separated into two parts: the foreground, which contains the pixels belonging to the objects of interest, and the background, which contains all pixels that do not belong to an object of interest, such as trees, pavement, buildings, and sky, as shown in Figure-2 [6].
The essential logic of BS is to detect objects from the difference between the current frame and a reference frame, referred to as the reference background, which is created by averaging the images over a specific period of time. When the difference at a pixel is above a threshold, the pixel is classified as foreground. After separating the foreground from the background, morphological operations such as closing, opening, erosion, and dilation can be applied to the foreground pixels to enhance them and remove noise. The reference background is refreshed regularly with new frames to adapt to changes in the dynamic scene [5].
Any background subtraction method must overcome the difficulties that stand in the way of robust and effective vehicle detection. Such difficulties include camera vibration, varying lighting, unclear video, and shadows [7].
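The logic described in this section (an averaged reference background, a per-pixel difference threshold, and morphological clean-up) can be sketched as follows. This is a minimal illustration rather than the method of any surveyed system. It assumes grayscale frames stored as NumPy arrays; the function names, the blending factor `alpha`, the threshold of 30, and the 3x3 structuring element are hypothetical choices made for the example.

```python
import numpy as np


def update_background(background, frame, alpha=0.05):
    """Refresh the reference background by blending in the newest frame."""
    return (1.0 - alpha) * background + alpha * frame


def foreground_mask(background, frame, threshold=30.0):
    """Label a pixel as foreground when it differs from the background by more than the threshold."""
    return np.abs(frame.astype(np.float64) - background) > threshold


def erode(mask):
    """3x3 binary erosion: keep a pixel only if its whole 3x3 neighbourhood is set."""
    h, w = mask.shape
    padded = np.pad(mask, 1, constant_values=False)
    out = np.ones((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out


def dilate(mask):
    """3x3 binary dilation: set a pixel if any pixel in its 3x3 neighbourhood is set."""
    h, w = mask.shape
    padded = np.pad(mask, 1, constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def opening(mask):
    """Erosion followed by dilation: removes small noise but keeps larger blobs."""
    return dilate(erode(mask))


# Synthetic data (assumed for illustration): a flat road, one vehicle, one noisy pixel.
history = [np.full((20, 20), 100.0) for _ in range(10)]
background = np.mean(history, axis=0)      # reference background = average of past frames
current = np.full((20, 20), 100.0)
current[5:10, 5:12] = 200.0                # 5x7 bright "vehicle"
current[15, 15] = 200.0                    # isolated noise pixel
raw_mask = foreground_mask(background, current)
clean_mask = opening(raw_mask)             # noise pixel removed, vehicle blob preserved
background = update_background(background, current)
```

A small `alpha` makes the reference background adapt slowly, so a vehicle that passes through the scene leaves only a faint trace in it, whereas a gradual lighting change is absorbed over time.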

IV. Related works
In this section, we cover previous related works on automatic vehicle detection systems. Various vehicle detection algorithms have been proposed because no single algorithm is capable of dealing with all of the challenges in this area. Several problems affect vehicle detection algorithms and need to be solved. Several methods are described below.

Najm and Ali
Iraqi Journal of Science, 2020, Vol. 61, No. 7, pp: 1811-1822
Lei [9] presented a video-based system for automatic VD and counting. The two primary operations in the proposed system are adaptive background estimation, which allows robust motion detection, particularly in complex scenes, and Gaussian shadow elimination, which is based on the HSV color space and is capable of dealing with shadows of different intensity and size. After these two processes, all frames that contain moving vehicles are obtained, and the counting step is then carried out via a technique named "virtual detector". Figure-3 shows the architecture of all the steps of the proposed system for automatic VD and counting. The system handles several difficult situations comparatively well; for example, shadows and ghosts are removed automatically. However, the system is not able to distinguish vehicle categories (car, motorcycle, and lorry).
Qian et al. [10] proposed a video VD system for estimating the position and size of vehicles based on a combination of Local Binary Pattern (LBP) and motion histogram. Firstly, an LBP texture description was used for background modeling and updating across the frame sequence. Secondly, VD was performed using the motion histogram. Lastly, the shadow in the identified vehicle area was reduced, which increased the accuracy of VD. Experimental results on a vehicle database showed that this system has improved performance. However, the system has some limitations, such as its focus on a simple set of features to detect vehicles, while ignoring complex features. Besides, there is no vehicle classification or counting in this system.
Xu et al. [11] suggested a video-based vehicle detection method that uses prediction with real-time feature learning, linking the ARMA model with the AdaBoost algorithm. The algorithm trains the AdaBoost classifier on HOG and Haar features. The system takes the target's previous information into account and extends the AdaBoost algorithm in the time dimension to improve the accuracy of real-time detection, as shown in Figure-4.
The experimental results showed that building a time-sequence model over the frame sequence improved VD and efficiently reduced the computation time. The AdaBoost + ARMA combination decreased the average processing time per image.
Tursun [12] proposed a computer vision system for detecting and counting vehicles on highways. The system relies on automatic traffic scene capture by a camera placed above the street and calculates the total number of vehicles that pass along the road. An image of the moving vehicles is extracted by the double-difference algorithm, and the vehicles are counted by monitoring their movement inside a tracking area termed the virtual loop. This approach was tested on video surveillance records of a street with a medium traffic volume. The results, obtained under different daytime conditions, showed that the system can count vehicles with high accuracy, but the detection accuracy depends on the viewing angle and the position of the camera.
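The two ideas mentioned above, a double-difference motion mask and counting with a virtual loop, can be sketched together as follows. This is an illustrative sketch and not the implementation of [12]: it assumes the common three-frame form of the double difference (a pixel is moving only if it changed both from the previous frame and to the next frame), and the function names, thresholds, loop coordinates, and synthetic frames are all hypothetical.

```python
import numpy as np


def double_difference(prev_frame, frame, next_frame, threshold=25):
    """Keep pixels that changed both from the previous frame and to the next frame."""
    cur = frame.astype(np.int32)
    changed_before = np.abs(cur - prev_frame.astype(np.int32)) > threshold
    changed_after = np.abs(next_frame.astype(np.int32) - cur) > threshold
    return changed_before & changed_after


def loop_occupied(motion_mask, loop, min_fraction=0.2):
    """Report whether enough moving pixels fall inside the virtual loop."""
    top, bottom, left, right = loop
    return bool(motion_mask[top:bottom, left:right].mean() >= min_fraction)


def count_vehicles(occupancy):
    """Count one vehicle for every empty-to-occupied transition of the loop."""
    count = 0
    previously_occupied = False
    for occupied in occupancy:
        if occupied and not previously_occupied:
            count += 1
        previously_occupied = occupied
    return count


def make_frame(t):
    """Synthetic frame: a 4x10 bright block that moves down by its own height each frame,
    so every pixel it covers changes in both consecutive frame pairs."""
    frame = np.zeros((40, 40), dtype=np.uint8)
    frame[4 * t:4 * t + 4, 15:25] = 200
    return frame


frames = [make_frame(t) for t in range(10)]
virtual_loop = (20, 24, 10, 30)            # (top, bottom, left, right) in pixels
occupancy = [
    loop_occupied(double_difference(frames[t - 1], frames[t], frames[t + 1]), virtual_loop)
    for t in range(1, 9)
]
vehicle_count = count_vehicles(occupancy)
```

Counting transitions rather than occupied frames prevents a slow vehicle that stays inside the loop for several frames from being counted more than once.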
Mahmood [13] suggested an advanced system to improve vehicle security applications, which includes three stages: VD, detection of the driver's face, and face recognition. Vehicle and face detection were accomplished using the Adaptive Boosting (AdaBoost) algorithm and Haar-like features. The system guarantees that only trusted vehicles are permitted to park in a given parking zone. The results showed the feasibility of using the method in any parking zone. Nevertheless, the system detects vehicles only from the front and side views, which is a restriction on the input video.
Arya [14] designed and developed a real-time VD and tracking algorithm that concentrates on the path of the moving objects. The method keeps the set of foreground pixels that can feasibly be vehicles while rejecting the rest. Consequently, the system discards the objects that do not appear to contain vehicles, while combining the candidate vehicles. It also suggests an effective approach for eliminating the undesirable noise that corrupts the foreground zone of a given frame, which increases the effectiveness of the whole VD system in realistic situations. The system was evaluated by comparison with standard systems and showed high quality. However, its performance depends on the features of the input video.
Seenouvong [15] suggested an approach for detecting and counting vehicles based on computer vision techniques. The approach uses a BS process to find foreground vehicles in a video sequence. Then, to detect moving vehicles more accurately, it employs techniques such as thresholding, adaptive morphology, and hole filling. For counting vehicles, the system uses virtual detection in selected zones. The experimental results showed that the accuracy of the system is about 96%. The limitations of this system are as follows. Firstly, the vehicles should be clearly visible and not occluded within the virtual detection area. Secondly, the width of the virtual detection area should be large enough for counting the vehicles. Lastly, this simple system focuses on background subtraction and counting, while ignoring vehicle classification, which is an essential operation in vehicle detection systems.
Kamkar et al. [1] presented a system for the detection, classification, and counting of vehicles for highway traffic monitoring, which detects vehicles using an active basis model (ABM) and verifies them using their reflection symmetry. Counting and classification are based on two extracted features, namely the vehicle length in the corresponding time-spatial image and the correlation computed from the grey-level co-occurrence matrix (GLCM) of the vehicle image inside its bounding box. For classification, a random forest is used, which assigns each vehicle to one of three classes: small (car), medium (van), and large (bus and truck). The experimental results revealed the good performance of the system and its accuracy in real situations with the problems commonly present on highways, such as varying illumination, camera vibration, weather conditions, and shadows. Enhancements such as GPU programming, or adding further texture-based features to the classification stage, were proposed to improve the system's run time and performance.
Zhuang et al. [16] suggested a real-time vehicle detection algorithm that relies on enhanced Haar-like features and combines a cascade of classifiers with motion detection. It adopts a visual background extractor, supplemented by a morphological process, to acquire the foreground. This foreground retains the features of the vehicles and supplies the locations inside the frames where vehicles are likely to be found. After that, image dilation is used to obtain better foreground images. Vehicle detection is performed only in specific areas of the video, called regions of interest (ROI), rather than by scanning the complete frame. The system uses a cascade of strong classifiers instead of a single strong classifier, and the cascade improves the performance of the detector, as shown in Figure-5. The system was evaluated on a general dataset, which demonstrated its robustness and real-time performance. The results also showed that the system is very fast. In fact, one of the reasons for its high speed is that it is limited to detecting vehicles and does not address their classification and counting. In addition, it performs poorly when detecting vehicles in the dark [16].
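The two speed-related design choices described for [16], scanning only the ROI and rejecting candidates early with a cascade, can be illustrated with a small sketch. It is not the detector of [16]: the feature names `edge_density` and `symmetry`, the thresholds, and the window records are hypothetical stand-ins for the Haar-like feature stages of a trained cascade.

```python
def make_stage(feature, minimum, calls):
    """Build one cascade stage that accepts a window when a feature reaches a minimum.
    The `calls` dictionary records how often each stage is evaluated."""
    def stage(window):
        calls[feature] = calls.get(feature, 0) + 1
        return window[feature] >= minimum
    return stage


def inside_roi(window, roi):
    """Check that the whole window lies inside the region of interest."""
    x0, y0, x1, y1 = roi
    return (x0 <= window["x"] and window["x"] + window["w"] <= x1
            and y0 <= window["y"] and window["y"] + window["h"] <= y1)


def detect(windows, roi, stages):
    """Keep windows inside the ROI that pass every stage; all() stops at the first rejection."""
    return [w for w in windows
            if inside_roi(w, roi) and all(stage(w) for stage in stages)]


# Hypothetical candidate windows with precomputed feature scores.
windows = [
    {"x": 10, "y": 50, "w": 20, "h": 20, "edge_density": 0.8, "symmetry": 0.9},  # vehicle-like
    {"x": 40, "y": 60, "w": 20, "h": 20, "edge_density": 0.1, "symmetry": 0.9},  # fails stage 1
    {"x": 70, "y": 55, "w": 20, "h": 20, "edge_density": 0.7, "symmetry": 0.2},  # fails stage 2
    {"x": 10, "y": 0, "w": 20, "h": 20, "edge_density": 0.9, "symmetry": 0.9},   # outside ROI
]
roi = (0, 40, 100, 100)                    # (x0, y0, x1, y1), e.g. the road area
calls = {}
stages = [make_stage("edge_density", 0.5, calls), make_stage("symmetry", 0.6, calls)]
detections = detect(windows, roi, stages)
```

In this toy run, the window outside the ROI is never passed to any stage, and the window that fails the first stage never reaches the second, which is where the time saving of a cascade comes from.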

Aswin et al. [17] proposed a vehicle detection system for counting and classifying vehicles during the night. The system involves preprocessing operations such as background subtraction and image segmentation. Blob analysis is performed to detect and match the headlights of each vehicle in order to count the vehicles. The system also creates different headlight templates and compares them with the vehicle's headlights to classify the categories of vehicles. The system was implemented successfully under various night-time illumination conditions. However, it focuses on night-time detection and ignores detection effectiveness in normal daylight. Besides, the system does not involve the essential part of traffic monitoring, which is vehicle classification. The system also becomes very slow with high-quality video input, because the increased computing time makes it unsuitable for real-time detection.
Ershadi et al. [18] proposed a method for VD in changing weather conditions, which focuses on perspective removal by applying Modified Inverse Perspective Mapping (MIPM), the Hough transform for automatically locating lanes and lines, and Gaussian Mixture Models (GMM) for VD, besides extracting vehicle features. The algorithm is robust and more efficient than other systems, specifically with respect to the occlusion problem, lighting variations, and weather conditions. Nevertheless, the algorithm was not tested under camera vibration, sandy weather, or unclear lighting, especially at night-time. Furthermore, the system has further limitations, such as the detection of vehicles in tunnels, on steep uphill roads, and on winding roads.
Memon [2] proposed a vehicle classification and counting system based on computer vision. It starts by dividing the video into a set of frames for background subtraction. Then it detects and counts the vehicles using a Gaussian Mixture Model (GMM). The final step is the classification of the vehicles, either by comparing the contour regions to presumed values or by using bag of features (BoF) for feature extraction with a support vector machine (SVM) for classification. For implementation, the system provides a user interface that is used to define the region of interest, which implies a need for human supervision and is one of the limitations of this system. Then the system applies image processing techniques. The work also compares the classification results of Contour Comparison (CC) with those of BoF and SVM. The CC process appears to be better than the BoF and SVM technique, providing classification results that are closer to the actual values, as shown in Figure-6. Because this system depends on one-level feature extraction, it is not effective in handling vehicle occlusion, which reduces the precision of classification and counting. Besides, it is not capable of performing VD at night.
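The contour comparison idea, classifying each detected vehicle by comparing the size of its contour region with presumed values, can be sketched as below. This is a hedged illustration and not the implementation of [2]: the area ranges in `CLASS_RANGES`, the class names, and the example contours are hypothetical, and the contour area is computed here with the shoelace formula on polygon vertices.

```python
from collections import Counter

# Hypothetical presumed area ranges in pixels: (class name, minimum area, maximum area).
CLASS_RANGES = [
    ("motorcycle", 0, 1500),
    ("car", 1500, 6000),
    ("truck", 6000, float("inf")),
]


def contour_area(points):
    """Area of a closed polygon given as (x, y) vertices, using the shoelace formula."""
    total = 0
    for i, (x0, y0) in enumerate(points):
        x1, y1 = points[(i + 1) % len(points)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


def classify_by_area(area):
    """Return the first class whose presumed range contains the contour area."""
    for name, low, high in CLASS_RANGES:
        if low <= area < high:
            return name
    return "unknown"


def rectangle(width, height):
    """Helper that builds a rectangular contour for the example."""
    return [(0, 0), (width, 0), (width, height), (0, height)]


# Hypothetical contours of four detected foreground blobs.
contours = [rectangle(30, 20), rectangle(60, 40), rectangle(120, 60), rectangle(80, 50)]
labels = [classify_by_area(contour_area(c)) for c in contours]
counts = Counter(labels)
```

Because the apparent size of a vehicle depends on the camera height and angle, presumed values of this kind would have to be calibrated for each camera setup.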

In 2018, Chen [19] proposed a simple vision-based, night-time, all-weather system for detecting neighboring vehicles. It was implemented using automatic multilayer thresholding, noise reduction, connected-component labeling, headlight matching, and headlight classification and tracking. It was able to detect the headlight position and thereby acquire the vehicle position. The results showed that the system has high accuracy and adequate speed to be used as a real-time system. With this system, drivers acquire more information about the surrounding vehicles and are able to drive more safely, which reduces the accident rate. The system gives improved detection effectiveness when compared with typical systems, which was demonstrated both quantitatively and qualitatively. The system could handle the difficult conditions in the experiments and generated results of enhanced quality. The method adapts to standard traffic situations, and the results for VD and tracking were satisfactory. However, the effectiveness of the system depends largely on the features of the input scene and the location of the camera.
Moutakki [20] suggested an automatic VD and counting system. The approach is able to detect, locate, and automatically identify vehicles in the scene. It is able to calculate the volume and characteristics of traffic based on three stages: segmentation, classification, and counting. The contribution of the method is an improved counting system that relies on VD and identification under different situations, such as occlusions and varying lighting conditions, which are difficult problems for recent systems. The approach implements VD and classification while removing the impact of many issues on system efficiency. The results showed an efficiency rate of 98.7%, calculated over three different videos, which indicates that the approach is effective as a reliable VD instrument for traffic systems. Additionally, the approach shows practical robustness against variations in light and object size. Despite the good accuracy, the system has some limitations. Firstly, the vehicles should be clearly visible and not occluded within the virtual detection area. Secondly, the width of the virtual detection area should be large enough for counting the vehicles. Additionally, there are other vehicle features that should be taken into account for vehicle detection in several setups.
Manzoor et al. [21] suggested a random forest-based vehicle make and model recognition method. For vehicle detection, the authors applied the Histogram of Oriented Gradients (HOG) and the Scale-Invariant Feature Transform (SIFT) as feature descriptors. The bag-of-features method is applied to represent the overall feature in the case of SIFT, while concatenation is used for the HOG descriptors to create frame feature vectors. The architecture of the training and testing steps of the system is shown in Figure-7. According to the results, the system performed best with a random forest built from 300-350 decision trees. The system achieved a 94.43% recognition rate. The recognition rate decreases beyond this optimum point, possibly because of over-fitting during the training phase.
Table-1. Comparison of the surveyed VD systems; each entry lists the system, its advantages, and its limitations.
1
The authors [7] proposed a system for real-time vehicle counting, which depends on an adaptive background.
The system resolves several difficult problems comparatively well; for example, shadows are eliminated automatically.
The approach is unable to distinguish vehicle categories (truck, car, motorbike).

2
The authors [10] proposed a video VD system based on a combination of LBP and motion histogram.
Experimental results on a vehicle database show that the system has improved performance.
The system focuses on a simple set of features to detect vehicles, while ignoring complex features. Besides, there is no vehicle counting or classification in this system.

3
The authors [11] proposed a VD system based on linking the ARMA model and the AdaBoost algorithm.
The algorithm efficiently reduces the computation time. The ARMA + AdaBoost combination decreases the average processing time per image.
Vehicle detection is performed from the frontal and side views, which is a restriction on the input image to the system.

4
The authors [12] suggested a video-based vehicle counting system that uses a virtual loop.
The system counts vehicles with high accuracy under different daytime conditions.
The accuracy of the system depends on the viewing angle and the position of the camera.

5
The authors [13] suggested an automatic VD system for secure vehicle parking, which includes three stages: VD, detecting the face of the driver, and recognizing the face of the driver.
The experiments show the feasibility of using the method in any vehicle parking area, such as public parking areas.
The approach detects vehicles from the front and side views only, which is a restriction on the input video.

6
The authors [14] proposed a real-time VD and tracking system.
It suggests an effective method for eliminating the undesirable influence of noise on the quality of the foreground, which increases the effectiveness of the whole automatic VD and tracking system.
The method is directly affected by the features of the input videos and by the position of the camera.

7
The authors [15] proposed a VD and counting system based on computer vision.
The accuracy of the suggested system is about 96%.
Firstly, the vehicles should be clearly visible and not occluded within the virtual detection area. Secondly, the width of the virtual detection area should be large enough for counting the vehicles. Thirdly, this simple system focuses on background subtraction and counting, while ignoring vehicle classification, which is an essential operation in vehicle detection systems.

8
The authors [1] proposed a VD, classification, and counting system that detects vehicles using an active basis model (ABM) and verifies them according to their reflection symmetry.
Experimental results show the good performance of the proposed method and its efficiency for use in traffic monitoring systems during the day and night and in all seasons of the year.
The proposed method is sensitive to the viewpoint of the input videos. Modifications are required to decrease this sensitivity.

9
The authors [16] proposed a real-time VD system in which the extracted foreground is passed to a cascade classifier.
The system offers robustness and real-time performance. Besides, the results show that the system is very fast.
One of the reasons for the speed of this system is that it is limited to detecting vehicles and does not address their counting and classification. In addition, it performs poorly when detecting vehicles in the dark.

10
The authors [17] proposed a night-time VD system that involves preprocessing operations such as background subtraction, image segmentation, and blob analysis.
The system is implemented successfully for the detection and counting of vehicles under different night-time illumination conditions.
When the input video is of high quality, the system requires additional computation time.
The recognition rate is reduced after reaching an optimum point, possibly because of over-fitting during the training procedure.

VI. Conclusions
In this study, we focused on specific issues directly concerning the general theme of a survey of automatic vehicle detection. Detecting vehicles accurately in real-time videos is one of the major research areas in computer vision, and the surveyed systems can be employed in different applications and environments. In real-time image processing, researchers can face many difficulties, such as low resolution, illumination variation, dynamic objects in the background, and changes in the background itself, like the waving of leaves. The essential part of all the methods discussed in this paper is the background subtraction technique, despite the different approaches applied in this part, such as frame differencing, the visual background extractor, GMM, MOG, and background modeling. All of these approaches perform the same function, which is to separate the foreground from the background and keep only the moving objects. We also discussed the advantages and disadvantages of the presented systems. The state of the art of the existing methods for each key issue was discussed, and the future work needed to extend the vehicle detection process in real-time videos was identified. All the systems discussed in this survey suffer from shared limitations, as follows:
1- Most of these systems require a defined detection area called the ROI. The area can be determined manually by the operator or by a specific calculation method. This increases the implementation time of these systems, which makes them less effective in real time despite their high quality.

2- The studied systems focus on a particular aspect of the vehicle detection process. For example, some methods detect vehicles with high accuracy in the daytime, but they are inefficient at detecting vehicles at night or in different weather conditions. In addition, some methods focus on detecting vehicles and ignore counting or classification, while others perform all these processes but at an execution speed that is not appropriate for real time. We conclude that there is no comprehensive method that is suitable for all conditions and reliable in all aspects of the recognition process.

3- The accuracy of the surveyed systems depends on the input video, which should be of a specific quality to achieve the best results. This implies that the accuracy varies with the quality of the input video. However, it is not always possible to control the input video quality; thus, these systems cannot guarantee a specific quality in all conditions.