In recent years, UAVs have been widely used in the military field. Owing to their small size, low cost, low noise, high maneuverability, strong concealment, and superior stability, UAVs can be applied in many fields, for purposes such as reconnoitering enemy troops, detecting danger zones, tracking targets, electronic jamming, communication relay, and even carrying out attacks with small offensive weapons. Detecting non-cooperative UAVs is therefore necessary. A system that autonomously identifies aerial non-cooperative UAVs and monitors national border areas in real time would enable economical, wide-area border surveillance and contribute to national and social security.
To date, research on UAV detection by domestic and foreign scholars remains scarce. The main difficulty is that the appearance of a UAV in an image is easily affected by its position, viewing angle, distance from the camera, and its own structure, all of which make the UAV hard to detect. This problem closely resembles the general difficulties of target detection and requires further research and practice. At present, the common target detection methods are the optical flow method [1], the inter-frame difference method, and background-subtraction-based detection. Optical flow detection assigns a velocity vector to each moving pixel so that the image forms an optical flow field in which the background is clearly distinguished from the moving target vectors; the target location is then obtained by analyzing the dynamic image. This method works under all kinds of backgrounds and imposes no requirement on them, but the number of pixels involved is large, the optical flow field is widely and inaccurately distributed, and the computation is heavy. The inter-frame difference method [2] obtains the target foreground by thresholding the pixel-wise difference between two adjacent frames; its computation is light, but its recognition ability is poor when the illumination changes. To ensure the accuracy and completeness of UAV detection, the traditional sliding-window approach requires that the sample image and the image under test be repeatedly rescaled until the target in the test image is no larger than the template, at which point detection stops.
In this way a UAV larger than or equal to the template can also be detected accurately after the whole test image is resized. However, the HOG feature [3-5] must be extracted after each rescaling: a sliding window [3] has to traverse the entire image until the high-dimensional edge contour feature is matched, so the traversal takes a long time, feature descriptors are generated slowly, and every scale level requires a large amount of computation.
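As a concrete illustration of the inter-frame difference method discussed above, the following is a minimal sketch in Python with NumPy. The function name, threshold value, and synthetic frames are illustrative assumptions, not part of the method evaluated in this paper.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Return a binary foreground mask from two consecutive grayscale frames.

    Pixels whose absolute intensity change exceeds `threshold` are marked
    as foreground (moving target); everything else is background.
    """
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

# Synthetic example (illustrative): a bright 4x4 "target" shifts 3 pixels right.
prev_frame = np.zeros((32, 32), dtype=np.uint8)
curr_frame = np.zeros((32, 32), dtype=np.uint8)
prev_frame[10:14, 10:14] = 200
curr_frame[10:14, 13:17] = 200
mask = frame_difference_mask(prev_frame, curr_frame)
```

The mask is nonzero only where the object was or now is, which is why the method is cheap but, as noted above, fragile under illumination change.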
This paper proposes a UAV detection method that combines graph-theory-based image segmentation with HOG-FLD feature fusion. The method consists of a training stage and a test stage. In the training stage, the positive and negative sample images collected in advance are resized to a uniform pixel size and converted to gray scale; feature extraction with the HOG-FLD fusion method then yields a low-dimensional feature vector that stably describes the UAV contour. Finally, the feature vectors are fed into a support vector machine, and the SVM classifier is trained through statistical learning. In the test stage, candidate regions are screened out with the graph-theory-based segmentation method, features are extracted from them in the same way as in the training stage, and the extracted features are passed to the trained classifier to decide whether a candidate region contains a UAV. The segmentation merges regions using the connected-component method of graphs [6] or the minimum spanning tree, deciding whether to merge two regions mainly according to their similarity. This approach not only yields the global features of the image and the UAV candidate regions, but is also fast to compute and efficient to process. The features used to train the classifier are obtained by combining the histogram of oriented gradients (HOG) with Fisher linear discriminant analysis (FLD).
Because the features extracted by HOG-FLD fusion are low-dimensional, robust, discriminative, and largely insensitive to changes in local illumination, background, target position, and viewing angle, it becomes easier to train a classifier that separates the classes well and generalizes strongly.
Image segmentation divides an image into regions with distinct characteristics and separates the objects of interest from the background within the segmented blocks, so that, in terms of human visual perception, the objects of interest contrast clearly with the background. Segmentation plays an important role in further image analysis tasks such as image compression and image recognition.
This paper segments the image with a graph-theory-based method (graph-based image segmentation) followed by region merging. The graph-based method treats the image as a weighted undirected graph and segments it with the connected-component method of graphs [6] to obtain region blocks. The region-merging method then merges the segmented blocks according to specific merging rules. Because the UAV and the background differ greatly in texture, color, size, and uniformity, blocks with large differences must be kept apart and blocks with small differences must be merged [11], following the principle of optimal segmentation [10].
An image contains information such as shape, size, color, and texture. The image is segmented after the whole image is traversed by a graph search algorithm. In graph theory, the mapping from image to graph [12] is defined as follows: the image is represented as a weighted undirected graph G = (V, E), where each vertex v_i in V corresponds to a pixel and each edge (v_i, v_j) in E connects a pair of neighboring pixels, with a weight w(v_i, v_j) >= 0 measuring the dissimilarity (for example, the gray-level or color difference) between the two pixels.

The dissimilarity between two segmented regions C_1 and C_2 can be understood as the weight of the minimum edge connecting the vertices of the two regions, defined as

Dif(C_1, C_2) = min w(v_i, v_j), over v_i in C_1, v_j in C_2, (v_i, v_j) in E.

When there is no edge connecting the two regions, Dif(C_1, C_2) = infinity.

If the boundary predicate

Dif(C_1, C_2) > MInt(C_1, C_2)

is satisfied, a boundary is considered to exist between the two regions and they remain segmented; otherwise they are merged.

The smallest internal difference of the division, MInt, is defined as

MInt(C_1, C_2) = min( Int(C_1) + k/|C_1|, Int(C_2) + k/|C_2| ),

where Int(C) is the largest edge weight in the minimum spanning tree of region C, |C| is the size of the region, and k is the segmentation threshold parameter.
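The segmentation criterion described above can be sketched in Python with NumPy as a Kruskal-style pass over the sorted edges with a union-find structure. This is a minimal grayscale sketch, not the paper's implementation: the function names and the tiny test image are illustrative, and a practical system would work on color images and add a minimum-size post-processing step.

```python
import numpy as np

class UnionFind:
    """Disjoint sets tracking size and internal difference Int(C)."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0.0] * n  # max MST edge weight inside each component

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b, w):
        a, b = self.find(a), self.find(b)
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        self.internal[a] = w  # edges arrive in increasing order, so w = Int(C)

def felzenszwalb_gray(img, k=90.0):
    """Graph-based segmentation of a grayscale image; returns a label map."""
    h, w = img.shape
    idx = lambda y, x: y * w + x
    edges = []
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal neighbor edge
                edges.append((abs(float(img[y, x]) - float(img[y, x + 1])),
                              idx(y, x), idx(y, x + 1)))
            if y + 1 < h:  # vertical neighbor edge
                edges.append((abs(float(img[y, x]) - float(img[y + 1, x])),
                              idx(y, x), idx(y + 1, x)))
    edges.sort()  # process edges by non-decreasing dissimilarity
    uf = UnionFind(h * w)
    for wgt, a, b in edges:
        ra, rb = uf.find(a), uf.find(b)
        if ra == rb:
            continue
        # MInt(C1, C2) = min(Int(C1) + k/|C1|, Int(C2) + k/|C2|)
        mint = min(uf.internal[ra] + k / uf.size[ra],
                   uf.internal[rb] + k / uf.size[rb])
        if wgt <= mint:  # no boundary evidence: merge the two regions
            uf.union(ra, rb, wgt)
    return np.array([uf.find(idx(y, x)) for y in range(h)
                     for x in range(w)]).reshape(h, w)

# Illustrative image: uniform left half (0) and right half (200).
img = np.zeros((4, 4))
img[:, 2:] = 200.0
labels = felzenszwalb_gray(img, k=90.0)
```

With k = 90 the weight-200 boundary edges exceed MInt, so the two halves stay separate, matching the boundary predicate above.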
Image Segmentation
Color [13] similarity between regions r_i and r_j is measured on their L1-normalized color histograms C_i = {c_i^1, ..., c_i^n}:

s_color(r_i, r_j) = sum over k = 1..n of min(c_i^k, c_j^k).

Texture [13] similarity is computed in the same way on the texture histograms T_i = {t_i^1, ..., t_i^n}:

s_texture(r_i, r_j) = sum over k = 1..n of min(t_i^k, t_j^k).

Dimension (size) similarity encourages small regions to merge first:

s_size(r_i, r_j) = 1 - (size(r_i) + size(r_j)) / size(im),

where size(im) is the number of pixels in the whole image.

The similarity set S collects the pairwise similarities of all neighboring regions. To judge whether the boundary condition between two regions is satisfied, the pair with the highest similarity in S is examined at each step: if no boundary exists between the pair, the two regions are merged, their similarities to their neighbors are recomputed, and S is updated until no mergeable pair remains.
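Assuming the merging similarities take the histogram-intersection form given above, they can be sketched as follows; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

def colour_similarity(hist_i, hist_j):
    """Histogram intersection of two L1-normalised colour histograms."""
    return float(np.minimum(hist_i, hist_j).sum())

def size_similarity(size_i, size_j, size_image):
    """Close to 1 for small regions, encouraging them to merge first."""
    return 1.0 - (size_i + size_j) / size_image

# Two identical normalised histograms have intersection exactly 1.0.
h = np.array([0.25, 0.25, 0.25, 0.25])
s_col = colour_similarity(h, h)
s_sz = size_similarity(50, 50, 1000)
```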
In practice the method divides into two stages: training and testing. In the training stage, a window of fixed shape and size traverses each UAV or non-UAV sample image, and features are extracted with the HOG-FLD fusion method, yielding a computable feature vector that describes the image contour, is robust, and is easy to classify. The SVM classifier [8-9] for UAV detection is then obtained by statistical analysis and learning on these feature vectors. In the testing stage, features of the regions to be examined are first extracted with the same method as in training, and the trained SVM classifier then classifies the candidate targets in these regions to determine whether each candidate area contains a UAV.
Because the edge contour of a UAV is stable and extensible, the HOG feature, which describes contours well, is extracted from the data set. However, the HOG feature vector has a large dimension, which hinders classifier training. To train a classifier with good discriminative ability, the dimension of the extracted HOG feature is reduced by FLD [18], finally yielding a low-dimensional feature vector. For these reasons, feature extraction based on HOG-FLD fusion is adopted in this article.
The basis of feature extraction in this article is the histogram of oriented gradients (HOG). The idea of the algorithm is as follows: the gradient of the selected image is calculated; the whole image is divided into rectangular cells of fixed, equal size; within each cell the gradient orientations of the pixels are quantized into a fixed number of bins (nine in this article), and the gradient magnitudes accumulated in each bin form the orientation histogram of the cell.
The steps of extracting the HOG feature are as follows: the positive UAV sample image is converted to gray scale, and the input image is filtered with Gamma correction so that it reaches the standard contrast of the color space, reducing the effects of local shadows and illumination changes; the image is divided into a certain number of cells, and a fixed number of cells are grouped into equal-sized blocks as described above; the gradient range is quantized according to the rules above, the cell features within each block are calculated, and finally all blocks are concatenated to obtain the feature vector of the whole target UAV image. Formula (11) normalizes the image, formulas (12) and (13) calculate the gradient components of each pixel, and formulas (14) and (15) calculate the magnitude and direction of the gradient:

I(x, y) = I(x, y)^gamma                                   (11)

G_x(x, y) = H(x + 1, y) - H(x - 1, y)                     (12)

G_y(x, y) = H(x, y + 1) - H(x, y - 1)                     (13)

G(x, y) = sqrt( G_x(x, y)^2 + G_y(x, y)^2 )               (14)

alpha(x, y) = arctan( G_y(x, y) / G_x(x, y) )             (15)

where I(x, y) is the pixel value at (x, y), gamma is the correction coefficient, H(x, y) is the pixel value of the normalized image, G_x and G_y are the horizontal and vertical gradient components, G(x, y) is the gradient magnitude, and alpha(x, y) is the gradient direction.
In this article a window of 64 × 128 pixels scans the sample images and the image under test, with a scanning step of 8 pixels in both the horizontal and vertical directions. The window is divided into cells of 8 × 8 pixels, giving 8 × 16 = 128 cells. Each group of four adjacent cells (two by two) forms a block, so a window contains 105 blocks. Following the HOG calculation steps, a window therefore produces a 3780-dimensional feature vector, the HOG descriptor. The HOG algorithm is illustrated in Figure 2:
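The block and dimension arithmetic above (105 blocks, 3780 dimensions) can be checked with a short sketch. The helper names are illustrative; the parameters mirror the text (8-pixel cells, 2 × 2-cell blocks, 8-pixel stride, 9 orientation bins), and the gradient step follows the centered-difference gradient formulas referenced in the text.

```python
import numpy as np

def hog_dimension(win_h, win_w, cell=8, cells_per_block=2, bins=9, stride=8):
    """Length of the HOG descriptor for a win_h x win_w detection window."""
    block = cells_per_block * cell                 # block side in pixels
    blocks_y = (win_h - block) // stride + 1       # 15 for a 128-pixel height
    blocks_x = (win_w - block) // stride + 1       # 7 for a 64-pixel width
    return blocks_y * blocks_x * cells_per_block ** 2 * bins

def gradient_magnitude_direction(img):
    """Centered-difference gradient magnitude and direction of a gray image."""
    img = img.astype(np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]   # horizontal component G_x
    gy[1:-1, :] = img[2:, :] - img[:-2, :]   # vertical component G_y
    magnitude = np.hypot(gx, gy)
    direction = np.arctan2(gy, gx)
    return magnitude, direction

dim = hog_dimension(128, 64)  # 15 * 7 = 105 blocks, 105 * 4 * 9 = 3780
ramp = np.tile(np.arange(5.0), (5, 1))  # illustrative horizontal intensity ramp
mag, ang = gradient_magnitude_direction(ramp)
```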
Principle Diagram of HOG Algorithm
On the basis of the HOG feature, a linear subspace is constructed with Fisher linear discriminant analysis (FLD) [18]. The optimal projection matrix maximizes the Fisher criterion

J(W) = (W^T S_b W) / (W^T S_w W),

where S_b is the between-class scatter matrix and S_w is the within-class scatter matrix of the training features. The columns of the optimal projection matrix W are the generalized eigenvectors of S_b w = lambda S_w w with the largest eigenvalues; projecting the HOG features onto W yields the reduced feature vectors of the training set, and the similarity of the projected vectors is taken as the similarity degree between samples.

Because the HOG feature dimension is much larger than the number of training samples, the within-class scatter matrix S_w is singular. To solve this problem, the features are first projected into a lower-dimensional subspace before the Fisher criterion is applied, so that S_w becomes invertible.

The cosine similarity of two projected vectors a and b is

cos(a, b) = (a · b) / (||a|| ||b||).
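For the two-class case used here (UAV vs. non-UAV), the Fisher projection reduces to a single direction, which can be sketched as follows. The small regularization term stands in for the singularity fix discussed above, and the toy data are illustrative.

```python
import numpy as np

def fisher_direction(X1, X2, eps=1e-6):
    """Two-class Fisher direction w proportional to Sw^{-1} (m1 - m2)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter Sw = sum of per-class scatter matrices.
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    Sw += eps * np.eye(Sw.shape[0])  # regularise when Sw is singular
    w = np.linalg.solve(Sw, m1 - m2)
    return w / np.linalg.norm(w)

# Illustrative data: two clusters mirrored through the origin.
X1 = np.array([[2.0, 0.0], [3.0, 0.0], [2.5, 1.0]])
X2 = -X1
w = fisher_direction(X1, X2)
```

Projecting onto w maximizes the between-class separation relative to the within-class spread, which is exactly what the criterion J(W) asks for.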
Because the support vector machine (SVM) proposed by Vapnik has a simple structure, global optimality, good generalization, and short training and prediction times [9], this paper uses the SVM as the machine learning tool to learn the regularities of the samples quickly and efficiently and to classify accurately. The main idea of the SVM is to handle the linear inseparability of the original space by mapping the data into a high-dimensional space with a polynomial kernel function. For two-class classification, sample features such as HOG are first extracted in the original space and then represented as vectors in the high-dimensional space. To minimize the error rate of the two-class problem, a hyperplane is sought in the high-dimensional space that separates the two classes.
Let the sample set be {(x_i, y_i)}, i = 1, ..., n, with class labels y_i in {-1, +1}.

The equation of the classification surface is

w · x + b = 0.

After normalization of the discriminant function, the two classes of samples must satisfy

y_i (w · x_i + b) >= 1,  i = 1, ..., n.

The classification margin is then 2 / ||w||, so maximizing the margin is equivalent to minimizing ||w||^2 / 2.

For an SVM with inner-product (kernel) function K(x_i, x_j), the maximization requirement becomes the dual problem

max Q(alpha) = sum_i alpha_i - (1/2) sum_i sum_j alpha_i alpha_j y_i y_j K(x_i, x_j).

Its constraints are expressed as

sum_i alpha_i y_i = 0,  0 <= alpha_i <= C,  i = 1, ..., n.

The resulting support vector machine decision function is

f(x) = sgn( sum_i alpha_i y_i K(x_i, x) + b ),

where the alpha_i are the Lagrange multipliers obtained from the dual problem, C is the penalty parameter, and only the samples with alpha_i > 0 (the support vectors) contribute to the decision.
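The optimization above is normally solved in the dual; as a self-contained illustration of the same maximum-margin idea, here is a minimal linear SVM trained by sub-gradient descent on the hinge loss. This is a primal approximation for illustration only, not the kernel method used in this paper; the data, learning rate, and regularization strength are illustrative.

```python
import numpy as np

def train_linear_svm(X, y, lr=0.1, lam=0.001, epochs=500):
    """Sub-gradient descent on the regularised hinge loss:
    minimise lam/2 * ||w||^2 + mean(max(0, 1 - y_i (w . x_i + b)))."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for i in range(n):
            if y[i] * (X[i] @ w + b) < 1:   # margin violated: hinge is active
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:                            # only the regulariser acts
                w -= lr * lam * w
    return w, b

# Linearly separable toy data: positives upper-right, negatives lower-left.
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
```

After training, every sample satisfies y_i (w · x_i + b) > 0, i.e. all points lie on the correct side of the learned hyperplane.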
In this article 900 positive and 900 negative sample images are selected, and 500 of the original positive images are used to estimate the aspect ratio of the UAV. The resulting aspect ratio is 1:2, and the pixels are normalized to 256 × 512 to eliminate the effect of image size on recognition. As described in the HOG feature section above, each image produces 105 blocks, each block yields a 36-dimensional feature vector, and an image therefore produces a 3780-dimensional HOG feature vector. The extracted HOG vector is used as the input of the Fisher linear discriminant analysis algorithm to reduce the dimension of the whole vector; the reduced dimension is an adjustable parameter, determined by the recognition efficiency observed in the experiments. The reduced vectors are fed into the SVM model, yielding the SVM classifier that detects whether the image under test contains a UAV.
The experimental environment is the Matlab R2015b simulation software on a 64-bit Windows 7 operating system, with an i5-6500 CPU and 4.0 GB of installed memory; 200 test images are used. The best detection effect is obtained by tuning the important parameters that affect the experimental results.
In the feature extraction process, the target dimension of the Fisher linear discriminant analysis reduction is the parameter k of the FLD algorithm, and the recognition time changes with this parameter. The comparison of recognition times is shown in Figure 3, from which it can be seen that the whole algorithm takes the least recognition time when k is 50.
Recognition Time as the Parameter k Changes
Experiments show that the whole algorithm achieves the best overall recognition performance in both accuracy and time when the blocks contain 8 × 8 pixels, the SVM kernel function type is 2, and the segmentation threshold is k = 90 with sigma = 10. The recognition result is shown in Figure 4:
Identification Results
To verify the efficiency of the proposed method, its experimental results are compared with those of the image-segmentation-based HOG and SVM method, and statistics are collected as shown in TABLE I; the average recognition time of the proposed method is shorter than that of the latter.
COMPARISON RESULTS OF THE TEST METHODS
This article uses a region-of-interest mechanism to obtain candidate regions. In the testing phase the acquired regions of interest are input into the trained SVM classifier, which reduces the recognition time, and in the feature extraction phase the dimension reduction of Fisher linear discriminant analysis (FLD) makes the SVM easy to train. Comparing the image-segmentation-based HOG-SVM detection method with the region-of-interest-based aerial UAV detection algorithm in Matlab simulation experiments, the results show that the region-of-interest-based detection algorithm outperforms the image-segmentation-based sliding-window method in both accuracy and time for UAV detection.