OALibJ  Vol.8 No.12 , December 2021
Nature Scene Signs Recognition by Affine Scale-Invariant Feature Transform
Abstract: This paper presents an approach to the automatic detection and recognition of signs in natural scenes and applies it to sign recognition. Recognition of signs in natural scenes is affected by changes in lighting, fading, and distortion of the sign surface. The proposed method combines an improved scale-invariant feature transform (SIFT) algorithm, which reduces the high dimensionality and complexity of the descriptor, with the Affine Scale-Invariant Feature Transform (ASIFT) algorithm for sign detection and recognition in natural scenes. Experiments show that the method increases the number of matches and reduces matching time.

1. Introduction

Automatic recognition of traffic signs on the road can help prevent traffic accidents, and it is also important for the safe driving of smart cars. Traffic signs usually consist of specific shapes (circles, squares, and triangles) and colors (red, blue, and yellow) that stand out visually in road environments, so traffic sign detection methods can be divided into color-based, shape-based, and combined color-and-shape-based methods [1]. However, traffic signs cannot be recognized effectively by these methods when the color, scale, rotation, and lighting conditions change.

Recently, feature-based methods have developed into methods based on local invariant features, which exploit invariance to image rotation, translation, and scaling. The most advanced methods based on local invariant features are currently SIFT, Harris-Affine, Hessian-Affine, and MSER [2] [3] [5] [6] [7]. SIFT performs better than the other methods [8], but it is invariant to only four of the six parameters of an affine transformation [2]. These algorithms achieve good results within a limited range but are prone to high error rates outside it. In addition, the SIFT algorithm is time-consuming when used to detect and recognize road signs, which severely limits real-time use. This paper combines an improved SIFT algorithm [4] with the ASIFT algorithm [5] [9] to achieve effective recognition when the color, scale, rotation, and lighting conditions vary.

2. Algorithm Framework Based on ASIFT and Improved SIFT

This paper proposes a traffic sign recognition algorithm that fuses ASIFT with improved SIFT. Simply combining ASIFT with standard SIFT ensures accuracy, but detection is slow. Fusing ASIFT with improved SIFT both preserves accuracy and increases detection speed, striking a good balance between the two. The framework of the algorithm is shown in Figure 1.

In the first stage, affine regions are constructed by simulating the image under different viewpoints (scale λ, rotation latitude angle ψ, and longitude angle Φ). In the second stage, key points are detected using improved SIFT, and each key point is described by a circular SIFT feature descriptor, a 1 × 64 vector. In the third stage, candidate key points are matched by calculating the Euclidean distance between descriptors. In the fourth stage, a sign is recognized when the number of matches is more than 3. These stages are repeated until no further sign is found.

Figure 1. Framework of the algorithm.

3. ASIFT Algorithm

ASIFT performs affine-invariant region detection. The approach is derived from SIFT detection and adds two new parameters describing the camera optical axis, the rotation latitude angle ψ and the longitude angle Φ, so that it can simulate the image under all viewpoints. The whole process is as follows: 1) Each image is transformed by simulating all possible affine distortions caused by a change of the camera optical axis orientation from a frontal position. For example, the images undergo Φ-rotations followed by tilts and ψ-rotations. 2) These rotations and tilts are performed for a finite and small number of latitude and longitude angles. 3) All simulated images are compared using a similarity-invariant matching algorithm, SIFT. 4) The above three steps yield simulated versions of the original query and search images; on this basis, SIFT applied to the affine-transformed images is used to produce the matches. The affine transformation model is shown in Figure 2.
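The finite sampling of tilts and rotations in step 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the geometric tilt series with ratio √2, the angular step of 72°/t, and the function name `simulated_views` are assumptions taken from the way ASIFT sampling is usually described, not values stated in this paper.

```python
def simulated_views(n_tilts=6, base_step_deg=72.0):
    """Return (tilt, [in-plane rotation angles in degrees]) for each simulated view.

    Assumed sampling: tilts 1, sqrt(2), 2, ... and rotation step base_step_deg / tilt.
    """
    views = []
    for n in range(n_tilts):
        tilt = 2.0 ** (n / 2)  # geometric series with ratio sqrt(2)
        if n == 0:
            # Frontal view: SIFT already handles in-plane rotation, so one view suffices.
            angles = [0.0]
        else:
            step = base_step_deg / tilt  # stronger tilts are sampled more densely
            angles = []
            k = 0
            while k * step < 180.0:
                angles.append(k * step)
                k += 1
        views.append((tilt, angles))
    return views


if __name__ == "__main__":
    for tilt, angles in simulated_views():
        print(f"tilt {tilt:.3f}: {len(angles)} rotation(s)")
```

With these assumed defaults the largest simulated tilt is 4√2 ≈ 5.66, which is close to the maximum tilt of 5.8 used in the experiments of this paper.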

Figure 2 shows the distortion model of the local image as the camera viewpoint changes, mapping u(x, y) to Au(x, y) (the decomposition of A).
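The decomposition of A can be illustrated with a short sketch. It assumes the factorization commonly used for ASIFT, A = λ · R(ψ) · diag(t, 1) · R(Φ), where t is the tilt and R is a plane rotation; the helper names below are hypothetical and the code is only an illustration of that assumed factorization.

```python
import math


def rotation(angle_deg):
    """2 x 2 plane rotation matrix for an angle in degrees."""
    a = math.radians(angle_deg)
    return [[math.cos(a), -math.sin(a)], [math.sin(a), math.cos(a)]]


def matmul2(m, n):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def affine_from_camera(scale, psi_deg, tilt, phi_deg):
    """Assumed decomposition A = scale * R(psi) * diag(tilt, 1) * R(phi)."""
    tilt_matrix = [[tilt, 0.0], [0.0, 1.0]]
    a = matmul2(rotation(psi_deg), matmul2(tilt_matrix, rotation(phi_deg)))
    return [[scale * v for v in row] for row in a]


if __name__ == "__main__":
    # Frontal view of Figure 2 (psi = phi = 0, scale = 1, tilt = 1): identity map.
    print(affine_from_camera(1.0, 0.0, 1.0, 0.0))
```

Because rotations have determinant 1, the determinant of A under this factorization is λ² · t, which gives a quick sanity check on any simulated view.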

4. Improved SIFT Algorithm

During ASIFT computation, the SIFT algorithm is used to detect key points in the input image. The algorithm consists mainly of four steps [3] (see Figure 3).

The third stage, assigning an orientation to each key point, ensures the rotation invariance of the descriptors. If the descriptor itself is already strongly invariant to rotation, this stage can be omitted. When the image is rotated, the region around a feature point changes, but a circle has good rotation invariance, so this paper constructs the SIFT feature descriptor on a circular region.

The new descriptor is itself rotation invariant, so the local area does not need to be rotated to obtain rotation-invariant descriptors, and the dimension of the original feature descriptor vector is decreased from 1 × 128 to 1 × 64, further reducing algorithm complexity.

Figure 2. Plane view of the positive image (ψ0 = Φ0 = 0, λ0 = 1).

Figure 3. SIFT algorithm block diagram.
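One way a circular 1 × 64 descriptor could be built is sketched below. The paper does not give the layout, so everything here is an assumption for illustration: 8 concentric rings × 8 orientation bins, gradient directions measured relative to the radial direction so that no dominant orientation is needed, and the function name `ring_descriptor`.

```python
import math


def ring_descriptor(image, cx, cy, radius=8, n_rings=8, n_bins=8):
    """Sketch of a circular descriptor: n_rings x n_bins = 64 values (assumed layout).

    image is a 2-D list of gray values; (cx, cy) is the key point position.
    """
    h, w = len(image), len(image[0])
    hist = [[0.0] * n_bins for _ in range(n_rings)]
    for y in range(max(1, cy - radius), min(h - 1, cy + radius + 1)):
        for x in range(max(1, cx - radius), min(w - 1, cx + radius + 1)):
            dx, dy = x - cx, y - cy
            r = math.hypot(dx, dy)
            if r == 0 or r >= radius:
                continue  # keep only pixels strictly inside the circle
            gx = image[y][x + 1] - image[y][x - 1]  # central differences
            gy = image[y + 1][x] - image[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            # Gradient direction relative to the radial direction of the pixel.
            rel = (math.atan2(gy, gx) - math.atan2(dy, dx)) % (2 * math.pi)
            ring = int(r / radius * n_rings)
            bin_index = int(rel / (2 * math.pi) * n_bins) % n_bins
            hist[ring][bin_index] += mag
    vec = [v for ring_hist in hist for v in ring_hist]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec
```

Measuring each gradient against the radial direction is the design choice intended to make the histogram independent of image rotation; with 8 rings and 8 bins the vector has 64 entries, half the length of the standard 128-dimensional SIFT descriptor.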

5. Experimental Result and Analysis

In the experiment, candidate matching features are found using the Euclidean distance. Once the ASIFT feature vectors of the traffic sign images captured in a natural scene have been generated, the Euclidean distance between key point feature vectors is used as the similarity measure between key points in the two images. For a key point in image A, an exhaustive search finds the two key points in image B with the smallest Euclidean distances to it. If the nearest distance divided by the second-nearest distance is below a certain threshold, the match is accepted. The probability that a match is correct is thus determined by the ratio of the nearest distance to the second-nearest distance (dim/dim-1). The Euclidean distance is expressed as follows:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2} \tag{1}$$

where p = (p1, p2, ..., pn) and q = (q1, q2, ..., qn) are the two 128-dimensional descriptor vectors of the key points in Euclidean space.

Decreasing the threshold on the ratio dim/dim-1 reduces the number of matched points but makes the matches more stable. Here, all matches with a distance ratio greater than 0.8 are rejected. Although this step loses about 5% of the correct matches, it eliminates about 90% of the false matches, which matters greatly for matching stability.
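The distance in Equation (1) and the ratio criterion with the 0.8 threshold can be sketched as follows. The function names and the exhaustive search over plain Python lists are illustrative assumptions; a practical system would likely use an indexed nearest-neighbour search.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two descriptor vectors, as in Equation (1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def ratio_test_matches(desc_a, desc_b, max_ratio=0.8):
    """Return (index in A, index in B) pairs that pass the distance ratio test."""
    matches = []
    for i, p in enumerate(desc_a):
        # Exhaustive search: distance from p to every descriptor of image B.
        dists = sorted((euclidean(p, q), j) for j, q in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio needs a second-nearest neighbour
        (d1, j1), (d2, _) = dists[0], dists[1]
        # Reject ambiguous matches whose ratio exceeds the threshold.
        if d2 > 0 and d1 / d2 <= max_ratio:
            matches.append((i, j1))
    return matches


if __name__ == "__main__":
    a = [[0.0, 0.0], [10.0, 10.0]]
    b = [[0.0, 1.0], [5.0, 5.0], [10.0, 10.0]]
    print(ratio_test_matches(a, b))
```

Lowering `max_ratio` keeps fewer but more reliable matches, which is the trade-off described above.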

Finally, the candidate matching points are verified using the least-squares method. After ASIFT feature matching as preprocessing, correspondences have already been established and only a small number of matched points remain. If the projection of a key point through the fitted parameters lies within half the error range that was used for the parameters in the Hough transform bins, the key point match receives a vote.

If fewer than 3 points remain in a bin after outliers are discarded, the object match is rejected. The least-squares fitting is repeated until no more rejections take place. An object is considered present when it receives at least 3 votes.
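The verification loop described above can be sketched as follows, assuming a 6-parameter affine model fitted by least squares. The tolerance `tol` stands in for "half the error range of the Hough transform bins", whose value the paper does not give; it and all function names are assumptions made for illustration.

```python
import math


def solve3(m, v):
    """Solve a 3 x 3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return None  # degenerate configuration, e.g. collinear points
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(pairs):
    """Least-squares affine map from model points (x, y) to image points (u, v)."""
    rows = [(x, y, 1.0) for (x, y), _ in pairs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atu = [sum(r[i] * uv[0] for r, (_, uv) in zip(rows, pairs)) for i in range(3)]
    atv = [sum(r[i] * uv[1] for r, (_, uv) in zip(rows, pairs)) for i in range(3)]
    pu, pv = solve3(ata, atu), solve3(ata, atv)
    return None if pu is None or pv is None else (pu, pv)


def verify_matches(pairs, tol, min_matches=3):
    """Refit and discard outliers until stable; None if fewer than min_matches remain."""
    pairs = list(pairs)
    while len(pairs) >= min_matches:
        params = fit_affine(pairs)
        if params is None:
            return None
        pu, pv = params
        kept = []
        for (x, y), (u, v) in pairs:
            eu = pu[0] * x + pu[1] * y + pu[2] - u
            ev = pv[0] * x + pv[1] * y + pv[2] - v
            if math.hypot(eu, ev) <= tol:
                kept.append(((x, y), (u, v)))
        if len(kept) == len(pairs):
            return params, kept  # no more rejections: accept the object match
        pairs = kept
    return None
```

For example, four consistent correspondences plus one gross outlier leave four matches after one round of rejection, whereas a set reduced to two matches is rejected outright.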

Figure 4. ASIFT and SIFT correct matches. (a) Improved ASIFT algorithm (left) and SIFT algorithm (right), different scale, φ = 45˚, T = 1.4 (θ = 45˚). (b) Improved ASIFT algorithm (left) and SIFT algorithm (right), different view, φ = 0˚, T = 5.8 (θ = 80˚). (c) Improved ASIFT algorithm (left) and SIFT algorithm (right), different illumination (night time with good light and afternoon) and different view, φ = 0˚, T = 5.8 (θ = 80˚). (d) Improved ASIFT algorithm (left) and SIFT algorithm (right), flipped view, φ = 0˚, T = −1 (θ = 180˚).

Table 1. Matching number of SIFT and ASIFT.

Table 2. Running time of SIFT and ASIFT.

The decision of whether to cluster a new training image with an existing model view is based on comparing e to a threshold T, T = 0.05 × size.

To illustrate the superiority of the algorithm, we compared three configurations: ASIFT combined with improved SIFT, ASIFT combined with standard SIFT, and the SIFT algorithm on its own.

In this case, about 12 to 20 images of each object were taken from viewing directions covering at least a hemisphere, with tilt varying from 1 to 5.8, that is, latitude angle from 0˚ to 80˚ and longitude angle from 0˚ to 180˚.
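The correspondence between tilt and latitude angle quoted here can be checked with a short sketch. It assumes the usual ASIFT relation t = 1/cos θ, which this paper does not state explicitly but which agrees with the values given here and in the caption of Figure 4; the function names are hypothetical.

```python
import math


def tilt_from_latitude(theta_deg):
    """Tilt t = 1 / cos(theta) for a latitude angle theta in degrees (assumed relation)."""
    return 1.0 / math.cos(math.radians(theta_deg))


def latitude_from_tilt(tilt):
    """Inverse relation: latitude angle in degrees for a tilt t >= 1."""
    return math.degrees(math.acos(1.0 / tilt))


if __name__ == "__main__":
    for theta in (0, 45, 80):
        print(f"theta = {theta} deg -> tilt = {tilt_from_latitude(theta):.2f}")
```

Under this relation a latitude of 80˚ gives a tilt of about 5.76, consistent with the rounded maximum tilt of 5.8 reported for the experiments.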

The experiment uses data collected under various environmental conditions to show the validity of the proposed technique. The signs to be recognized include ACROSSING, BIKEWAY, MOTORWAY, PARKING, and so on. Illustrations of these signs are used as the template images. The size of these images is 640 × 480. The data were obtained in good weather conditions during the day and at night. Figure 4 shows the results of the SIFT and ASIFT algorithms for several shooting positions.

Table 1 shows that the number of matches obtained by the ASIFT algorithm is more than three times that of the SIFT algorithm.

Table 2 shows that the matching time is decreased by 36.32% to 41.65% with the improved ASIFT algorithm, so efficiency is also improved.

6. Conclusions

From the experiments above and many other experiments, the following conclusions can be drawn.

1) When the maximum tilt value tmax = 5.8, the ASIFT algorithm matches better than the other algorithms.

2) The improved ASIFT algorithm increases operation speed while preserving performance.

3) The ASIFT algorithm is better than the other algorithms at correct matching, and the improved ASIFT algorithm is better suited to real-time use. High recognition performance is obtained in a real environment, which shows that the approach is effective in practice.


This research work was supported by Guangxi Key Laboratory Fund of Embedded Technology and Intelligent System (Guilin University of Technology) under Grant No.2020-2-6.

Cite this paper: Li, X., Song, L.G. and Sun, Y.Q. (2021) Nature Scene Signs Recognition by Affine Scale-Invariant Feature Transform. Open Access Library Journal, 8, 1-7. doi: 10.4236/oalib.1108225.

[1]   Madani, A. and Yusof, R. (2017) Traffic Sign Recognition Based on Color, Shape, and Pictogram Classification Using Support Vector. Neural Computing & Applications, 30, 2807-2817.

[2]   Alexeev, B.G. (2021) Nonlocal Physics in the Wave Function Terminology. Journal of Applied Mathematics and Physics, 9, 2889-2908.

[3]   Lowe, D.G. (2004) Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60, 91-110.

[4]   Zhang, C., Gong, Z. and Sun, L. (2008) Application of Improved SIFT Features in Image Matching. Computer Engineering and Applications, 44, 95-97.

[5]   Morel, J.-M. and Yu, G. (2009) A New Framework for Fully Affine Invariant Image Comparison. SIAM Journal on Imaging Sciences, 2, 438-469.

[6]   Lowe, D.G. (1999) Object Recognition from Local Scale Invariant Features. International Conference on Computer Vision, Corfu, September 1999, 1150-1157.

[7]   Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T. and Van Gool, L. (2005) A Comparison of Affine Region Detectors. International Journal of Computer Vision, 65, 43-72.

[8]   Takaki, M. and Fujiyoshi, H. (2009) Traffic Sign Recognition Using SIFT Features. IEEJ Transactions on Electronics Information and Systems, 129, 824-831.

[9]   Li, X., Uchimura, K. and Hu, Z. (2009) Traffic Sign Recognition Using Affine Scale-Invariant Feature Transform. IEICE Technical Report, ITS 2009-12.