Research on robot scene recognition based on improved feature point matching algorithm

A new feature description method based on the fusion of the fast retina keypoint (FREAK) and rotation-aware binary robust independent elementary features (rBRIEF) descriptors is proposed to effectively combine the efficiency and accuracy of the two feature descriptions. In addition, in the elimination stage of mismatched point pairs, an improved neighborhood-parallel random sample consensus (RANSAC) algorithm is proposed, which sets base points and their neighborhoods so that the algorithm runs efficiently in parallel over multiple local neighborhoods. The improved feature point matching algorithm and the existing algorithm were tested on datasets with different scales, rotations, illuminations, and degrees of blur. The experimental results show that the improved algorithm raises the average scene recognition accuracy by 18.21%, improves efficiency by 15.58%, and shows good robustness.


Introduction
Visual Simultaneous Localization and Mapping (VSLAM) technology based on vision has developed rapidly. Robot scene recognition based on feature point matching is widely used in the visual front end of VSLAM to complete the initial recognition and matching of a scene by a mobile robot. The most commonly used feature point matching algorithm is ORB (Oriented FAST and Rotated BRIEF), which meets the real-time requirements of simultaneous localization and mapping on mobile robots. It is usually divided into four stages: image feature point extraction, feature description calculation, feature matching, and elimination of mismatched point pairs. SLAM frameworks based on ORB feature matching perform well among the many VSLAM frameworks. Mur-Artal and Tardós proposed the ORB-SLAM2 [1] system framework, which homogenizes the distribution of feature points by quadtree decomposition on the basis of the ORB algorithm. Campos et al. proposed the ORB-SLAM3 [2] system framework on the basis of ORB-SLAM2 in July 2020.
The ORB algorithm has high real-time performance but poor robustness and matching accuracy. An improved ORB feature extraction algorithm based on a locally adaptive threshold was proposed in [3]. Zhang et al. proposed an image matching algorithm combining ORB with a clustering algorithm [4]. Wang et al. proposed a feature description algorithm for extracting feature points under different affine transformations [5]. Secondly, the ORB feature matching algorithm uses brute-force matching in the matching stage. Brute-force matching inevitably produces some mismatches, and the RANSAC algorithm is commonly used to eliminate mismatched point pairs. However, RANSAC has limitations in both accuracy and efficiency. To address them, Barath et al. proposed optimizing the inlier judgment of RANSAC with a graph-cut algorithm [6], Chum et al. proposed progressive sample consensus (PROSAC) [7], and Chum et al. also proposed a locally optimized RANSAC algorithm [8].
Inspired by the above research, and aiming at the low recognition accuracy and efficiency of the ORB feature matching algorithm in robot scene recognition, two improvements to the original algorithm are proposed, one in the feature description calculation stage and one in the mismatched point elimination stage, in order to improve the accuracy and efficiency of the feature point matching algorithm in the process of robot scene recognition.

FREAK-rBRIEF feature description
FREAK and rBRIEF share the same form of binary feature description. After analysis, a FREAK-rBRIEF feature description based on the fusion of FREAK and rBRIEF is proposed to effectively combine the accuracy and efficiency of the two feature descriptions.
Analysis of the FREAK feature description shows that it is a binary-coded feature description similar to rBRIEF. It reduces the demand for storage space, speeds up the generation of the feature description, and reduces the time required for feature matching. FREAK can screen out 90% of irrelevant candidate points using the coarse information in its first 16 bytes, runs in parallel, and offers higher accuracy and efficiency than rBRIEF. However, FREAK is poorly robust to scale changes: when the image scale changes greatly, the fovea region of fine information is strongly affected, producing a large number of mismatches in the matching stage.
By weighing the advantages and disadvantages of the two feature descriptions, the first 128 coarse-precision bits of FREAK are used to replace the first 128 bits of rBRIEF, forming the fused FREAK-rBRIEF feature description. Cascaded brute-force matching is then carried out on this description. First, the first 128 FREAK bits are used for coarse matching to quickly select candidate points; then, for candidates below the distance threshold, the scale-invariant rBRIEF bits are used for precise matching. This combines the accuracy and efficiency of the two feature descriptions and improves the recognition accuracy and efficiency of the matching algorithm built on this feature description in robot scene recognition.
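The cascaded matching idea can be illustrated with a minimal sketch, assuming fused 256-bit descriptors stored as 32-byte arrays whose first 16 bytes hold the FREAK coarse bits and whose last 16 bytes hold the rBRIEF bits; the layout, threshold, and ratio values below are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two binary descriptors stored as uint8 arrays."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def cascade_match(desc1, desc2, coarse_thresh=40, ratio=0.8):
    """Cascaded brute-force matching on fused 256-bit descriptors.

    Bytes 0-15: FREAK coarse bits (assumed layout); bytes 16-31: rBRIEF bits.
    The coarse stage quickly discards most candidates; survivors are ranked
    by the fine rBRIEF distance with a ratio test.
    """
    matches = []
    for i, d1 in enumerate(desc1):
        # Stage 1: coarse FREAK screening on the first 128 bits.
        cand = [j for j, d2 in enumerate(desc2)
                if hamming(d1[:16], d2[:16]) < coarse_thresh]
        if not cand:
            continue
        # Stage 2: precise rBRIEF matching on the surviving candidates.
        dists = sorted((hamming(d1[16:], desc2[j][16:]), j) for j in cand)
        if len(dists) == 1 or dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches
```

Matching an image against itself should pair every descriptor with itself, since the coarse distance of the true match is zero while unrelated 128-bit strings differ in roughly half their bits.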

Improved neighborhood parallel RANSAC algorithm
The RANSAC algorithm estimates model parameters through continuous iterations of random sampling and consensus evaluation, filtering the outliers out of the sample dataset. It is commonly used in feature matching algorithms to remove mismatched feature points.
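The iterate-and-filter loop can be shown with a minimal RANSAC sketch; a pure 2-D translation model is assumed here for brevity, whereas a real matcher would fit a homography or fundamental matrix.

```python
import numpy as np

def ransac_translation(src, dst, iters=200, thresh=2.0, rng=None):
    """Minimal RANSAC sketch: estimate a 2-D translation between matched
    point sets and filter outliers (model choice is illustrative)."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(src))           # minimal sample: one pair
        t = dst[i] - src[i]                  # hypothesized model
        resid = np.linalg.norm(src + t - dst, axis=1)
        inliers = resid < thresh             # consensus set
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # refit the model on the largest consensus set found
    t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return t, best_inliers
```

A gross mismatch among the correspondences never enters the winning consensus set, so the refitted model ignores it.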

Selection of base point
First, the ratio test is used to assign a confidence score to the unpurified feature matching points. Brute-force matching of the original feature points takes the minimum Hamming distance between feature descriptions as the final matching result, and the confidence score is defined as follows:

S = d_min / d_min_s (1)

where S represents the confidence score, d_min represents the minimum feature description distance, and d_min_s represents the second-smallest feature description distance. The smaller S is, the greater the difference between d_min and d_min_s, and the higher the robustness of the matching point pair. Local non-maximum suppression within a circle of radius R is then applied: if a matching point has the highest confidence within the circle, it is defined as a base point, and all matching points satisfying this condition form the base point set.
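Base point selection might be sketched as follows; the score S = d_min / d_min_s is treated as "lower is more confident", and the point layout and suppression radius are illustrative assumptions.

```python
import numpy as np

def select_base_points(pts, d_min, d_second, radius=20.0):
    """Ratio-test confidence plus local non-maximum suppression.

    pts: (N, 2) keypoint locations of the matches in the first image.
    d_min / d_second: best and second-best descriptor distances per match.
    A match whose score S = d_min / d_second is the best (smallest) within
    a circle of radius R becomes a base point."""
    s = d_min / np.maximum(d_second, 1e-9)      # confidence score S
    base = []
    for i in range(len(pts)):
        dist = np.linalg.norm(pts - pts[i], axis=1)
        neigh = dist < radius                   # matches inside the circle
        if s[i] <= s[neigh].min():              # local best -> base point
            base.append(i)
    return base
```

Two nearby matches compete and only the more confident one survives, while an isolated match is trivially its own local best.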

Selection and filtering of local neighborhood of base point
After determining the base point set, we find the matching pairs that support each base point match, that is, we select from the initial raw matching pairs those matching points that lie in the same region as the base point and satisfy certain similarity constraints. Let S_i denote the i-th base point matching relationship, defined as

S_i = (b_i^1, b_i^2) (2)

where b_i^1 and b_i^2 represent the i-th matched base points of the two images to be matched. They conform to a similarity transformation (rotation + scaling), in which the rotation is α_i = θ_i^2 − θ_i^1 and the scaling is σ_i = s_i^2 / s_i^1, with θ and s denoting the keypoint orientation and scale (3). For any matching pair (x^1, x^2) in M, where M represents the set of initial raw matching point pairs and d represents the feature description, two constraints are then set according to the position relative to the base point and the rotation and scale consistency of the matching point:

‖x^1 − b_i^1‖ < R_1, ‖x^2 − b_i^2‖ < R_2 (4)

|Δα| < t_α, |Δσ − 1| < t_σ (5)

where R_1 and R_2 represent the base point diffusion radii in the two images, t_α and t_σ represent the confidence thresholds for rotation and scaling, and Δα = α_2 − α_1 and Δσ = σ_2 / σ_1 represent the angle difference and scale ratio between two matching points. Formula (4) constrains the position of an initial matching point relative to the base point. Formula (5) constrains the angle and scale consistency of matching points.
If the matching points near a base point satisfy the two constraints in formulas (4) and (5), they are incorporated into the base point matching neighborhood N_i ⊆ M.
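A possible sketch of the two neighborhood constraints, with assumed threshold values and with each match's angle and scale differences precomputed relative to the base point.

```python
import numpy as np

def neighborhood(base1, base2, m1, m2, ang, scale,
                 R1=30.0, R2=30.0, t_alpha=0.26, t_sigma=0.2):
    """Collect initial matches that support one base-point match.

    base1/base2: base point locations in the two images; m1/m2: (N, 2)
    matched point locations; ang/scale: per-match rotation difference (rad)
    and scale ratio relative to the base point. Threshold values are
    illustrative assumptions. A match joins the neighborhood when it lies
    within the diffusion radii (formula 4) and its rotation and scaling
    agree with the base point (formula 5)."""
    pos_ok = (np.linalg.norm(m1 - base1, axis=1) < R1) & \
             (np.linalg.norm(m2 - base2, axis=1) < R2)                  # formula (4)
    sim_ok = (np.abs(ang) < t_alpha) & (np.abs(scale - 1.0) < t_sigma)  # formula (5)
    return np.flatnonzero(pos_ok & sim_ok)
```

A match far from the base point fails (4), and one with an inconsistent rotation fails (5), even if its position is acceptable.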
Finally, the RANSAC algorithm with an average optimal fixed iteration number is run in parallel in each neighborhood to complete the elimination of false matching points over the global range. Compared with the original RANSAC algorithm, the neighborhood-parallel RANSAC algorithm reduces the number of iterations, increases the number of inliers, avoids overly local solutions, obtains a more accurate transformation matrix, and improves the recognition efficiency and accuracy of the matching algorithm in robot scene recognition.
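The neighborhood-parallel stage can be sketched as follows; a translation model again stands in for the paper's full transformation estimation, and a thread pool stands in for the parallel execution, so all names and parameters are illustrative.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def ransac_inliers(src, dst, iters=100, thresh=2.0, seed=0):
    """Fixed-iteration RANSAC on one neighborhood (translation model
    for brevity; the real algorithm fits a transformation matrix)."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(src))
        t = dst[i] - src[i]
        inl = np.linalg.norm(src + t - dst, axis=1) < thresh
        if inl.sum() > best.sum():
            best = inl
    return best

def parallel_neighborhood_ransac(src, dst, neighborhoods):
    """Run a fixed-iteration RANSAC independently in every base-point
    neighborhood and merge the surviving (inlier) match indices."""
    def job(idx):
        idx = np.asarray(idx)
        return idx[ransac_inliers(src[idx], dst[idx])]
    with ThreadPoolExecutor() as pool:
        kept = pool.map(job, neighborhoods)
    return sorted(set(np.concatenate(list(kept)).tolist()))
```

Because each neighborhood is purified on its own small, locally consistent sample set, a mismatch planted in one neighborhood is removed without disturbing the matches in the others.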

Overall flow of the improved feature matching algorithm
The basic flow of the improved algorithm is divided into four stages: feature point detection and screening, calculation and fusion of the feature description, feature point matching, and elimination of mismatched point pairs.

Experiment and result analysis
To verify the performance of the improved feature matching algorithm in robot scene recognition, it is compared with the more traditional ORB+RANSAC algorithm augmented with a scale pyramid. To ensure the objectivity of the experiment, different scenes are selected from representative datasets so that the images and scenes are close to real conditions. Scale and viewpoint transformations best reveal the performance of the algorithm, so the graf sequence of the Mikolajczyk dataset and images at different scales from the oxbuild dataset are used for two groups of contrast experiments under different scales. The oxbuild dataset is then used for single groups of contrast experiments under different rotations, different illumination conditions, and different degrees of blur, giving a total of five groups.

Output diagram of feature matching recognition effect
The matching recognition comparison diagrams of the experimental output show that the improved feature point matching algorithm has better robustness and recognition accuracy than ORB+RANSAC under different changes in scale, illumination, rotation, and blur.

Comparative analysis of experimental data
The experimental results are analyzed in terms of matching recognition accuracy and efficiency. Each data point is the average of 10 experimental test runs.

Efficiency analysis of matching recognition
The feature matching algorithm was improved in two stages: feature description calculation and mismatched point elimination. Since it is difficult to measure efficiency with single-stage times, the overall running time of the algorithm, i.e., the total time consumed from beginning to end, is used to verify its efficiency. To represent the difference between the running time of the improved algorithm and that of ORB+RANSAC intuitively, the data in the table are plotted as a line graph, as shown in Figure 1. Analysis of Table 1 and Figure 1 shows that the running times of the two algorithms follow the same trend under different changes in scale, illumination, rotation, and blur. However, compared with ORB+RANSAC, the average running time of the improved feature point matching algorithm is reduced by 12.96 ms. Although the improved algorithm must compute two kinds of feature description in the description stage, which takes longer than ORB+RANSAC, it spends less time in the matching and mismatch elimination stages, so the overall scene matching recognition efficiency is increased by 15.58%, giving it better applicability in robot scene recognition.

Analysis of matching recognition accuracy
The matching recognition accuracy is measured as the proportion of correct matches among the total matching pairs. To represent the difference in accuracy between the improved algorithm and ORB+RANSAC intuitively, the data in the table are plotted as a line graph, as shown in Figure 2. Analysis of Figure 2 shows that under different changes in scale, illumination, rotation, and blur, the average matching recognition accuracy of the improved feature point matching algorithm is 18.21% higher than that of ORB+RANSAC, and the improved algorithm is more robust. The matching recognition accuracy is above 90% in the different scenes, so the algorithm has better applicability in robot scene recognition.

Conclusion
Given that the traditional feature point matching algorithm cannot meet the high-precision and high-efficiency requirements of robot scene recognition, the FREAK-rBRIEF feature description and the neighborhood-parallel RANSAC algorithm are proposed. Compared with the widely used ORB+RANSAC feature point matching algorithm, the improved algorithm shows better robustness, matching recognition efficiency, and accuracy in the experimental tests on different scenes, and has better applicability in robot scene recognition, especially in mobile robot VSLAM based on the feature point method.