Abstract
Abstract. Deep Learning is a kind of Machine Learning technology which utilizing the deep neural network to learn a promising model from a large training data set. Convolutional Neural Network (CNN) has been successfully applied in image segmentation and classification with high accuracy results. The CNN applies multiple kernels (also called filters) to extract image features via image convolution. It is able to determine multiscale features through the multiple layers of convolution and pooling processes. The variety of training data plays an important role to determine a reliable CNN model. The benchmarking training data for road mark extraction is mainly focused on close-range imagery because it is easier to obtain a close-range image rather than an airborne image. For example, KITTI Vision Benchmark Suite. This study aims to transfer the road mark training data from mobile lidar system to aerial orthoimage in Fully Convolutional Networks (FCN). The transformation of the training data from ground-based system to airborne system may reduce the effort of producing a large training data set.This study uses FCN technology and aerial orthoimage to localize road marks on the road regions. The road regions are first extracted from 2-D large-scale vector map. The input aerial orthoimage is 10 cm spatial resolution and the non-road regions are masked out before the road mark localization. The training data are road mark’s polygons, which are originally digitized from ground-based mobile lidar and prepared for the road mark extraction using mobile mapping system. This study reuses these training data and applies them for the road mark extraction using aerial orthoimage. The digitized training road marks are then transformed to road polygon based on mapping coordinates. As the detail of ground-based lidar is much better than the airborne system, the partially occulted parking lot in aerial orthoimage can also be obtained from the ground-based system. The labels (also called annotations) for FCN include road region, non-regions and road mark. The size of a training batch is 500 pixel by 500 pixel (50 m by 50 m on the ground), and the total number of training batches for training is 75 batches. After the FCN training stage, an independent aerial orthoimage (Figure 1a) is applied to predict the road marks. The results of FCN provide initial regions for road marks (Figure 1b). Usually, road marks show higher reflectance than road asphalts. Therefore, this study uses this characteristic to refine the road marks (Figure 1c) by a binary classification inside the initial road mark’s region.To compare the automatically extracted road marks (Figure 1c) and manually digitized road marks (Figure 1d), most road marks can be extracted using the training set from ground-based system. This study also selects an area of 600 m × 200 m in quantitative analysis. Among the 371 reference road marks, 332 can be extracted from proposed scheme, and the completeness reached 89%. The preliminary experiment demonstrated that most road marks can be successfully extracted by the proposed scheme. Therefore, the training data from the ground-based mapping system can be utilized in airborne orthoimage in similar spatial resolution.
Highlights
To compare the automatically extracted road marks (Figure 1c) and manually digitized road marks (Figure 1d), most road marks can be extracted using the training set from ground-based system
Deep Learning is a kind of Machine Learning technology which utilizing the deep neural network to learn a promising model from a large training data set
The benchmarking training data for road mark extraction is mainly focused on close-range imagery because it is easier to obtain a close-range image rather than an airborne image
Summary
To compare the automatically extracted road marks (Figure 1c) and manually digitized road marks (Figure 1d), most road marks can be extracted using the training set from ground-based system. Abstract: Deep Learning is a kind of Machine Learning technology which utilizing the deep neural network to learn a promising model from a large training data set. Convolutional Neural Network (CNN) has been successfully applied in image segmentation and classification with high accuracy results. The CNN applies multiple kernels ( called filters) to extract image features via image convolution.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.