Caltech Dataset Research Articles

Convolutional neural networks (CNNs) have shown excellent performance for vision-based lane detection. However, maintaining the performance of trained models under new test scenarios remains challenging because of the dataset bias between the training and test datasets. In lane detection, the dataset bias can be categorized into lane position bias and lane pattern bias, with the former having a particularly strong influence on lane detection performance. To tackle this dataset bias, this article proposes a unified viewpoint transformation (UVT) method that transforms the camera viewpoints of different datasets into a common virtual world coordinate system, such that the mismatched lane position distributions can be effectively aligned. Experiments are conducted on multiple datasets, including the Caltech [1], Tusimple [2], and KITTI [3] datasets. The results demonstrate the effectiveness of the UVT algorithm in improving lane detection performance on the test datasets. Moreover, by incorporating the UVT into other techniques that tackle dataset bias, the lane position and pattern differences are disentangled and separately minimized. As a result, the performance gap between the training data and the test scenarios can be bridged. Specifically, the model trained on the KITTI dataset achieves high performance on the Tusimple and Caltech datasets (F1-score: 84.8% and 87.1%).
With the proposed algorithm, a lane detection model trained on one dataset can be effectively applied to datasets with different camera settings in vastly different localities, and achieves better generalization ability than state-of-the-art methods.
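
The idea of mapping different camera viewpoints into one common virtual view can be illustrated with a planar homography. The sketch below is not the UVT implementation from the article; it assumes a flat road, a pinhole camera with known intrinsics, and a camera described only by its height and downward pitch. All function names and parameters are hypothetical and chosen for illustration.

```python
import numpy as np


def ground_to_image_homography(K, height, pitch):
    """Homography mapping ground-plane points (X right, Y forward) to pixels.

    Assumptions (illustrative only): flat road at Z = 0, pinhole intrinsics K,
    camera `height` above the road, tilted down by `pitch` radians.
    """
    c, s = np.cos(pitch), np.sin(pitch)
    # World axes (X right, Y forward, Z up) -> camera axes (x right, y down, z forward).
    R0 = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, -1.0],
                   [0.0, 1.0, 0.0]])
    # Tilt the camera downward about its x-axis.
    Rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, c, -s],
                   [0.0, s, c]])
    R = Rx @ R0
    t = -R @ np.array([0.0, 0.0, height])
    # For points with Z = 0 the projection K [R | t] reduces to K [r1 r2 t].
    return K @ np.column_stack((R[:, 0], R[:, 1], t))


def apply_homography(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack((pts, np.ones(len(pts))))
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def to_virtual_view(pixels, H_source, H_virtual):
    """Re-project road pixels from a source camera into a shared virtual camera."""
    return apply_homography(H_virtual @ np.linalg.inv(H_source), pixels)
```

Under these assumptions, lane points from datasets recorded with different camera heights and pitches would be passed through `to_virtual_view` with the same `H_virtual`, so that their positions become comparable in one image frame.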

Bag-of-Visual-Words (BoVW) is still a useful image classification model when there is not enough data to use deep learning. In the BoVW model, reducing the reconstruction errors of local features can improve classification accuracy because less information is lost. Many reconstruction-based coding methods have been proposed to learn a visual dictionary and encode local features by minimizing the reconstruction errors of local features under constraints. In addition, accuracy can be improved by learning category-specific dictionaries and then encoding features based on these dictionaries. Combining the two practices, we propose a simple category-specific dictionary learning method tailored for reconstruction-based feature coding. Our method can be applied universally to improve the classification accuracy of many reconstruction-based coding methods, which is its main highlight. Concretely, a universal dictionary is learned with a reconstruction-based coding method and then refined for each category to obtain that category's specific dictionary. When a feature is encoded with a category-specific dictionary, the visual words used to encode it are fixed in advance by the indices of the non-zero elements of its coding vector obtained with the universal dictionary. The effectiveness of our method is validated by checking whether accuracy improves after it is applied. Our results on the Scene-15, Caltech-101, and UIUC-Sports datasets show that the accuracies of four representative coding methods improve by about 0.3% to 2.7%, which experimentally demonstrates the universality and effectiveness of our method.
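
The encoding step described above, where the active visual words are fixed by the universal code, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes dictionaries are stored as columns of a NumPy array and uses unconstrained least squares on the selected words, whereas the coding methods in the abstract minimize reconstruction error under their own constraints. The function and argument names are hypothetical.

```python
import numpy as np


def encode_with_category_dictionary(x, universal_code, D_category):
    """Encode feature x with a category dictionary on a pre-selected support.

    x              : (d,) local feature
    universal_code : (k,) coding vector of x under the universal dictionary
    D_category     : (d, k) category-specific dictionary, one visual word per column

    Assumption: plain least squares stands in for the constrained
    reconstruction-based coder used in the actual method.
    """
    # Visual words are chosen in advance: the non-zero entries of the universal code.
    support = np.flatnonzero(universal_code)
    code = np.zeros(D_category.shape[1])
    if support.size:
        coeffs, *_ = np.linalg.lstsq(D_category[:, support], x, rcond=None)
        code[support] = coeffs
    return code
```

Because the support is inherited from the universal code, the category-specific code stays zero wherever the universal code was zero; only the coefficients on the selected words are re-estimated against the refined dictionary.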

Articles published on Caltech Dataset

Attend and Guide (AG-Net): A Keypoints-Driven Attention-Based Deep Network for Image Recognition

Deep Fuzzy Multi-Object Categorization in Scene

Color Channel Perturbation Attacks for Fooling Convolutional Neural Networks and A Defense Against Such Attacks

A Unified Multi-Task Learning Architecture for Fast and Accurate Pedestrian Detection

A Novel Self-Taught Learning Framework Using Spatial Pyramid Matching for Scene Classification

Randomized non-linear PCA networks

Associative memory optimized method on deep neural networks for image classification

Bridging the Gap of Lane Detection Performance Between Different Datasets: Unified Viewpoint Transformation

An efficient content based image retrieval using enhanced multi-trend structure descriptor

A hybrid codebook model for object categorization using two-way clustering based codebook generation method

Multi-scale feature network for few-shot learning

A Study of Dimensionality Reduction Impact on an Approach to People Detection in Gigapixel Images

Automating Configuration of Convolutional Neural Network Hyperparameters Using Genetic Algorithm

A Novel Image Retrieval Method with Improved DCNN and Hash

Image Retrieval Based on Deep Feature Extraction and Reduction with Improved CNN and PCA

A Category-Specific Dictionary Learning Method Tailored for Reconstruction-Based Feature Coding

Multiple Kernel Based Transfer Learning for the Few-Shot Recognition Task in Smart Home Scene

Application of the Majorization-Minimization Method to the Chan-Vese Algorithm in the Image Segmentation Problem

Learning Semantic Text Features for Web Text-Aided Image Classification

Image Feature Selection using Ant Colony Optimization
