Abstract

Convolutional neural network (CNN), one of the most commonly used deep learning methods, has been applied to various computer vision and pattern recognition tasks, and has achieved state-of-the-art performance. Most recent research work on CNN focuses on the innovations of the structure. This thesis explores both the innovation of structure and final label encoding of CNN. To evaluate the performance of our proposed network structure and label encoding method, two computer vision tasks are conducted, namely age estimation from facial image and depth estimation from a single image. For age estimation from facial image, we propose a novel hierarchical aggregation based deep network to learn aging features from facial images and apply our encoding method to transfer the discrete aging labels into a possibility label, which enables the CNN to conduct a classification task rather than regression task. In contrast to traditional aging features, where identical filter is applied to the entire facial image, our deep aging feature can capture both local and global cues in aging. Under our formulation, convolutional neural network (CNN) is employed to extract region specific features at lower layers. Then, low layer features are hierarchically aggregated by using fully connected way to consecutive higher layers. The resultant aging feature is of dimensionality 110, which achieves both good discriminative ability and efficiency. Experimental results of age prediction on the MORPH-II and the FG-NET databases show that the proposed deep aging feature outperforms state-of-the-art aging features by a margin. Depth estimation from a single image is an essential component toward understanding the 3D geometry of a scene. Compared with depth estimation from stereo images, depth map estimation from a single image is an extremely challenging task. This thesis addresses this task by regression with deep features, combined with surface normal constrained depth refinement. The proposed framework consists of two steps. First, we implement a convolutional neural network (CNN) to learn the mapping from multi-scale image patches to depth on the super-pixel level. In this step, we apply the proposed label encoding method to transfer the contin-

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call