Interest in pedestrian classification for autonomous transportation systems has grown with the development of deep convolutional neural networks (CNNs). Despite recent progress, identifying partially occluded individuals remains challenging because of the diversity of occluded parts and variations in pose and appearance. Performance drops significantly when pedestrians are covered by other objects, because the feature information from the occluded parts is lost. To address this problem, we propose two network architectures based on tree-structured convolutional neural networks (T-CNN). Both use a structural representation of multi-branch deep convolutional features while retaining the advantages of end-to-end learning. The high-level tree-structured CNN (HT-CNN) concatenates the classification-layer outputs from multi-segmented patches of pedestrians to handle partial occlusion. The low-level tree-structured CNN (LT-CNN) concatenates the discriminative features from each multi-segmented patch with global features. The high-level variant of our T-CNN architecture achieved 94.64% accuracy on the INRIA dataset without occlusions and 70.78% accuracy on the Prince of Songkla University (PSU) dataset with occlusions, outperforming a baseline CNN architecture. These results indicate that the proposed architecture can be used in real-world environments to classify partially occluded pedestrians from the visual information in multi-segmented patches using a tree-structured, multi-branch CNN.
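The difference between the two fusion points can be illustrated with a minimal sketch. It is not the authors' implementation: the branch functions are simple stand-ins for trained convolutional branches, and the three horizontal bands, the feature length, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only: the constants and helpers below are hypothetical.
FEATURE_DIM = 4  # assumed per-branch feature length
N_CLASSES = 2    # pedestrian / non-pedestrian
N_PATCHES = 3    # assumed: three horizontal bands of the input image


def split_into_patches(image, n_patches=N_PATCHES):
    """Split an image (a list of pixel rows) into horizontal bands, top to bottom."""
    band = len(image) // n_patches
    return [image[i * band:(i + 1) * band] for i in range(n_patches)]


def branch_features(region, dim=FEATURE_DIM):
    """Stand-in for a convolutional branch: mean intensity of `dim` vertical strips."""
    strip = len(region[0]) // dim
    feats = []
    for k in range(dim):
        values = [row[c] for row in region
                  for c in range(k * strip, (k + 1) * strip)]
        feats.append(sum(values) / len(values))
    return feats


def branch_scores(features):
    """Stand-in for a branch's classification layer: one score per class."""
    mean = sum(features) / len(features)
    return [mean, 1.0 - mean]


def lt_cnn_fusion(image):
    """Low-level fusion: concatenate per-patch features with global features."""
    fused = []
    for patch in split_into_patches(image):
        fused.extend(branch_features(patch))
    fused.extend(branch_features(image))  # global features from the whole image
    return fused


def ht_cnn_fusion(image):
    """High-level fusion: concatenate each patch branch's classification output."""
    fused = []
    for patch in split_into_patches(image):
        fused.extend(branch_scores(branch_features(patch)))
    return fused


image = [[0.5] * 8 for _ in range(12)]  # toy 12x8 grayscale image
print(len(lt_cnn_fusion(image)))  # (N_PATCHES + 1) * FEATURE_DIM values
print(len(ht_cnn_fusion(image)))  # N_PATCHES * N_CLASSES values
```

In a trained network, each fused vector would feed a final classification layer. In this sketch the low-level variant passes that layer a longer vector of features, while the high-level variant passes it only the per-patch class scores.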