Abstract
Recent computational studies have emphasized layer-wise quantitative similarity between convolutional neural networks (CNNs) and the primate visual ventral stream. However, whether such similarity holds for the face-selective areas, a subsystem of the higher visual cortex, remains unclear. Here, we extensively investigated whether CNNs exhibit tuning properties previously observed in different macaque face areas. Simulating four past experiments on a variety of CNN models, we searched for the model layer that quantitatively matched the multiple tuning properties of each face area. Our results show that higher model layers explain the properties of the anterior areas reasonably well, whereas no layer simultaneously explains the properties of the middle areas, consistently across model variations. Thus, some similarity may exist between CNNs and the primate face-processing system in the near-goal representation, but much less clearly in the intermediate stages, which calls for alternative modeling such as non-layer-wise correspondence or different computational principles.
Highlights
We examined three publicly available pre-trained networks: (1) the VGG-Face network[21], a very deep 16-layer CNN trained on face images, (2) AlexNet[19], trained on general natural images, and (3) the Oxford-102 network, an AlexNet-type model trained on flower images ("Methods").
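As a rough illustration of this setup, the sketch below instantiates comparable architectures in PyTorch/torchvision. It is not the authors' code: the VGG-Face and Oxford-102 weight files are hypothetical placeholders (converted weights would have to be obtained separately), and the exact model definitions used in the paper may differ.

```python
# Hedged sketch: one way to instantiate the three networks with torchvision.
# AlexNet ships with torchvision; the other weight files are hypothetical
# placeholders for weights converted from the original releases.
import torch
import torchvision.models as models

# (2) AlexNet trained on general natural images (ImageNet).
alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

# (1) VGG-Face: a 16-layer VGG architecture trained on face identities
# (2,622 classes in the original release); weights loaded from a local file.
vgg_face = models.vgg16(num_classes=2622)
vgg_face.load_state_dict(torch.load("vgg_face_weights.pth"))  # hypothetical path

# (3) Oxford-102: AlexNet-type architecture trained on 102 flower classes.
oxford102 = models.alexnet(num_classes=102)
oxford102.load_state_dict(torch.load("oxford102_weights.pth"))  # hypothetical path

for net in (vgg_face, alexnet, oxford102):
    net.eval()  # inference mode for activation extraction
```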
Discussion
In this study, we have investigated whether CNNs can serve as a model of the macaque face-processing network.
One study[18] found AM-like shape-appearance tuning in the top layer of a face-classifying CNN model, similar to our results (Fig. 4b and Supplementary Fig. 5, layer 7). Another study[22] tested view-identity tuning on several CNN models to compare with their novel generative model. While they showed response similarity matrix (RSM) results similar to ours (Fig. 2), they incorporated a more sophisticated quantitative comparison with experimental data[15] and thereby revealed notable similarity in the later stage and dissimilarity in the intermediate stage, which is generally compatible with our conclusion.
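For concreteness, here is a minimal sketch of how an RSM of this kind can be computed from layer activations, assuming the standard pairwise Pearson-correlation definition; the array sizes and names are illustrative and do not reproduce either study's exact stimulus set or analysis code.

```python
# Hedged sketch: response similarity matrix (RSM) as pairwise Pearson
# correlations between population response vectors, one per stimulus condition.
import numpy as np

def response_similarity_matrix(responses: np.ndarray) -> np.ndarray:
    """responses: (n_conditions, n_units) population responses.
    Returns an (n_conditions, n_conditions) correlation matrix."""
    return np.corrcoef(responses)

# Illustrative example: 200 stimulus conditions (e.g. identities x views),
# each represented by a 4096-unit layer activation vector.
layer_responses = np.random.randn(200, 4096)  # placeholder activations
rsm = response_similarity_matrix(layer_responses)

# A model RSM can then be compared with a published neural RSM, e.g. via
# rank correlation of the off-diagonal entries.
```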
Summary
To investigate whether CNNs can explain known tuning properties of the macaque face-processing network, we started with a representative CNN model optimized for classification of face images. We ran the protocols (stimulus set and data analysis) of four previous monkey experiments[15,16,17,18] on our model and thereby investigated whether each model layer replicated population-level tuning properties similar to those in the corresponding published experimental data (Fig. 1). Note that, in this approach, we need no raw experimental data. The second experimental study[18] investigated the coding of facial shapes and appearances in the macaque face patches (ML and AM). Following their method, we constructed a face space based on the active appearance model[20].
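As an illustration of this kind of face space, the sketch below follows the spirit of the active appearance model: PCA on landmark coordinates for shape, and PCA on shape-normalized images for appearance. The landmark count, image size, and 25+25 dimensionality are assumptions for illustration, and the warping step that produces shape-free images is omitted.

```python
# Hedged sketch of an active-appearance-model-style face space: shape is
# parameterized by PCA on facial landmark coordinates, appearance by PCA on
# images assumed to be pre-warped to the mean shape (warping omitted here).
# All sizes and variable names are illustrative, not the paper's.
import numpy as np
from sklearn.decomposition import PCA

n_faces = 2000
landmarks = np.random.randn(n_faces, 58 * 2)       # placeholder (x, y) landmarks
warped_pixels = np.random.randn(n_faces, 64 * 64)  # placeholder shape-free images

shape_pca = PCA(n_components=25).fit(landmarks)
appearance_pca = PCA(n_components=25).fit(warped_pixels)

# Each face is described by concatenated shape and appearance coordinates,
# giving a low-dimensional face space in which model or neural tuning axes
# (e.g. preferred directions of single units) can be estimated.
shape_coords = shape_pca.transform(landmarks)
appearance_coords = appearance_pca.transform(warped_pixels)
face_space = np.hstack([shape_coords, appearance_coords])
```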