Abstract

Personality distinguishes individuals’ patterns of feeling, thinking, and behaving. Predicting personality from short video sequences is an active research area in computer vision. Most existing studies report only preliminary results and do not fully exploit the visual and audio modalities. To address this deficiency, we propose the Deep Bimodal Fusion (DBF) approach to predict the five personality traits: agreeableness, extraversion, openness, conscientiousness, and neuroticism. For the visual modality, the framework uses modified convolutional neural networks (CNNs), specifically the Descriptor Aggregator Model (DAN), to learn discriminative visual representations. For the audio modality, it extracts audio representations and uses them to train a long short-term memory (LSTM) network. Because each modality has its own network, the framework predicts the traits independently per modality and then combines these predictions by weighted fusion to produce the final prediction. The proposed approach attains a mean accuracy of 0.9183, averaged over the five personality traits, which is higher than that of previously proposed frameworks.
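
The weighted fusion step described in the abstract can be illustrated with a minimal sketch. It assumes each modality network has already produced one score per trait, and that fusion is a convex combination controlled by a single weight. The function name `weighted_fusion`, the parameter `visual_weight`, and the example scores are hypothetical; the abstract does not state the actual fusion weights or score values.

```python
from typing import List, Sequence

# Trait order follows the abstract.
TRAITS = ("agreeableness", "extraversion", "openness",
          "conscientiousness", "neuroticism")


def weighted_fusion(visual: Sequence[float],
                    audio: Sequence[float],
                    visual_weight: float = 0.5) -> List[float]:
    """Combine per-trait scores from two modalities (late fusion).

    Assumption: fusion is a convex combination, so the audio weight
    is 1 - visual_weight. The real weights are not given in the abstract.
    """
    if len(visual) != len(audio):
        raise ValueError("both modalities must score the same traits")
    if not 0.0 <= visual_weight <= 1.0:
        raise ValueError("visual_weight must lie in [0, 1]")
    audio_weight = 1.0 - visual_weight
    return [visual_weight * v + audio_weight * a
            for v, a in zip(visual, audio)]


if __name__ == "__main__":
    # Hypothetical per-trait outputs of the visual and audio networks.
    visual_scores = [0.62, 0.48, 0.71, 0.55, 0.40]
    audio_scores = [0.58, 0.52, 0.65, 0.60, 0.44]
    fused = weighted_fusion(visual_scores, audio_scores, visual_weight=0.6)
    for trait, score in zip(TRAITS, fused):
        print(f"{trait}: {score:.3f}")
```

In practice the fusion weight would be chosen on a validation set; the value 0.6 here is only a placeholder.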
