Face recognition, expression identification, age estimation, race classification and gender classification are common examples of automated facial image processing. Gender classification is straightforward for humans: from a person's hair, nose, eyes, mouth and skin we can tell whether that person is male or female with a relatively high degree of confidence and accuracy. Can a computer be programmed to perform just as well at gender classification? This problem is the main focus of this research. The conventional pipeline for real-time facial image processing consists of five steps: face detection, noise removal, face alignment, feature representation and classification. For gender classification, the face alignment and feature extraction stages have been re-examined with deployment on smartphones in mind. Face alignment is performed by locating 83 facial landmarks and fitting a 3-D facial model, to which an affine transformation is then applied. Feature representation is obtained through a proposed modification of a multilayer deep neural network, which we therefore name Deepgender. This deep convolutional neural network contains locally connected hidden layers whose kernels do not share weights, in contrast to legacy layered architectures. The network in this case study comprises four convolutional layers, three max-pooling layers (for downsampling of less relevant data), two fully connected layers (connecting the outcome to all inputs) and a single multinomial logistic regression layer. Training uses the CAS-PEAL and FEI databases, which contain 99,594 face images of 1,040 people and 2,800 face images of 200 individuals, respectively. These images cover varied poses and uncontrolled conditions, close to the real-time facial input a gender classification application would receive. The proposed Deepgender system achieves 98% accuracy on the combined databases with the specific preprocessing procedure, i.e., performing alignment before resizing. Experiments suggest that accuracy approaches 100% on frontal, non-blurred facial images. State-of-the-art measures have been taken to overcome memory and battery constraints on mobile devices.
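To make the described architecture concrete, the following is a minimal PyTorch sketch of a Deepgender-style classifier. The layer counts (four convolutional, three max-pooling, two fully connected, one softmax output) follow the abstract; the filter counts, kernel sizes, the 128x128 input resolution, and the use of ordinary weight-sharing convolutions in place of the paper's locally connected layers are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class DeepgenderSketch(nn.Module):
    """Illustrative sketch: 4 conv, 3 max-pool, 2 fully connected, softmax."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),                                   # max-pool 1
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                   # max-pool 2
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                   # max-pool 3
            nn.Conv2d(128, 128, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 16 * 16, 512), nn.ReLU(),          # fully connected 1
            nn.Linear(512, num_classes),                       # fully connected 2
            # Multinomial logistic regression: softmax over the class scores
            # (during training, apply nn.CrossEntropyLoss to the raw logits).
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Usage example: a batch of aligned and resized 128x128 RGB face crops,
# i.e., alignment performed before resizing as the abstract specifies.
model = DeepgenderSketch()
logits = model(torch.randn(8, 3, 128, 128))
probs = logits.softmax(dim=1)   # per-image male/female probabilities
```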