Human activity recognition (HAR), the automatic identification of physical activities, has been studied extensively in recent years, with work spanning the identification, monitoring, and classification of human activities. Approaches are primarily either vision-based or sensor-based. Vision-based methods are generally effective in laboratory settings but may perform poorly in real-world scenarios because of clutter, fluctuating light levels, and contrast. Sensor-based HAR systems require continuous monitoring and analysis of physiological signals obtained from heterogeneous sensors attached to an individual’s body. Most previous HAR research relies heavily on feature engineering and pre-processing, which demand considerable domain knowledge and involve application-specific modelling and time-consuming methods. In this work, a multi-head convolutional neural network-based HAR framework is proposed that performs automatic feature extraction and classification as a single end-to-end approach. Experiments use raw wearable sensor data with few pre-processing steps and no handcrafted feature extraction. The framework achieves accuracies of 99.23% and 93.55% on the WISDM and UCI-HAR datasets, respectively, a considerable improvement over other similar approaches. The model is also tested on locally collected data from a chest-mounted belt with fabric sensors, on which it achieves an accuracy of 87.14%.
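To make the idea of multi-head feature extraction concrete, the following is a minimal sketch in plain Python. It is not the architecture of this work, which the abstract does not specify. It assumes the common reading of "multi-head": the same raw sensor window is passed through parallel convolutional branches with different kernel sizes, and their outputs are concatenated. The kernel sizes, filter counts, random (untrained) weights, per-channel filtering, and all function names are illustrative assumptions; training and the classification layer are omitted.

```python
import random


def conv1d_relu(signal, kernel):
    """Valid 1D cross-correlation of one channel with one kernel, then ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        s = sum(signal[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, s))
    return out


def head_features(window, kernels):
    """One head: filter every channel with every kernel, then global max pool.

    Simplification (assumption): filters are applied per channel rather than
    summed across channels as in a standard convolutional layer.
    """
    feats = []
    for channel in window:
        for kernel in kernels:
            feats.append(max(conv1d_relu(channel, kernel)))
    return feats


def multi_head_features(window, heads):
    """Run all heads on the same window and concatenate their features."""
    feats = []
    for kernels in heads:
        feats.extend(head_features(window, kernels))
    return feats


def make_heads(kernel_sizes=(3, 5, 11), filters_per_head=2, seed=0):
    """Build one head per kernel size with random, untrained weights.

    The kernel sizes and filter count are hypothetical choices.
    """
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(size)]
         for _ in range(filters_per_head)]
        for size in kernel_sizes
    ]


if __name__ == "__main__":
    # Synthetic stand-in for one window of tri-axial accelerometer data:
    # 3 channels, 32 samples each (not real WISDM or UCI-HAR data).
    window = [[float((i * (c + 1)) % 7) for i in range(32)] for c in range(3)]
    feats = multi_head_features(window, make_heads())
    print(len(feats))  # 3 heads x 2 filters x 3 channels = 18
```

In a trained model the concatenated vector would feed a dense classification layer, and the kernel weights would be learned from labelled windows rather than drawn at random. Using heads with different kernel sizes lets each branch respond to motion patterns at a different time scale.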