This paper traces the shift from classical machine learning to deep learning in pedestrian activity recognition and describes the field's fundamental methodology and recent developments. The workflow has four main steps: dataset collection, pre-processing, model design and training, and evaluation. Data from public or proprietary datasets are first collected, cleaned, and processed to extract pedestrian feature vectors. These vectors are used to train machine learning or deep learning models, which are then evaluated and fine-tuned for practical applications such as behaviour analysis and surveillance. Conventional machine learning techniques such as Random Forests (RF) and Support Vector Machines (SVM) have been applied to action recognition. SVMs find the optimal separating hyperplane for classification, although they can be computationally expensive. A hybrid approach combining SVMs with decision trees has improved classification rates for a variety of human behaviours. RFs can handle large datasets, as shown in a study that uses smartphone accelerometers to accurately identify everyday activities. Deep learning models such as VA-fusion, AGC-LSTM, and LC-POSEGAIT automatically learn complex feature representations and provide improved performance. These models use CNN, RNN, and LSTM architectures to capture subtle differences in pedestrian behaviour. They still face challenges in interpretability, generalization to new datasets, and computational demands. Future work could use transfer learning to improve performance across varied conditions, combine deep learning with expert systems for better interpretability, and use distributed computing for efficient processing.
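As a rough illustration of the four-step workflow (dataset collection, pre-processing, training, evaluation), the following minimal sketch may help. It is not the method of any study summarized here. It assumes synthetic accelerometer-magnitude windows in place of a real dataset, two hand-picked features (mean and standard deviation), and a simple nearest-centroid classifier standing in for an SVM or Random Forest. All function names and parameter values are hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def make_window(activity, rng, n=50):
    """Step 1 (assumed): one synthetic accelerometer-magnitude window.

    Walking is modelled as a strong oscillation around gravity and
    standing as a weak one. These values are illustrative only.
    """
    amplitude = 1.0 if activity == "walking" else 0.1
    return [
        9.8 + amplitude * math.sin(2 * math.pi * i / 10) + rng.gauss(0, 0.05)
        for i in range(n)
    ]


def extract_features(window):
    """Step 2: reduce a raw window to a small feature vector."""
    return (mean(window), pstdev(window))


def fit_centroids(features, labels):
    """Step 3: 'train' by averaging the feature vectors of each class."""
    groups = {}
    for feature, label in zip(features, labels):
        groups.setdefault(label, []).append(feature)
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in groups.items()
    }


def predict(centroids, feature):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], feature))


def run_pipeline(seed=0, n_samples=80, n_train=60):
    """Run all four steps and return accuracy on the held-out windows."""
    rng = random.Random(seed)
    labels = ["walking", "standing"] * (n_samples // 2)
    windows = [make_window(label, rng) for label in labels]
    features = [extract_features(window) for window in windows]

    centroids = fit_centroids(features[:n_train], labels[:n_train])

    # Step 4: evaluate on windows that were not used for training.
    test_pairs = list(zip(features[n_train:], labels[n_train:]))
    correct = sum(predict(centroids, f) == y for f, y in test_pairs)
    return correct / len(test_pairs)


if __name__ == "__main__":
    print(f"held-out accuracy: {run_pipeline():.2f}")
```

In practice the nearest-centroid step would be replaced by one of the classifiers discussed above, and the synthetic windows by a public or proprietary dataset. The separation between training and held-out data is the part of the sketch that carries over unchanged.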