Abstract: Video surveillance has become a ubiquitous tool in modern security infrastructure for ensuring public safety and safeguarding critical assets. This paper examines several deep learning architectures for detecting suspicious activities in surveillance video. The investigation covers Convolutional Long Short-Term Memory (ConvLSTM), a Convolutional Neural Network (CNN) combined with Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), and various combinations thereof. Each model is trained and tested on a curated dataset that covers a range of normal activities and suspicious activities such as shooting and fighting. The study aims to identify the most effective model for improving the accuracy of suspicious activity detection in complex and dynamic environments. Accuracy, precision, recall, and F1 score are assessed to compare the models' performance.