Increasing numbers of public and private locations now have surveillance cameras installed to make those areas more secure. Even though many organizations still hire someone to monitor the cameras, the person hired is more likely to miss some unexpected events in the video feeds because of human error. Several researchers have worked on surveillance data and have presented a number of approaches for automatically detecting aberrant events. To keep track of all the video data that accumulate, a supervisor is often required. To analyze the video data automatically, we recommend using neural networks to identify the crimes happening in the real world. Through our approach, it will be easier for police agencies to discover and assess criminal activity more quickly using our method, which will reduce the burden on their staff. In this paper, we aim to provide anomaly detection using surveillance videos as input specifically for the crimes of arson, burglary, stealing, and vandalism. It will provide an efficient and adaptable crime-detection system if integrated across the smart city infrastructure. In our project, we trained multiple accurate deep learning models for object detection and crime classification for arson, burglary and vandalism. For arson, the videos were trained using YOLOv5. Similarly for burglary and vandalism, we trained using YOLOv7 and YOLOv6, respectively. When the models were compared, YOLOv7 performed better with the highest mAP of 87. In this, we could not compare the model’s performance based on crime type because all the datasets for each crime type varied. So, for arson YOLOv5 performed well with 80% mAP and for vandalism, YOLOv6 performed well with 86% mAP. This paper designed an automatic identification of crime types based on camera or surveillance video in the absence of a monitoring person, and alerts registered users about crimes such as arson, burglary, and vandalism through an SMS service. To detect the object of the crime in the video, we trained five different machine learning models: Improved YOLOv5 for arson, Faster RCNN and YOLOv7 for burglary, and SSD MobileNet and YOLOv6 for vandalism. Other than improved models, we innovated by building ensemble models of all three crime types. The main aim of the project is to provide security to the society without human involvement and make affordable surveillance cameras to detect and classify crimes. In addition, we implemented the Web system design using the built package in Python, which is Gradio. This helps the registered user of the Twilio communication tool to receive alert messages when any suspicious activity happens around their communities.