The You Only Look Once (YOLO) object detection network has been widely adopted across industries, owing to its fast inference and robust detection capabilities. It has proven invaluable in automating production processes such as material processing, machining, and quality inspection. However, as market competition intensifies, demand for higher detection speed and accuracy continues to grow, and current FPGA accelerators based on 8-bit quantization struggle to meet these increasingly stringent requirements. In response, we present a novel 4-bit quantization-based neural network accelerator for the YOLOv5 model, designed to enhance real-time processing capability while maintaining high detection accuracy. To achieve effective model compression, we introduce an optimized quantization scheme that reduces the bit-width of the entire YOLO network, including the first layer, to 4 bits, with only a 1.5% degradation in mean Average Precision (mAP). For the hardware implementation, we propose a unified Digital Signal Processor (DSP) packing scheme, coupled with a novel parity adder tree architecture that accommodates the proposed quantization strategies. This approach reduces on-chip DSP utilization by 50%, significantly improving performance and resource efficiency. Experimental results show that the industrial object detection system based on the proposed FPGA accelerator achieves a throughput of 808.6 GOPS and an efficiency of 0.49 GOPS/DSP for YOLOv5s on the ZCU102 board, which is 29% higher than a commercial FPGA accelerator design (Xilinx's Vitis AI).
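
To illustrate why 4-bit operands make DSP packing attractive, the following minimal sketch shows the general idea of computing two low-precision products with one wide multiplication. It is a generic illustration, not the unified packing scheme or the parity adder tree proposed here; the function names, the 8-bit field width `SHIFT`, and the operand ranges are assumptions chosen for the example.

```python
SHIFT = 8  # field width in bits; an illustrative choice, not taken from the paper


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


def packed_multiply(w_hi, w_lo, a):
    """Return (w_hi * a, w_lo * a) using a single wide multiplication.

    w_hi, w_lo: signed 4-bit weights in [-8, 7] (assumed format).
    a: 4-bit activation, signed or unsigned, in [-8, 15] (assumed format).
    """
    assert -8 <= w_hi <= 7 and -8 <= w_lo <= 7
    assert -8 <= a <= 15
    # Place the two weights in separate bit fields of one wide operand.
    packed = (w_hi << SHIFT) + w_lo
    # One multiplication yields w_hi * a * 2**SHIFT + w_lo * a.
    product = packed * a
    # The low field holds w_lo * a, which fits in 8 signed bits here.
    lo = to_signed(product, SHIFT)
    # Subtracting the low product removes its borrow from the high field.
    hi = (product - lo) >> SHIFT
    return hi, lo


print(packed_multiply(3, -5, 7))  # (21, -35)
```

Because each 4-bit product fits in 8 signed bits, the two results occupy disjoint fields of the wide product and can be separated with a sign correction. With 8-bit operands the same multiplier width leaves less room for such sharing, which is one intuition for the DSP savings that lower bit-widths allow.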