Defect detection in mobile phone cameras is a critical step in the manufacturing process, yet it remains challenging: complex backgrounds and low-contrast defects, such as minor scratches and subtle dust particles, are easily missed or misclassified. To address these issues, we propose a Bilateral Feature Fusion Network (BFFN). The network incorporates a bilateral feature fusion module that enriches feature representation by fusing feature maps from multiple scales, capturing both fine- and coarse-grained details in the images. In addition, a self-attention mechanism captures more comprehensive contextual information, further improving feature discriminability. We rigorously evaluate the BFFN on a dataset of 12,018 mobile camera images. It surpasses existing state-of-the-art methods such as U-Net and Deeplab V3+, particularly in reducing false positives caused by complex backgrounds and false negatives caused by subtle defects. The network achieves an F1-score of 97.59% (precision 96.93%, recall 98.26%), which is 1.16 percentage points higher than Deeplab V3+ and 0.99 points higher than U-Net. It also reaches a detection speed of 63.8 frames per second (FPS), notably faster than Deeplab V3+ at 57.1 FPS and U-Net at 50.3 FPS. This computational efficiency makes the network well suited to real-time defect detection in mobile camera manufacturing.
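To make the two core ideas concrete, the following is a minimal, dependency-free sketch (not the authors' implementation) of (a) fusing a coarse-scale feature map into a fine-scale one after upsampling, and (b) a toy single-head self-attention pass over a sequence of feature vectors. Feature maps are plain nested lists, upsampling is nearest-neighbor, fusion is element-wise addition, and the attention uses identity Q/K/V projections; all of these are illustrative simplifications, not details taken from the paper.

```python
import math

def upsample_nearest(fmap, factor):
    """Nearest-neighbor upsampling of a 2-D feature map (list of lists)."""
    out = []
    for row in fmap:
        stretched = [v for v in row for _ in range(factor)]
        out.extend([stretched[:] for _ in range(factor)])
    return out

def fuse_bilateral(fine, coarse, factor):
    """Fuse a fine-scale map with an upsampled coarse-scale map.

    Element-wise addition stands in for the paper's fusion module:
    the output keeps the fine map's resolution while mixing in
    coarse-scale context.
    """
    up = upsample_nearest(coarse, factor)
    return [[f + c for f, c in zip(fr, cr)] for fr, cr in zip(fine, up)]

def self_attention(x):
    """Toy single-head scaled dot-product self-attention.

    Q = K = V = x (identity projections, an illustrative simplification):
    each output vector is a softmax-weighted mixture of all inputs,
    so every position sees global context.
    """
    d = len(x[0])
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
               for k in x] for q in x]
    weights = []
    for row in scores:
        m = max(row)                       # subtract max for stability
        e = [math.exp(s - m) for s in row]
        z = sum(e)
        weights.append([v / z for v in e])
    return [[sum(w * xv[j] for w, xv in zip(wrow, x)) for j in range(d)]
            for wrow in weights]
```

For example, fusing a 2x2 fine map with a 1x1 coarse map simply broadcasts the coarse value across the fine grid before adding; in a real network the fusion would operate on learned multi-channel features and the attention on projected queries, keys, and values.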