Simultaneous Localization and Mapping (SLAM) technologies are indispensable for indoor service robots, enabling them to navigate through and interact with their environments. However, visual SLAM systems often face significant challenges in real-world scenarios, including dynamic obstacles, variable lighting, feature scarcity, and perceptual aliasing. By merging the precise environmental mapping capabilities of visual SLAM with the ubiquity and stability of WiFi signals, our method effectively addresses these limitations. Notably, the fusion leverages existing WiFi infrastructure, providing a cost-effective improvement in spatial awareness without the extensive offline database required by conventional WiFi RSSI-based localization. Comparative evaluations show that our graph optimization-based approach not only surpasses the original ORB-SLAM3 but also significantly outperforms an Extended Kalman Filter (EKF) fusion in accuracy, particularly in poorly lit, feature-poor, and heavily occluded environments. This is evidenced by a lower localization Root Mean Square Error (RMSE): 3.09 m for our method versus 4.02 m for the EKF. This gain in precision underscores the potential of the integrated system to advance indoor navigation, making it a crucial development in the field of robotics and automated systems.
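As a rough illustration of the graph optimization idea described above, the minimal sketch below fuses visual-odometry constraints between consecutive keyframes with coarse WiFi-derived position fixes treated as unary factors in a small 2D pose graph, then reports an RMSE against a hypothetical ground truth. All measurements, noise parameters, and the use of scipy.optimize.least_squares are illustrative assumptions for exposition, not the paper's actual implementation or data.

```python
# Minimal sketch of graph-based fusion of visual-SLAM odometry with
# WiFi-derived position estimates. All names, noise values, and the toy
# measurements below are illustrative assumptions, not the paper's data.
import numpy as np
from scipy.optimize import least_squares

# State: 2D positions of N keyframes, flattened as [x0, y0, x1, y1, ...].
N = 5

# Relative translations between consecutive keyframes from visual SLAM
# (e.g., ORB-SLAM3 odometry), and coarse absolute positions from WiFi RSSI.
odom = np.array([[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [1.1, -0.1]])   # assumed
wifi = {0: [0.0, 0.0], 2: [2.1, 0.2], 4: [4.0, 0.1]}                 # assumed

SIGMA_ODOM = 0.05   # assumed std. dev. of visual odometry (metres)
SIGMA_WIFI = 1.00   # assumed std. dev. of WiFi position fixes (metres)

def residuals(x):
    p = x.reshape(N, 2)
    res = []
    # Binary factors: consecutive keyframes should match the odometry deltas.
    for i, d in enumerate(odom):
        res.append((p[i + 1] - p[i] - d) / SIGMA_ODOM)
    # Unary factors: keyframes with a WiFi fix are pulled toward it, weakly
    # (large sigma), which anchors the trajectory and limits drift.
    for i, z in wifi.items():
        res.append((p[i] - np.asarray(z)) / SIGMA_WIFI)
    return np.concatenate(res)

# Initialize by dead-reckoning the odometry, then jointly optimize the graph.
x0 = np.vstack([[0.0, 0.0], np.cumsum(odom, axis=0)]).ravel()
sol = least_squares(residuals, x0)
poses = sol.x.reshape(N, 2)

# Localization RMSE against a (hypothetical) ground-truth trajectory.
gt = np.array([[0, 0], [1, 0], [2, 0], [3, 0], [4, 0]], dtype=float)
rmse = np.sqrt(np.mean(np.sum((poses - gt) ** 2, axis=1)))
print(poses.round(2), f"RMSE = {rmse:.2f} m")
```

In this sketch the WiFi factors play the role of absolute, low-precision anchors while the visual factors provide accurate relative motion; solving both jointly in one least-squares graph is what distinguishes this style of fusion from a filter that processes measurements sequentially.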