Wireless communication systems based on radio frequency (RF) are now widely deployed, for example, in mobile communication systems, satellite systems, and Internet of Things (IoT) systems. Because they are easy to install, wireless systems have advantages over wired communication systems. However, using high frequencies to transfer data wirelessly can pose significant risks to human health. Several researchers have therefore studied the use of visible light instead of RF waveforms in communication systems. Relevant approaches include visible light communication, light fidelity, free-space optical communication, and optical camera communication. Artificial intelligence is also shaping the future of industry and society: it is used to solve complex problems and create intelligent solutions, and it is replacing human intelligence as the driving force behind emerging technologies such as big data, smart factories, and the IoT. In this paper, we propose the architecture of a MIMO C-OOK (Multiple-Input Multiple-Output Camera On–Off Keying) scheme, which uses a convolutional neural network for light-emitting diode detection and a deep learning neural network for threshold prediction, with long-distance communication and mobility support in mind. The proposed method aims to improve on the traditional camera on–off keying scheme by increasing the data rate and communication distance while lowering the bit error rate. By controlling the exposure time and focal length and by employing a forward error correction code, the proposed technique can achieve a communication distance of up to 22 m with a low error rate under mobility (2 m/s, i.e., walking velocity).