The smart factory is a central concept in manufacturing in the context of the fourth industrial revolution. Realizing a smart factory requires turning every device into a smart device connected to a centralized system so that information can be exchanged in real time. Sound is an efficient means of making devices smart because it can carry the status information of several devices simultaneously and can be recorded easily from outside a device using only a microphone. In this study, a multi-device operation monitoring system based on sound analysis was developed. Microphone arrays were installed outside the devices and recorded the sounds of several devices simultaneously. By analyzing the recorded sound with log-mel spectrograms and a Convolutional Neural Network (CNN), the system detected the operational status of three devices with an accuracy of 71–92 %. To improve performance, a virtual data set was created by combining the operating sounds of individual devices at different intensities. With this virtual data set, the accuracy increased to 87–99 %, and the amount of sound data required was reduced. The developed system was successfully applied, and its performance verified, in monitoring experiments in two different environments: a workshop in which a hand-operated device was used and a factory with a computer numerical control machine.
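The virtual data set idea can be illustrated with a minimal sketch. It assumes that each device's operating sound is available as an equal-length mono waveform and that simultaneous sounds combine additively. The function names, the gain range, and the on/off probability are hypothetical choices for illustration and are not taken from the study.

```python
import numpy as np


def make_virtual_sample(device_clips, rng, gain_range=(0.3, 1.0), p_on=0.5):
    """Mix individual device recordings into one virtual multi-device clip.

    device_clips: list of equal-length 1-D float arrays, one per device.
    Returns (mixture, labels); labels[i] is 1 if device i is in the mix.
    All parameter defaults are illustrative assumptions.
    """
    n_devices = len(device_clips)
    # Randomly decide which devices are "operating" in this sample.
    labels = (rng.random(n_devices) < p_on).astype(np.int64)
    # Draw a random intensity (gain) for each device.
    gains = rng.uniform(gain_range[0], gain_range[1], size=n_devices)

    mixture = np.zeros(len(device_clips[0]), dtype=np.float64)
    for clip, is_on, gain in zip(device_clips, labels, gains):
        if is_on:
            mixture += gain * np.asarray(clip, dtype=np.float64)
    return mixture, labels


def make_virtual_dataset(device_clips, n_samples, seed=0):
    """Build arrays X (n_samples, clip_length) and Y (n_samples, n_devices)."""
    rng = np.random.default_rng(seed)
    pairs = [make_virtual_sample(device_clips, rng) for _ in range(n_samples)]
    X = np.stack([mixture for mixture, _ in pairs])
    Y = np.stack([labels for _, labels in pairs])
    return X, Y


if __name__ == "__main__":
    # Synthetic sine tones stand in for real device recordings.
    t = np.linspace(0.0, 1.0, 16000, endpoint=False)
    clips = [np.sin(2 * np.pi * f * t) for f in (220.0, 440.0, 880.0)]
    X, Y = make_virtual_dataset(clips, n_samples=4)
    print(X.shape, Y.shape)
```

Each row of `Y` is a multi-label target with one on/off flag per device, which matches the task of detecting the status of several devices from a single recording. In a pipeline like the one described, each mixture would then be converted to a log-mel spectrogram before being passed to the CNN.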