In the manufacturing industry, one of the main goals is to increase automation and, ultimately, to avoid failures under all circumstances. The plugging and locking of connectors is a class of tasks that is still hard to automate with sufficiently high process stability. Because plugging positions vary and external disturbances occur, e.g. occlusion by cables, assessing the quality of plugging processes has proven to be a challenging task for image-based systems. For this reason, the proposed approach analyzes the inherent acoustic properties of connector locking in combination with different neural network architectures in order to correctly identify connector locking signals and to distinguish them from other machining events occurring in assembly plants. For this task, highly sensitive optical microphones were used for data acquisition. The experiments were carried out under laboratory conditions as well as in the more complex setting of a real manufacturing environment. In this context, multimodal neural network architectures achieved the highest classification performance, with accuracy close to 90%.