In product assembly, parts must be installed in a prescribed sequence. If an incorrectly assembled part is not detected promptly, both product quality and assembly efficiency suffer. To detect newly assembled parts from different viewing angles during the assembly process, this study applies scene change detection to mechanical assembly monitoring for the first time and proposes a monitoring method based on multiview change detection in depth images. The method comprises a semantic fusion network and an attention-based feature extraction network (AFE Net) for multiview change detection. To adapt the method to change detection in mechanical assemblies, this study makes the following contributions. 1) Because the assembly parts are single-colored, symmetric, and textureless, depth images serve as the input to the semantic fusion network. A semantic segmentation network segments the parts in the depth images to generate color semantic segmentation images, which are then merged with the depth images and fed into the multiview change detection network. 2) In the multiview change detection network, an attention-based feature extraction module is designed to focus quickly on the information most relevant to the current task within a large volume of input, improving both processing efficiency and accuracy. In addition, the feature maps are brought to a uniform size by up-sampling so that high-dimensional semantic information and low-dimensional spatial information can be merged, which effectively increases the amount of feature information. 3) To verify the effectiveness of multiview change detection for assemblies, this study establishes a multiview assembly process dataset and evaluates the proposed method on it.
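The two merging steps in contributions 1) and 2) can be illustrated with a minimal sketch. The abstract does not state how the images or feature maps are merged, so channel-wise stacking and nearest-neighbour up-sampling are assumptions here, and the segmentation output is simplified to a single-channel label map rather than a color image. The function names `stack_channels` and `upsample_nearest` are hypothetical and do not come from the paper.

```python
def stack_channels(*maps):
    """Merge same-sized 2D maps into one H x W x C nested list.

    Assumption: "merging" is modelled as channel-wise stacking.
    """
    height, width = len(maps[0]), len(maps[0][0])
    for m in maps:
        if len(m) != height or any(len(row) != width for row in m):
            raise ValueError("all maps must share the same height and width")
    return [[[m[i][j] for m in maps] for j in range(width)]
            for i in range(height)]


def upsample_nearest(fmap, factor):
    """Enlarge a 2D map by an integer factor using nearest-neighbour copies."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Contribution 1): fuse a depth image with its segmentation labels.
depth = [[0.5, 0.6], [0.7, 0.8]]   # hypothetical 2x2 depth values
labels = [[1, 1], [0, 2]]          # hypothetical part labels per pixel
fused = stack_channels(depth, labels)

# Contribution 2): bring a coarse feature map to the size of a fine one,
# then merge the two.
fine = [[1, 2], [3, 4]]            # stands in for a spatial feature map
coarse = [[9]]                     # stands in for a semantic feature map
merged = stack_channels(fine, upsample_nearest(coarse, 2))
```

Each pixel of `fused` holds a depth value and a part label, and each pixel of `merged` holds one value from each feature map. A real network would operate on tensors and would likely use a learned or bilinear up-sampling layer.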
The results show that, on the multiview assembly process dataset, the proposed method achieves the best F1 score among the compared change detection networks, 96.9%, while requiring less time and producing clearer boundaries. Overall, the proposed network structure is suitable for change detection in mechanical assemblies and can also be applied to multiview monitoring in product assembly.
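For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it for binary change masks, assuming a pixel-level evaluation in which 1 marks a changed pixel. The abstract does not specify the evaluation protocol, so this setup, the function name `f1_score`, and the toy masks are illustrative only.

```python
def f1_score(pred, truth):
    """Pixel-level F1 for two binary masks given as lists of rows."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1        # changed pixel correctly detected
            elif p and not t:
                fp += 1        # false alarm
            elif t and not p:
                fn += 1        # missed change
    if tp == 0:
        return 0.0             # avoids division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy masks (not data from the study): 2 hits, 1 false alarm, 1 miss.
pred = [[0, 1, 1], [0, 1, 0]]
truth = [[0, 1, 0], [0, 1, 1]]
score = f1_score(pred, truth)  # precision = recall = 2/3, so F1 = 2/3
```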