Owing to sensor failure, noise interference, and other factors, data collected by structural health monitoring (SHM) systems often exhibit a variety of abnormal patterns, which introduce considerable uncertainty into structural safety assessment. This paper proposes an automatic data anomaly diagnosis method for SHM based on a multimodal deep neural network. To improve detection accuracy, the network fuses two-dimensional and one-dimensional features of the sensor data. It consists of two convolutional neural network (CNN) channels: a 2D-CNN channel that extracts time–frequency features of the sensor data, and a 1D-CNN channel that extracts features from the raw one-dimensional sensor data. After each channel applies its own convolution and pooling operations, the two sets of extracted features are flattened into one-dimensional vectors and joined at a concatenation layer. The concatenated vector is then fed into fully connected layers for the final SHM data anomaly classification. To evaluate the reliability of the proposed method, one month of monitoring data from a long-span cable-stayed bridge was used for training, validation, and testing. Six types of training conditions (missing, minor, outlier, over-range oscillation, trend, and drift) were studied and analyzed to address the issue of imbalanced training data. The best-performing model achieves an accuracy of 95.10%, demonstrating the effectiveness of the proposed method. The method shows promise as a reliable AI-assisted digital tool for safety assessment in structural health monitoring systems.
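
The two-channel data flow described above can be illustrated with a minimal NumPy sketch of a single forward pass. It is not the network used in this work: the layer sizes, the short-time Fourier transform used to form the time–frequency image, the random untrained weights, and the choice of seven output classes (assumed here to be the six abnormal patterns plus a normal class) are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def spectrogram(x, win=64, hop=32):
    """Magnitude STFT, an assumed stand-in for the time-frequency input."""
    window = np.hanning(win)
    frames = np.stack([x[i:i + win] * window
                       for i in range(0, len(x) - win + 1, hop)])
    return np.abs(np.fft.rfft(frames, axis=1)).T  # (freq_bins, time_frames)

def conv1d(x, kernels):
    """Valid 1D cross-correlation + ReLU. x: (n,), kernels: (c, k)."""
    windows = np.lib.stride_tricks.sliding_window_view(x, kernels.shape[1])
    return np.maximum(windows @ kernels.T, 0.0).T  # (c, n - k + 1)

def conv2d(img, kernels):
    """Valid 2D cross-correlation + ReLU. img: (h, w), kernels: (c, kh, kw)."""
    windows = np.lib.stride_tricks.sliding_window_view(img, kernels.shape[1:])
    return np.maximum(np.einsum("hwij,cij->chw", windows, kernels), 0.0)

def max_pool(feat, size, axes):
    """Non-overlapping max pooling along each axis in `axes`."""
    for ax in axes:
        n = feat.shape[ax] // size * size          # drop any ragged tail
        feat = np.take(feat, np.arange(n), axis=ax)
        shape = feat.shape[:ax] + (n // size, size) + feat.shape[ax + 1:]
        feat = feat.reshape(shape).max(axis=ax + 1)
    return feat

def fused_features(x, k1d, k2d):
    """Run both channels, flatten each output, and concatenate."""
    f1 = max_pool(conv1d(x, k1d), 4, axes=(1,))                  # 1D channel
    f2 = max_pool(conv2d(spectrogram(x), k2d), 2, axes=(1, 2))   # 2D channel
    return np.concatenate([f1.ravel(), f2.ravel()])

def classify(x, k1d, k2d, W, b):
    """One fully connected layer followed by a softmax over the classes."""
    logits = W @ fused_features(x, k1d, k2d) + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Hypothetical sizes: a 1024-sample signal, 4 kernels per channel, 7 classes.
n_classes = 7
x = rng.standard_normal(1024)
k1d = 0.1 * rng.standard_normal((4, 9))
k2d = 0.1 * rng.standard_normal((4, 3, 3))
feat = fused_features(x, k1d, k2d)
W = 0.01 * rng.standard_normal((n_classes, feat.size))
b = np.zeros(n_classes)
probs = classify(x, k1d, k2d, W, b)
print(feat.size, probs.shape)
```

Because the weights are random, the resulting class probabilities carry no meaning; the sketch only shows how the flattened outputs of the 1D and 2D channels are joined into a single vector before classification. A trained model would stack several convolution and pooling stages per channel and learn all kernels and fully connected weights from labeled monitoring data.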