Spectral properties are the most prevalent continuous representation for characterizing transport phenomena and excitation responses, yet predicting them accurately remains a challenge because existing machine learning (ML) models cannot perceive series correlations. Herein, an ML model named cluster-based series graph networks (CSGN) is developed, based on the dynamical theory of crystal lattices, to predict the phonon density of states (PDOS) spectrum of crystalline materials. A multiple atomic cluster representation is constructed to capture the diverse vibration modes, while a mixture Gaussian process and a dynamic time warping mechanism are combined to map the clusters onto the PDOS spectrum. Accurate predictions are achieved even for complicated spectra with multiple or overlapping peaks. The high performance of the CSGN model can be attributed to pertinent feature extraction and appropriate similarity evaluation, which enable natural perception of structure-property relations and intrinsic series correlations, as confirmed by the predictive results. The transferable and interpretable CSGN model advances ML predictions of spectral properties and reveals the potential of designing ML methods based on physical mechanisms.
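
To illustrate why a warping-based similarity measure suits spectra, the following is a minimal sketch of classic dynamic time warping (DTW) in Python with NumPy. It is not the CSGN implementation: the function names `dtw_distance` and `gaussian_peak`, the frequency grid, and the peak parameters are assumptions chosen only for illustration. The sketch compares two synthetic single-peak spectra that differ by a small frequency shift.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic DTW distance between two 1-D series (illustrative helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def gaussian_peak(x, center, width=0.5):
    """Synthetic single-peak spectrum (assumed shape, not real PDOS data)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


# Hypothetical frequency grid and two peaks offset by a small shift.
freq = np.linspace(0.0, 10.0, 101)
reference = gaussian_peak(freq, 4.0)
shifted = gaussian_peak(freq, 4.5)

pointwise = np.abs(reference - shifted).sum()  # point-by-point L1 difference
warped = dtw_distance(reference, shifted)      # difference after optimal warping

print(f"pointwise L1: {pointwise:.4f}")
print(f"DTW distance: {warped:.4e}")
```

Under these assumptions, the point-by-point difference penalizes the shifted peak heavily even though its shape is unchanged, whereas the DTW distance stays small because the alignment absorbs the shift. This is the general sense in which a warping-based measure can respect correlations along the series rather than treating each frequency bin independently.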