In the clinical diagnosis of mood disorders, a large proportion of patients with bipolar disorder (BD) are misdiagnosed as having unipolar depression (UD). With traditional diagnostic tools, long-term tracking of patients is generally required to reach an appropriate diagnosis of BD. A one-time diagnosis system that facilitates the diagnostic procedure is thus highly desirable. Accordingly, in this study, the facial expressions of patients with BD, patients with UD, and healthy controls, elicited by emotional video clips, were used for mood disorder classification; the classification was performed by exploring the temporal fluctuation characteristics of the three groups. First, macroscopic facial expressions characterized by action units (AUs) were used to describe the temporal transformation of facial muscles. Modulation spectrum analysis was applied to extract short-term intensity variations in the AUs. An interval-based multilayer perceptron (MLP) neural network was then used to classify mood disorders on the basis of the detected AU intensities. Moreover, motion vectors (MVs) were employed to describe subtle changes in facial expressions from the microscopic view. Eight basic orientations of MV change were considered to represent microfluctuations. Wavelet decomposition was then applied to extract entropy and energy features in different frequency bands. Finally, a long short-term memory (LSTM) model was used to model long-term variations for mood disorder classification. Decision-level fusion was then applied to combine the results obtained from the macroscopic and microscopic facial expressions. To evaluate the described methods, the facial expressions elicited from 36 subjects (12 from each of the BD, UD, and control groups) were used in 12-fold cross-validation experiments.
The approaches based on macroscopic and microscopic expressions achieved classification accuracies of 63.9 and 66.7 percent, respectively, and the accuracy of the fusion approach reached 72.2 percent. The results indicate that the macroscopic and microscopic view descriptors are complementary and useful for mood disorder classification.
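To illustrate how decision-level fusion can let two complementary classifiers correct each other, the following is a minimal sketch. It assumes a weighted-sum rule over per-class scores, which is one common choice and is not confirmed here as the rule used in this study. The function name `fuse_decisions`, the weight `w_macro`, and the example score vectors are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: weighted-sum decision-level fusion of two
# classifiers' class scores. The fusion rule, names, and numbers below
# are assumptions for illustration, not the study's actual method or data.

CLASSES = ("BD", "UD", "control")


def fuse_decisions(p_macro, p_micro, w_macro=0.5):
    """Combine two per-class score vectors and return (fused_scores, label).

    p_macro: scores from the macroscopic (AU-based) classifier.
    p_micro: scores from the microscopic (MV-based) classifier.
    w_macro: weight given to the macroscopic scores, between 0 and 1.
    """
    if len(p_macro) != len(CLASSES) or len(p_micro) != len(CLASSES):
        raise ValueError("each score vector needs one entry per class")
    if not 0.0 <= w_macro <= 1.0:
        raise ValueError("w_macro must lie in [0, 1]")

    # Weighted sum of the two score vectors, class by class.
    fused = [w_macro * a + (1.0 - w_macro) * b
             for a, b in zip(p_macro, p_micro)]

    # The fused decision is the class with the highest combined score.
    best = max(range(len(fused)), key=fused.__getitem__)
    return fused, CLASSES[best]


if __name__ == "__main__":
    # Hypothetical scores: the macroscopic model slightly favors BD,
    # while the microscopic model clearly favors UD.
    macro_scores = [0.45, 0.40, 0.15]
    micro_scores = [0.30, 0.60, 0.10]
    fused, label = fuse_decisions(macro_scores, micro_scores)
    print(label)  # UD
```

In this made-up example the macroscopic scores alone would select BD, whereas the fused scores select UD, which shows how combining the two views can change a decision. In practice, the weight would be tuned on held-out data rather than fixed at 0.5.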