Previous studies have not reached a consensus on the number of phonemic vowels in Mandarin, with proposed systems ranging from four to six vowels. Since traditional top-down approaches have not led to a unified conclusion, this study applies cluster analysis to ultrasound lingual images of Taiwan Mandarin vowels, in the hope that this bottom-up approach will offer new insights into the debate. A total of 2700 tokens of consonant-vowel (CV) and isolated vowel (V) syllables were recorded from a female native speaker of Taiwan Mandarin. Preprocessed ultrasound images taken at vowel midpoints were analyzed using K-means clustering. To determine the optimal number of clusters, the elbow method and KneeLocator from the kneed package were used. Although KneeLocator suggested four clusters as optimal, the elbow plot did not show a clear elbow point. This mirrors the inconclusive results of previous studies on the Mandarin vowel system. Like the elbow plot, visualizations of the clusters using Principal Component Analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP) did not show distinct clustering patterns, indicating potential limitations of K-means clustering for our data. Future research may benefit from exploring alternative clustering algorithms and preprocessing techniques, such as noise reduction in ultrasound images. This preliminary study shows the potential of using cluster analysis on ultrasound images for phonetic research and highlights the challenges in defining the Mandarin vowel inventory.
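
To make the elbow procedure concrete, the following is a minimal, standard-library-only sketch and not the study's actual pipeline. It assumes each image has already been flattened to a feature vector. Here small synthetic 2-D points stand in for the ultrasound frames, and a simplified chord-distance heuristic stands in for KneeLocator. All function names (`kmeans_inertia`, `find_elbow`, `make_blobs`) and the farthest-point initialization are illustrative assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def init_farthest(points, k):
    """Deterministic init (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans_inertia(points, k, n_iter=100):
    """Run Lloyd's K-means and return the within-cluster sum of squares."""
    centroids = init_farthest(points, k)
    for _ in range(n_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to its cluster mean
        # (an empty cluster keeps its previous centroid).
        updated = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def find_elbow(inertias):
    """Simplified elbow heuristic: rescale the curve to the unit square and
    return the k (1-based) lying farthest below the descending diagonal."""
    n = len(inertias)
    lo, hi = min(inertias), max(inertias)
    best_k, best_gap = 1, float("-inf")
    for i, w in enumerate(inertias):
        x = i / (n - 1)
        y = (w - lo) / (hi - lo)
        gap = (1 - x) - y
        if gap > best_gap:
            best_k, best_gap = i + 1, gap
    return best_k


def make_blobs(seed=0, per_blob=40):
    """Synthetic stand-in data: three well-separated Gaussian blobs."""
    rng = random.Random(seed)
    centres = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [
        (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in centres
        for _ in range(per_blob)
    ]


points = make_blobs()
inertias = [kmeans_inertia(points, k) for k in range(1, 9)]
print("inertia by k:", [round(w, 1) for w in inertias])
print("elbow at k =", find_elbow(inertias))
```

On well-separated toy data like this, the inertia curve bends sharply at the true number of clusters. When clusters overlap heavily, as the abstract reports for the ultrasound images, the curve flattens gradually. A knee detector will still return some value of k, but that value is much less trustworthy.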