Accurate segmentation of the thyroid gland in ultrasound images is an essential first step in distinguishing benign from malignant nodules, and thus in facilitating early diagnosis. Most existing deep learning-based methods for thyroid nodule segmentation learn from only one or two views, which limits their ability to segment nodules of different scales in complex ultrasound scanning environments. To address this limitation, this study proposes a multi-view learning model, abbreviated as MLMSeg. First, a deep convolutional neural network is introduced to encode the features of the local view. Second, a multi-channel transformer module is designed to capture the long-range dependencies between different nodules, which form the global view. Third, features at different layers are semantically related, which constitutes the structural view; for example, low-level and high-level features have hidden relationships in the feature space. To model these relationships, a cross-layer graph convolutional module is proposed that adaptively learns the correlations between high-level and low-level features by constructing graphs across different layers. Finally, for view fusion, a channel-aware graph attention block is devised to fuse the features from the three views for accurate segmentation of thyroid nodules. To demonstrate the effectiveness of the proposed method, extensive comparative experiments were conducted against 14 baseline methods. MLMSeg achieved higher Dice coefficients (92.10% and 83.84%) and Intersection over Union scores (86.60% and 73.52%) than the baseline methods on two different thyroid datasets. This segmentation capability can assist in localizing thyroid nodules and in measuring their transverse and longitudinal diameters more precisely, which is of significant clinical relevance for the diagnosis of thyroid nodules.
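For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how the Dice coefficient and Intersection over Union are commonly computed for a single pair of binary masks. The function name `dice_and_iou`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration. The abstract does not state how the scores were aggregated over each dataset, so this should not be read as the authors' evaluation code.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the MLMSeg paper.
    """
    # Flatten both masks so they can be compared pixel by pixel.
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    # Count foreground pixels in the overlap and in each mask.
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(1 for a in p if a)
    target_area = sum(1 for b in t if b)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks agree perfectly.
    if union == 0:
        return 1.0, 1.0

    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


if __name__ == "__main__":
    predicted = [[0, 1, 1],
                 [0, 1, 0]]
    reference = [[0, 1, 0],
                 [0, 1, 1]]
    dice, iou = dice_and_iou(predicted, reference)
    print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

In this toy example the masks share two foreground pixels and each contains three, so Dice is 2·2/(3+3) ≈ 0.667 and IoU is 2/4 = 0.5. For a single mask pair the two metrics are related by Dice = 2·IoU/(1 + IoU), so Dice is never lower than IoU.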