Coastal tidal wetlands are crucial for environmental and economic health, but they face threats from a range of environmental changes. Detecting changes in tidal wetlands is essential for promoting sustainable development in coastal areas. Despite extensive research on tidal wetland change, several challenges persist. First, the high spectral similarity among tidal wetland types limits the effectiveness of commonly used indices. Second, many current methods rely on hand-crafted features, which are time-consuming to design and subject to personal bias. Third, few studies effectively integrate multi-temporal and semantic information, so environmental noise and tidal variations lead to misinterpretation. To address these issues, we propose a novel temporal-spectral-semantic-aware convolutional transformer network (TSSA-CTNet) for multi-class tidal wetland change detection. First, to address the spectral similarity among different tidal wetlands, we propose a sparse second-order feature construction (SSFC) module that builds more separable spectral representations. Second, to obtain more separable features automatically, we construct temporal-spatial feature extractor (TSFE) and siamese semantic sharing (SiamSS) blocks that extract temporal-spatial-semantic features. Third, to fully utilize semantic information, we propose a center comparative label smoothing (CCLS) module that generates semantic-aware labels. Experiments in the Greater Bay Area (GBA), using Landsat data from 2000 to 2019, demonstrate that TSSA-CTNet achieves an overall accuracy of 89.20%, outperforming other methods by 3.75%–16.39%. The study reveals significant area losses in tidal flats, mangroves, and tidal marshes, which decreased by 3148 hectares, 35 hectares, and 240 hectares, respectively. Among the cities in the GBA, Zhuhai shows the largest area loss, with a total of 1626 hectares. 
TSSA-CTNet proves effective for multi-class tidal wetland change detection, offering valuable insights for tidal wetland protection.
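
To make the idea of second-order spectral features concrete, the following is a minimal sketch of a generic construction: each pixel's band values are augmented with all pairwise band products. It is not the SSFC module itself, whose sparsity mechanism and exact formulation are not described here. The function name `second_order_features` and the three-band pixel values are hypothetical, chosen only for illustration.

```python
from itertools import combinations_with_replacement


def second_order_features(bands):
    """Return the band values followed by all products b_i * b_j with i <= j.

    Illustrative dense construction only; it does not select a sparse
    subset of products, which is an assumption-free simplification.
    """
    index_pairs = combinations_with_replacement(range(len(bands)), 2)
    return list(bands) + [bands[i] * bands[j] for i, j in index_pairs]


# Hypothetical reflectance values for one pixel in three spectral bands.
pixel = [0.1, 0.2, 0.4]
features = second_order_features(pixel)

# Three first-order terms plus six second-order terms.
print(len(features))
```

For B bands this yields B + B(B + 1)/2 values per pixel, so the feature count grows quadratically with the number of bands, which is one plausible motivation for keeping only a sparse subset of the products.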