The Lhasa dialect, the most widely spoken Tibetan dialect in Tibet, is also renowned for its rich historical archive of written scripts. Developing speech recognition methods specific to the Lhasa dialect is therefore paramount in safeguarding Tibet’s distinct linguistic heritage. Previous studies in Tibetan speech recognition have been largely confined to academic research using nonpublic datasets, focusing on elements such as the selection of phone-level acoustic modeling units and the integration of tonal information. However, because publicly available data remain scarce, these studies have not significantly benefited the community. To mitigate the challenge posed by limited data resources, we present the NICT-Tib1 (phase 1) dataset, a new open-source dataset collected from native speakers and dedicated to investigating speech recognition for the Lhasa dialect. Speech recognition with deep neural networks (DNNs) has evolved through three generations: from hybrid systems built on hidden Markov models (HMMs) (e.g., DNN-HMM), to end-to-end systems (e.g., Transformer), and finally to self-supervised learning (SSL) systems (e.g., Wav2Vec2.0). Each generation has improved accuracy and simplified the training process, and the latest generation achieves state-of-the-art performance, especially for low-resource languages. In addition to the early DNN-HMM-based system built with Kaldi, we update the benchmark systems on this dataset with a Conformer model trained with ESPnet and a Wav2Vec2.0 model trained with Hugging Face. Experimental results show that these state-of-the-art models outperform the models in previous work.