Plain Language Summary

The objective of this study was to develop a tool for determining a person's smoking status from their voice. Using data from Colive Voice, an international digital health study led by the Luxembourg Institute of Health, we used statistical methods to investigate the impact of smoking on voice characteristics. We then employed artificial intelligence algorithms to identify gender- and language-specific digital vocal biomarkers, which are combinations of voice features associated, in the context of this project, with smoking status. After analyzing data from 1,332 participants, we found differences in voice features between smokers and never-smokers, particularly among women. For example, pitch and certain frequencies were lower in female smokers than in female never-smokers. We were able to differentiate between smokers and never-smokers with 71% accuracy for women and 65% for men. This research demonstrates that smoking affects the voice and that smoking status can be predicted from audio recorded in real-life settings. This tool could be valuable in clinical and research settings for studying smoking habits in a rapid and scalable manner.