This paper aims to estimate indoor occupancy from the real-time signals of existing sensors and, as a practical application, to build a dynamic control schedule for energy saving based on the estimated occupancy. Several issues need to be addressed. First, it is impractical to train the model with rich labels because labeling is expensive. Second, manual annotation of the continuous occupancy rate is complex. Third, the mapping between sensor data and occupancy changes in the long run. To overcome these challenges, we propose a new algorithm named Self-Supervised Indoor Occupancy Estimation (SSIOE). Specifically, our training scheme aims to (I) generate a set of pseudo labels in a simple way to mark the time periods believed to be in either a high or a low occupancy state and (II) use these sparse labels to train a network that infers the continuous occupancy ratio. By reformulating the problem as a Wasserstein-distance-like estimation, SSIOE can rely only on weak, sparse labels of either “high-occupancy” or “low-occupancy” and still learn to estimate the continuous occupancy ratio. Furthermore, to deal with the scarce annotation problem, we propose a novel physical constraint loss that models the physical prior. Finally, (III) to strengthen adaptability, we integrate Model-Agnostic Meta-Learning (MAML) into training so that the model can be updated dynamically. Experimental results show that SSIOE provides reliable occupancy estimation and flexibly adapts to various control modes without retraining the model.

Note to Practitioners: This paper is motivated by the real-time occupancy estimation problem, which is crucial for building management systems that schedule lighting and air conditioning.
A dynamic schedule adapted to the occupancy would help balance user comfort and energy saving; however, installing occupancy sensors raises additional costs. Therefore, this paper proposes SSIOE, a label-free machine learning algorithm that learns to estimate the continuous occupancy ratio from the sensor status in real time. Prior works based on supervised machine learning suffer from a heavy manual annotation workload, while unsupervised approaches fail to provide continuous occupancy estimates. This paper addresses these challenges by proposing a novel self-supervised training scheme. Preliminary experiments suggest that this approach is feasible in real-world buildings. However, it should be noted that the occupancy estimated by SSIOE is a ratio (0.0 to 1.0) that is relative to the maximum and minimum occupancy of the training data (or fine-tuning data). For example, if the maximum number of occupants in the training data is 200 and the minimum is 50, then an estimated occupancy value of 1.0 corresponds to about 200 occupants, and a value of 0.0 corresponds to about 50 occupants. When the maximum and minimum occupancy of the space change, the model needs to be updated to capture the new mapping from the estimated occupancy ratio to the number of occupants. In future research, we plan to further improve and simplify the adaptation ability of SSIOE by continuously and automatically updating the system with the latest data collected as the system runs.
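The relation between the estimated ratio and a head count described above can be illustrated with a minimal Python sketch. The text only fixes the two endpoints (0.0 maps to the minimum and 1.0 to the maximum occupancy seen in the training data); the linear interpolation between them and the helper name `ratio_to_occupants` are assumptions made here for illustration and are not part of SSIOE itself.

```python
def ratio_to_occupants(ratio: float, min_occupants: float, max_occupants: float) -> float:
    """Map an occupancy ratio in [0, 1] to an approximate head count.

    Assumption: values between the two endpoints are interpolated linearly.
    Only the endpoints (0.0 -> min, 1.0 -> max) are stated in the text.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0.0, 1.0]")
    if max_occupants < min_occupants:
        raise ValueError("max_occupants must be >= min_occupants")
    return min_occupants + ratio * (max_occupants - min_occupants)


# Example from the text: training data with 50 to 200 occupants.
print(ratio_to_occupants(0.0, 50, 200))  # 50.0
print(ratio_to_occupants(1.0, 50, 200))  # 200.0
print(ratio_to_occupants(0.5, 50, 200))  # 125.0 (under the linear assumption)
```

If the occupancy range of the space shifts, the `min_occupants` and `max_occupants` arguments no longer match what the model learned, which is why the model must be updated before the ratio can be interpreted again.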