In this paper we propose a reliable acoustic indoor positioning system that is fully compatible with a conventional smartphone. The proposed system uses the smartphone's audio I/O and processing capabilities to perform acoustic ranging in the audio band with non-invasive audio signals. It was developed with applications that require high accuracy in mind, such as augmented or virtual reality, gaming, or audio guiding. The system works in a distributed operation mode, i.e., each smartphone obtains its own position using only acoustic signals. To support the positioning system, we propose a Wireless Sensor Network (WSN) of synchronized acoustic beacons. To keep the infrastructure synchronized, we developed an Automatic Time Synchronization and Syntonization (ATSS) protocol that ensures a synchronization offset error below 5 μs. Using an improved Time Difference of Arrival (TDoA) estimation approach that exploits the periodicity of the beacon signals, together with Non-Line-of-Sight (NLoS) mitigation, we obtained very stable and accurate position estimates, with an absolute error of less than 10 cm in 95% of the cases and a mean absolute standard deviation of 2.2 cm for a position refresh period of 350 ms.
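
To make the TDoA principle concrete, the generic textbook relation between a measured arrival-time difference and the unknown smartphone position is sketched below. This is not necessarily the improved estimator used in this work. The notation is introduced for this sketch only: $\mathbf{p}$ is the smartphone position, $\mathbf{b}_i$ the position of beacon $i$, $t_i$ the arrival time of the signal from beacon $i$ (assuming simultaneous emission, or emission offsets already compensated), and $c$ the speed of sound, assumed here to be about 343 m/s in air at roughly 20 °C.

```latex
% Generic TDoA relation for a beacon pair (i, j); notation is illustrative.
\begin{equation}
  c\,(t_i - t_j) \;=\; \lVert \mathbf{p} - \mathbf{b}_i \rVert - \lVert \mathbf{p} - \mathbf{b}_j \rVert
\end{equation}

% Range-difference error caused by a 5 us synchronization offset,
% under the assumed speed of sound c = 343 m/s.
\begin{equation}
  c \cdot \Delta t_{\mathrm{sync}} \;\approx\; 343\ \mathrm{m/s} \times 5 \times 10^{-6}\ \mathrm{s}
  \;\approx\; 1.7 \times 10^{-3}\ \mathrm{m} \;=\; 1.7\ \mathrm{mm}
\end{equation}
```

Under that assumed speed of sound, a 5 μs synchronization offset corresponds to a range-difference error of about 1.7 mm, which suggests that beacon synchronization contributes only a small part of the centimeter-level position error reported.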