Radio-based localization is an active research topic with a range of applications. In this paper, we focus on localization of a radio receiver equipped with an inertial measurement unit. The localization is performed while simultaneously constructing a map of the small-scale fading pattern in the local radio environment. The map in our case is a ray-trace-based multipath channel model. The solution is enabled by fusing channel estimation data with inertial sensor data, and it does not assume prior knowledge of, e.g., transmitter locations. The sensor data are fused in a recursive state-space model that combines a kinematic motion model with the ray-based radio channel model, and the state vector is estimated using a particle filter. The choice of the particle filter is justified by the multimodal posterior distributions that follow from the nonlinearities of the problem. This work assumes a single receiver antenna, but the approach can be transferred to multiple-antenna systems. We study the performance of the approach under realistic assumptions based on the performance of today’s low-cost inertial sensors and radio systems, including accelerometer and gyroscope noise as well as radio receiver frequency error and noise. Simulations show a significant improvement in long-term positioning performance compared with dead reckoning. The work concludes with experiments that serve as a proof of concept for the proposed technique, using no equipment beyond what can be found in a modern cellular phone.
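As a rough illustration of the recursive estimation described above, the following sketch shows one generic bootstrap particle filter cycle (predict, weigh, resample) for a one-dimensional position. It is a toy under stated assumptions, not the model used in this paper: the state is a scalar position, the measurement function `h` is assumed known, the noise is Gaussian, and all function and parameter names (`predict`, `weigh`, `resample`, `sigma_v`, `sigma_z`) are hypothetical. Systematic resampling is used here as one common choice.

```python
import math
import random


def predict(particles, velocity, dt, sigma_v, rng):
    """Propagate each particle with a dead-reckoning step plus velocity noise."""
    return [x + (velocity + rng.gauss(0.0, sigma_v)) * dt for x in particles]


def weigh(particles, measurement, h, sigma_z):
    """Return normalized Gaussian likelihood weights for a scalar measurement.

    h is an assumed, known measurement function; a periodic h (for example a
    cosine of position) makes the weights, and hence the posterior, multimodal.
    """
    w = [math.exp(-0.5 * ((measurement - h(x)) / sigma_z) ** 2) for x in particles]
    total = sum(w)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        return [1.0 / len(particles)] * len(particles)
    return [wi / total for wi in w]


def resample(particles, weights, rng):
    """Systematic resampling: draw len(particles) samples in proportion to weight."""
    n = len(particles)
    step = 1.0 / n
    u = rng.random() * step
    out, cum, i = [], weights[0], 0
    for _ in range(n):
        while u > cum and i < n - 1:
            i += 1
            cum += weights[i]
        out.append(particles[i])
        u += step
    return out


def estimate(particles):
    """Point estimate: mean of the (equally weighted) resampled particles."""
    return sum(particles) / len(particles)
```

A filter loop would call `predict` with each inertial sample, then `weigh` and `resample` whenever a radio measurement arrives. Dead reckoning alone corresponds to running `predict` without the other two steps, so its error grows without correction.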