Tracking the 3D shape of a deforming object using only monocular 2D vision is a challenging problem, because one must (i) infer the 3D shape from a 2D image, which is severely underconstrained, and (ii) run the whole solution pipeline in real time. The pipeline typically requires feature detection and matching, mismatch filtering, 3D shape inference, and feature tracking. We propose ROBUSfT, a conventional pipeline based on a template containing the object's rest shape, texture map, and deformation law. ROBUSfT is ready to use, wide-baseline, capable of handling large deformations, fast (up to 30 fps), free of training, and robust against partial occlusions and discontinuities. It outperforms state-of-the-art methods on challenging video datasets. ROBUSfT is implemented as a publicly available C++ library. We provide the code, a tutorial on how to use it, and a supplementary video of our experiments at https://github.com/mrshetab/ROBUSfT.