Constructing a visual appearance model is essential for visual tracking. However, relying on the visual model alone is insufficient when the object's appearance changes, and may even degrade the results. Several visual tracking algorithms instead emphasize motional tracking, which estimates the motion state of the object center between consecutive frames, but they suffer from error that accumulates at runtime. As neither visual nor motional trackers perform well on their own, several groups have recently proposed algorithms that perform visual and motional tracking simultaneously. However, tracking problems are often NP-hard, and these algorithms struggle to find good solutions because they are driven top-down with little flexibility and often drift. This paper proposes a spiral visual and motional tracking (SVMT) algorithm which, unlike existing algorithms, builds a strong tracker by cyclically combining weak trackers from both the visual and motional layers. Within this spiral-like framework, an iterative model searches for the optimum until convergence and can potentially reach it. Three learned procedures, namely visual classification, motional estimation, and risk analysis, are integrated into the generalized framework and are adjusted according to their performance. The experimental results demonstrate that SVMT performs well in terms of accuracy and robustness.
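To make the idea of cycling between a visual layer and a motional layer concrete, the following toy sketch may help. It is not the SVMT algorithm itself: the function names (`visual_step`, `spiral_track`), the appearance `score` function, the 3x3 neighbourhood search, the `max_jump` risk threshold, and the radius-halving stopping rule are all assumptions introduced purely for illustration, and nothing here is learned.

```python
import math

def visual_step(estimate, score, radius):
    """Hypothetical weak visual tracker: pick the best-scoring point
    in a 3x3 neighbourhood (spacing `radius`) around the estimate."""
    x, y = estimate
    candidates = [(x + dx * radius, y + dy * radius)
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    return max(candidates, key=score)

def spiral_track(prev, velocity, score, radius=1.0, max_jump=5.0,
                 tol=1e-3, max_iters=100):
    """Toy visual/motional cycle for one frame (illustrative assumption only).

    prev     -- object center in the previous frame
    velocity -- motion estimate carried over from the previous frame
    score    -- appearance score of a candidate center (higher is better)
    """
    # Motional layer: predict the new center from the previous motion state.
    predicted = (prev[0] + velocity[0], prev[1] + velocity[1])
    estimate = predicted
    for _ in range(max_iters):
        # Visual layer: refine the estimate using the appearance score.
        candidate = visual_step(estimate, score, radius)
        # Risk analysis: reject a refinement that strays implausibly far
        # from the motion prediction (a crude guard against drift).
        if math.dist(candidate, predicted) > max_jump:
            candidate = estimate
        if candidate == estimate:
            # No accepted improvement at this scale: search more finely,
            # and stop once the search radius is below the tolerance.
            radius /= 2
            if radius < tol:
                break
        else:
            estimate = candidate
            # Motional layer: update the motion state from the refined center.
            velocity = (estimate[0] - prev[0], estimate[1] - prev[1])
    return estimate, velocity

# Usage: an appearance score that peaks at a made-up true center (12, 7).
peak_score = lambda p: -((p[0] - 12) ** 2 + (p[1] - 7) ** 2)
center, new_velocity = spiral_track((10.0, 5.0), (1.0, 1.0), peak_score)
```

In this sketch the motion prediction both seeds the visual search and bounds it, while the visually refined center feeds back into the motion state for the next frame. That mutual correction is the part intended to mirror the cyclic combination of weak trackers described above.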