This paper presents a production-oriented 4D facial reconstruction pipeline that produces high-fidelity facial mesh sequences with consistent topology while preserving the wireframe structure specified by artists. We design and build a compact, efficient, and fast optical capture system based on synchronized camera arrays for high-precision dynamic 3D facial imaging. Unlike prevailing methods, which concentrate on single-frame reconstruction and often rely on labor-intensive manual annotation, our framework exploits appearance consistency as a constraint to establish feature correspondence automatically and to maintain temporal coherence of the mesh. As a result, our approach eliminates mesh drift and jitter and allows dynamic facial expression capture to be fully parallelized. The pipeline decouples the non-linear deformation of facial expressions from the rigid motion of the skull by means of a stable external device. Through progressive retopology, it uses artist-guided templates as priors, so that wireframe structures are preserved across the resulting sequence. Progressive retopology is achieved by imposing constraints at different levels of detail: 3D landmarks, scan surface shape, and appearance texture. Our results show facial mesh sequences with production-quality topology that faithfully reproduce character expressions from photographs while delivering stable, artist-friendly facial motion.
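As a rough illustration of how constraints on landmarks, scan surface, and appearance might be combined progressively, the sketch below evaluates a weighted sum of squared residuals under a coarse-to-fine schedule. It is a minimal sketch under stated assumptions, not the pipeline's actual formulation: the names `Stage`, `SCHEDULE`, and `stage_energy`, the three-stage ordering, and all weight values are hypothetical, and the residuals are assumed to be precomputed scalars.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Stage:
    """Weights for one fitting stage (all values are illustrative assumptions)."""
    name: str
    w_landmark: float
    w_surface: float
    w_appearance: float


# Hypothetical coarse-to-fine schedule: sparse 3D landmarks first, then the
# scan surface, then appearance (texture) consistency.
SCHEDULE = (
    Stage("landmarks", 1.0, 0.0, 0.0),
    Stage("surface", 0.5, 1.0, 0.0),
    Stage("appearance", 0.1, 1.0, 1.0),
)


def sum_sq(residuals: Sequence[float]) -> float:
    """Sum of squared residuals for one constraint type."""
    return sum(r * r for r in residuals)


def stage_energy(
    stage: Stage,
    landmark_res: Sequence[float],
    surface_res: Sequence[float],
    appearance_res: Sequence[float],
) -> float:
    """Weighted sum of the three squared-residual terms for the given stage."""
    return (
        stage.w_landmark * sum_sq(landmark_res)
        + stage.w_surface * sum_sq(surface_res)
        + stage.w_appearance * sum_sq(appearance_res)
    )
```

In such a scheme, a solver would minimize `stage_energy` for each entry of `SCHEDULE` in turn, starting each stage from the previous stage's result, so that finer constraints refine a template that the coarser ones have already brought close to the scan.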