To improve the stability of vehicle pose estimation in driving videos, a method for optimizing vehicle structural parameters based on the reliability of edge point sequences is proposed. First, a multi-task and iterative convolutional neural network (MI-CNN) is constructed to perform four tasks simultaneously: vehicle detection, yaw angle prediction, edge point localization, and visibility assessment. Second, a local tracking search area is established by modeling the constraints on vehicle displacement between successive frames, and vehicles are matched across frames by maximizing point similarity. Finally, structural parameters are solved robustly from reliable edge point sequences, where the reliability of each sequence is assessed from the Gaussian mixture distribution of vehicle distance change ratios derived from two measurement models. Experimental results show that the MI-CNN achieves a mean Average Precision (mAP) of 89.9%. The proportion of estimated parameters with errors below the 0.8 m threshold consistently exceeds 85%, and when the error threshold is set to 0.12 m, the proportion of estimated parameters meeting this criterion consistently exceeds 90%. These results indicate that the proposed method offers strong practical applicability and estimation precision.
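
The reliability assessment described above can be illustrated with a minimal sketch. Assuming each tracked vehicle yields a per-frame distance-change ratio from each of the two measurement models, the ratios can be modeled with a Gaussian mixture and low-likelihood frames discarded. The use of scikit-learn's GaussianMixture, the two-component mixture, the quantile cutoff, and all function and variable names here are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: flag unreliable frames in an edge point sequence using a
# Gaussian mixture over distance-change ratios from two measurement models.
# All thresholds and names are illustrative assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture


def sequence_reliability(ratios_model_a, ratios_model_b,
                         n_components=2, quantile=0.1):
    """Return a boolean mask marking frames whose ratios look reliable.

    ratios_model_a, ratios_model_b: 1-D arrays of per-frame vehicle
    distance-change ratios, one per measurement model (assumed inputs).
    """
    # Treat the pair of ratios for each frame as a 2-D feature vector.
    ratios = np.column_stack([ratios_model_a, ratios_model_b])

    # Fit a Gaussian mixture to the pooled ratios.
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full",
                          random_state=0).fit(ratios)

    # Frames whose log-likelihood falls below the chosen quantile are
    # flagged as unreliable; the rest form the reliable edge point sequence.
    log_lik = gmm.score_samples(ratios)
    threshold = np.quantile(log_lik, quantile)
    return log_lik >= threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic ratios near 1.0 (consistent measurements) with a few outliers.
    a = np.concatenate([rng.normal(1.0, 0.02, 95), rng.normal(1.4, 0.05, 5)])
    b = np.concatenate([rng.normal(1.0, 0.03, 95), rng.normal(0.6, 0.05, 5)])
    reliable = sequence_reliability(a, b)
    print(f"{reliable.sum()} of {reliable.size} frames kept as reliable")
```

In this sketch the reliable frames would then be the ones passed to the structural-parameter solver; the specific mixture order and cutoff would in practice be chosen to match the paper's distance-change-ratio statistics.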