Vision-based methods for obtaining pedestrian trajectories have received increasing attention in the literature over recent decades. In the study of human-induced vibrations of footbridges, practical challenges arise when collecting the trajectories of high-density crowds during measurement campaigns. A low-cost and robust methodology that tackles these issues is presented and applied to a case study of a real-life footbridge occupied by many pedestrians. A static camera setup consisting of low-cost action cameras mounted at limited height is used. In addition, a drone camera is employed to collect a limited amount of footage. Pedestrians are equipped with colored hats and detected using a straightforward color-segmentation approach. The measurements are subject to both systematic and random errors. The influence of the former is investigated theoretically and found to be of limited importance. The effect of the latter is minimized using a Kalman filter and smoother. A thorough accuracy assessment reveals that the remaining uncertainty is on the order of 2 to 3 cm, which is amply sufficient for the envisaged purpose. Although the methodology is applied to a specific case study in the present work, the conclusions regarding the obtained accuracy and applicability are generic, since the measurement setup can be extended to a footbridge of virtually any length. Moreover, the empirical results of the presented case study should find use in the calibration of pedestrian dynamics models that describe the flow of high-density crowds on footbridges and in the further development of load models for crowd-induced loading.
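To illustrate how a Kalman filter and smoother can reduce random measurement error in a detected trajectory, the following minimal Python sketch filters noisy 2D positions and then applies a Rauch-Tung-Striebel (RTS) backward pass. It is not the implementation used in this work. The constant-velocity motion model, the function name `kalman_smooth`, the noise parameters `q` and `r`, the 25 Hz frame rate, and the synthetic walking data are all assumptions made for the example.

```python
import numpy as np

def kalman_smooth(z, dt, q=0.5, r=0.05):
    """Kalman-filter and RTS-smooth 2D positions z (N x 2, in metres).

    Assumed model (not from the paper): constant velocity with
    white-noise acceleration of std q [m/s^2]; measurement noise std r [m].
    Returns filtered and smoothed states, each N x 4: [x, y, vx, vy].
    """
    n = len(z)
    F = np.eye(4)                       # state transition
    F[0, 2] = F[1, 3] = dt
    H = np.zeros((2, 4))                # only positions are observed
    H[0, 0] = H[1, 1] = 1.0
    G = np.array([[0.5 * dt**2, 0.0],
                  [0.0, 0.5 * dt**2],
                  [dt, 0.0],
                  [0.0, dt]])
    Q = q**2 * G @ G.T                  # process noise covariance
    R = r**2 * np.eye(2)                # measurement noise covariance

    x_pred = np.zeros((n, 4)); P_pred = np.zeros((n, 4, 4))
    x_filt = np.zeros((n, 4)); P_filt = np.zeros((n, 4, 4))

    # Deliberately vague prior centred on the first detection.
    x = np.array([z[0, 0], z[0, 1], 0.0, 0.0])
    P = np.diag([10.0, 10.0, 4.0, 4.0])

    # Forward pass: predict, then update with each measurement.
    for k in range(n):
        if k > 0:
            x = F @ x
            P = F @ P @ F.T + Q
        x_pred[k], P_pred[k] = x, P
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)  # Kalman gain
        x = x + K @ (z[k] - H @ x)
        P = (np.eye(4) - K @ H) @ P
        x_filt[k], P_filt[k] = x, P

    # Backward pass: RTS smoother refines each state using later frames.
    x_smooth = x_filt.copy()
    for k in range(n - 2, -1, -1):
        C = P_filt[k] @ F.T @ np.linalg.inv(P_pred[k + 1])
        x_smooth[k] = x_filt[k] + C @ (x_smooth[k + 1] - x_pred[k + 1])
    return x_filt, x_smooth

# Synthetic example: one pedestrian walking in a straight line at 1.3 m/s,
# observed at an assumed 25 Hz with 5 cm of random detection noise.
rng = np.random.default_rng(0)
dt = 1.0 / 25.0
t = np.arange(200) * dt
truth = np.column_stack([1.3 * t, np.full_like(t, 1.0)])
noisy = truth + rng.normal(0.0, 0.05, truth.shape)
filtered, smoothed = kalman_smooth(noisy, dt)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

print("raw RMSE [m]:     ", rmse(noisy, truth))
print("smoothed RMSE [m]:", rmse(smoothed[:, :2], truth))
```

The smoother is run offline because the complete trajectory is available after a measurement campaign, so each position estimate can draw on both earlier and later frames. In practice, `q` and `r` would have to be tuned to the actual walking behaviour and detection noise.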