Accelerometers are commonly used to measure vibrations for condition monitoring of mechanical and civil structures; however, their high cost and point-based measurements present practical limitations. With rapid advances in computer vision and deep learning, research into tracking the motion of individual pixels with cameras has increased. The recently developed CoTracker, a transformer-based model, has shown strong performance in motion tracking, yet its performance in measuring structural vibrations has not been fully explored. This paper investigates the efficacy of the CoTracker model in extracting full-field structural vibrations from camera footage. The model is first applied to capture dense point movements in video sequences of a cantilever beam recorded with a high-speed camera. Modal analysis using delay-embedding dynamic mode decomposition (DMD) is then conducted to extract modal parameters, including natural frequencies, damping ratios, and mode shapes. The results, benchmarked against those from a reference accelerometer and a finite element method (FEM) analysis, demonstrate CoTracker's high potential for general applicability in structural vibration measurements.
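To make the modal-analysis step concrete, the following is a minimal sketch of a generic delay-embedding (Hankel) DMD applied to tracked displacement signals. It is not the paper's implementation: the function name `hankel_dmd`, its parameters (`n_delays`, `rank`), and the synthetic two-point damped signal are assumptions chosen for illustration, and the input is assumed to be an array of displacements per tracked point over time.

```python
import numpy as np

def hankel_dmd(x, dt, n_delays, rank):
    """Delay-embedding DMD on displacements x of shape (n_points, n_samples).

    Returns natural frequencies (Hz), damping ratios, and mode shapes
    (one column per DMD mode, one row per tracked point).
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    n_pts, n_t = x.shape
    n_cols = n_t - n_delays + 1
    # Stack time-shifted copies of the signals to form the Hankel matrix.
    H = np.vstack([x[:, i:i + n_cols] for i in range(n_delays)])
    X, Y = H[:, :-1], H[:, 1:]
    # Rank-truncated SVD of the first snapshot matrix.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, V = U[:, :rank], s[:rank], Vh[:rank].conj().T
    # Reduced linear operator that advances the state by one time step.
    YVS = (Y @ V) / s            # divides column j by s[j]
    A_tilde = U.conj().T @ YVS
    mu, W = np.linalg.eig(A_tilde)
    # Map discrete-time eigenvalues to continuous time.
    lam = np.log(mu.astype(complex)) / dt
    freqs = np.abs(lam) / (2.0 * np.pi)     # undamped natural frequency
    zeta = -lam.real / np.abs(lam)          # damping ratio
    # Exact DMD modes; the first block of rows holds the spatial shape.
    modes = (YVS @ W)[:n_pts]
    return freqs, zeta, modes

# Synthetic check (assumed values): one 10 Hz mode with 2% damping,
# observed at two points with amplitudes 1.0 and 0.5.
dt = 1e-3
t = np.arange(2000) * dt
wn = 2.0 * np.pi * 10.0
wd = wn * np.sqrt(1.0 - 0.02 ** 2)
sig = np.exp(-0.02 * wn * t) * np.cos(wd * t)
freqs, zeta, modes = hankel_dmd(np.vstack([sig, 0.5 * sig]), dt,
                                n_delays=50, rank=2)
print(freqs, zeta, np.abs(modes[1] / modes[0]))
```

In practice the truncation rank and the number of delays would need to be tuned to the number of modes present and the noise level of the tracked trajectories; the values used here only suit the noise-free synthetic signal.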