Sensorless freehand 3D ultrasound (US) reconstruction based on deep networks shows promising advantages, such as a large field of view, relatively high resolution, low cost, and ease of use. However, existing methods mainly consider vanilla scan strategies with limited inter-frame variations, so their performance degrades on the complex but routine scan sequences encountered in clinics. In this context, we propose a novel online learning framework for freehand 3D US reconstruction under complex scan strategies with diverse scanning velocities and poses. First, we devise a motion-weighted loss for the training phase that regularizes the scan variation frame by frame and better mitigates the negative effects of uneven inter-frame velocity. Second, we drive online learning with local-to-global pseudo-supervision, which mines both the frame-level contextual consistency and the path-level similarity constraint to improve the inter-frame transformation estimation. We explore a global adversarial shape before transferring the latent anatomical prior as supervision. Third, we build a feasible differentiable reconstruction approximation to enable end-to-end optimization of our online learning. Experimental results show that our freehand 3D US reconstruction framework outperforms current methods on two large simulated datasets and one real dataset. In addition, we applied the proposed framework to clinical scan videos to further validate its effectiveness and generalizability.
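To make the idea of a motion-weighted loss concrete, the following is a minimal sketch under stated assumptions, not the formulation used in this work. It assumes each inter-frame transformation is given as a short parameter vector (for example, 3 translations and 3 rotations), that the per-frame error is a mean absolute error, and that each frame's weight is proportional to the magnitude of its ground-truth motion. The function name `motion_weighted_l1` and the weighting rule are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def motion_weighted_l1(
    pred: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
    eps: float = 1e-8,
) -> float:
    """Mean absolute error over frames, weighted by ground-truth motion.

    Assumption (illustrative only): frames with larger ground-truth
    inter-frame motion receive proportionally larger weights.
    """
    if len(pred) != len(target) or len(target) == 0:
        raise ValueError("pred and target must be non-empty and equally long")

    n = len(target)
    # Motion magnitude of each frame: L1 norm of its ground-truth parameters.
    magnitudes = [sum(abs(v) for v in frame) for frame in target]
    total = sum(magnitudes)

    if total < eps:
        # No motion anywhere: fall back to uniform weights.
        weights = [1.0] * n
    else:
        # Normalise so the weights average to 1 across the sequence.
        weights = [n * m / total for m in magnitudes]

    # Unweighted mean absolute error of each frame.
    per_frame = [
        sum(abs(p - t) for p, t in zip(p_frame, t_frame)) / len(t_frame)
        for p_frame, t_frame in zip(pred, target)
    ]
    return sum(w * e for w, e in zip(weights, per_frame)) / n


if __name__ == "__main__":
    # Two frames with two parameters each; the second frame moves more.
    prediction = [[0.0, 0.0], [0.0, 0.0]]
    ground_truth = [[1.0, 1.0], [3.0, 3.0]]
    print(motion_weighted_l1(prediction, ground_truth))
```

In this sketch the weights average to one, so the loss stays on the same scale as an unweighted mean absolute error while shifting emphasis toward frames with larger motion.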