Abstract
PURPOSE: Cleft-lip surgery aims to restore oral function while achieving normal lip aesthetics. Preoperative planning using anthropological landmarks of the lip guides surgeons through the procedure. However, identifying and placing these markings on the fine anatomy of a child's lip can be extremely difficult, and errors can compromise functional and aesthetic outcomes. The purpose of this study is to develop a novel approach that improves the accuracy of markings for cleft-lip surgery. To do so, we developed a machine-learning algorithm that reliably places anthropological landmarks on unilateral cleft-lip pictures in order to guide intraoperative markings.
METHODS: We utilized High-Resolution Net (HRNet), a recent family of deep learning models that has achieved state-of-the-art results in many computer-vision tasks, including facial landmark detection [1]. HRNet follows the current trend in computer vision of stacking multiple convolutional layers [2], but differs in one key area. Whereas previous models generally downsample the dimensionality of the input at each layer, HRNet performs this downsampling in parallel with a series of convolutional layers that preserve dimensionality. This allows for intermediate representations with higher dimensionality while simultaneously extracting lower-dimensional features. To adapt the facial landmark detection HRNet to our task, we employed transfer learning, a machine-learning technique that transfers knowledge gained from a source task to a target task [3]. Transfer learning has been shown to dramatically reduce training time, increase accuracy on the target task, and reduce the number of training examples required for the target task.
RESULTS: For model evaluation, we calculated error using the Normalized Mean Error (NME), a standard evaluation metric in facial landmark detection.
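As a rough illustration of the metric, NME is commonly computed as the mean Euclidean distance between predicted and reference landmarks, divided by a normalizing distance. The sketch below assumes a single image and a caller-supplied normalizing distance. The abstract does not state which normalization was used, and the function name and coordinates here are hypothetical.

```python
import math


def normalized_mean_error(predicted, ground_truth, norm_distance):
    """Mean Euclidean landmark error divided by a reference distance.

    predicted, ground_truth: equally long lists of (x, y) points.
    norm_distance: assumed normalizing length (for face benchmarks this is
    often an inter-ocular distance; the choice here is left to the caller).
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need equally sized, non-empty landmark lists")
    if norm_distance <= 0:
        raise ValueError("norm_distance must be positive")
    # Sum the per-landmark Euclidean distances, then average and normalize.
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / (len(predicted) * norm_distance)


# Hypothetical pixel coordinates for three landmarks on one image.
predicted = [(10.0, 10.0), (20.0, 24.0), (33.0, 30.0)]
ground_truth = [(10.0, 10.0), (20.0, 20.0), (30.0, 26.0)]
nme = normalized_mean_error(predicted, ground_truth, norm_distance=30.0)
```

In this toy case the per-landmark errors are 0, 4, and 5 pixels, so the NME is 9 / (3 × 30) = 0.1. A dataset-level score would average this value over all test images.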
Here, a craniofacial plastic surgeon manually placed Mulliken markings on 50 unilateral cleft-lip images, and we compared these manual markings against those detected by our algorithm. After training on our dataset, we obtained a test NME of 0.1065. In comparison, the state-of-the-art test NME for facial point detection on other datasets ranges from 0.0385 (300W) to 0.0460 (WFLW), but our training dataset is about 1% of the size of these benchmarks. These results illustrate the possibility of leveraging relatively small amounts of data to achieve surprisingly accurate labeling in cleft-lip annotations.
CONCLUSION: In the present study, we developed a deep learning model that accurately places Mulliken unilateral cleft-lip markings onto preoperative photographs. We envision a national and international impact and believe that its usefulness will go beyond teaching residents, as this technology can be used by global cleft outreach foundations as an instructional resource for trainees. In the future, we plan to physically project these markings onto the surface of cleft lips, using technology developed by our team, thereby overcoming discrepancies related to paper-to-3-dimensional marking transfer.
REFERENCES:
1. Cornell University. Deep high-resolution representation learning for visual recognition. Available at https://arxiv.org/abs/1908.07919.
2. Cornell University. Combining data-driven and model-driven methods for robust facial landmark detection. Available at https://arxiv.org/abs/1611.10152.
3. How transferable are features in deep neural networks? Available at https://papers.nips.cc/paper/5347-how-transferable-are-features-in-deep-neural-networks.