Efficiency of Computer-Aided Facial Phenotyping (DeepGestalt) in Individuals With and Without a Genetic Syndrome: Diagnostic Accuracy Study.

Jean Tori Pantel,Angela Teresa Abad-Perez,Stefan Mundlos,Peter Hansen,Nurulhuda Hajjir,Jonas Elsner,Claus-Eric Ott,Malte Spielmann,Denise Horn,Martin Atta Mensah,Magdalena Danyel

doi:10.2196/19263

Jean Tori Pantel, Angela Teresa Abad-Perez + Show 9 more

Open Access

https://doi.org/10.2196/19263

Copy DOI

Abstract

BackgroundCollectively, an estimated 5% of the population have a genetic disease. Many of them feature characteristics that can be detected by facial phenotyping. Face2Gene CLINIC is an online app for facial phenotyping of patients with genetic syndromes. DeepGestalt, the neural network driving Face2Gene, automatically prioritizes syndrome suggestions based on ordinary patient photographs, potentially improving the diagnostic process. Hitherto, studies on DeepGestalt’s quality highlighted its sensitivity in syndromic patients. However, determining the accuracy of a diagnostic methodology also requires testing of negative controls.ObjectiveThe aim of this study was to evaluate DeepGestalt's accuracy with photos of individuals with and without a genetic syndrome. Moreover, we aimed to propose a machine learning–based framework for the automated differentiation of DeepGestalt’s output on such images.MethodsFrontal facial images of individuals with a diagnosis of a genetic syndrome (established clinically or molecularly) from a convenience sample were reanalyzed. Each photo was matched by age, sex, and ethnicity to a picture featuring an individual without a genetic syndrome. Absence of a facial gestalt suggestive of a genetic syndrome was determined by physicians working in medical genetics. Photos were selected from online reports or were taken by us for the purpose of this study. Facial phenotype was analyzed by DeepGestalt version 19.1.7, accessed via Face2Gene CLINIC. Furthermore, we designed linear support vector machines (SVMs) using Python 3.7 to automatically differentiate between the 2 classes of photographs based on DeepGestalt's result lists.ResultsWe included photos of 323 patients diagnosed with 17 different genetic syndromes and matched those with an equal number of facial images without a genetic syndrome, analyzing a total of 646 pictures. We confirm DeepGestalt’s high sensitivity (top 10 sensitivity: 295/323, 91%). DeepGestalt’s syndrome suggestions in individuals without a craniofacially dysmorphic syndrome followed a nonrandom distribution. A total of 17 syndromes appeared in the top 30 suggestions of more than 50% of nondysmorphic images. DeepGestalt’s top scores differed between the syndromic and control images (area under the receiver operating characteristic [AUROC] curve 0.72, 95% CI 0.68-0.76; P<.001). A linear SVM running on DeepGestalt’s result vectors showed stronger differences (AUROC 0.89, 95% CI 0.87-0.92; P<.001).ConclusionsDeepGestalt fairly separates images of individuals with and without a genetic syndrome. This separation can be significantly improved by SVMs running on top of DeepGestalt, thus supporting the diagnostic process of patients with a genetic syndrome. Our findings facilitate the critical interpretation of DeepGestalt’s results and may help enhance it and similar computer-aided facial phenotyping tools.

Highlights

Background individual genetic diseases are rare, they collectively affect an estimated 5% of a population [1]
We designed linear support vector machines (SVMs) using Python 3.7 to automatically differentiate between the 2 classes of photographs based on DeepGestalt's result lists
DeepGestalt’s top scores differed between the syndromic and control images

Summary

Introduction

Background individual genetic diseases are rare, they collectively affect an estimated 5% of a population [1]. Several studies assessed the algorithm's sensitivity, suggesting that it is of a certain quality [11,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38] These tests predominantly analyzed images of patients diagnosed with a genetic disorder known to show characteristic facial features. Determining the accuracy of a diagnostic methodology requires testing of negative controls

Objectives

Methods

Results

Discussion

Conclusion