Abstract

Study question
How far do existing and emerging EU regulations address concerns about the lack of diversity in the genomic databases used to train AI in genomics?

Summary answer
The main regulations in the European Union provide no clear answers for reconciling the need for more diverse genetic data with regulatory standards.

What is known already
AI researchers need high-quality genomic data that reflect the underlying diversity of human populations to develop and validate safe, effective, and equitable tools for genomic medicine. However, most of the genomic data available to develop such tools are of insufficient quality, owing to inappropriate use of population descriptors, and over-represent individuals of European ancestry. We used the example of algorithms that calculate polygenic risk scores (PRS), and concerns about their accuracy for various sub-populations, to illustrate our discussion.

Study design, size, duration
We analysed the relevant EU regulations, including the upcoming AI Act, the Medical Devices Regulation (MDR), the In Vitro Diagnostic Medical Devices Regulation (IVDR) and the General Data Protection Regulation (GDPR), to assess how concerns about the lack of diversity in genetic/genomic databases have been addressed.

Participants/materials, setting, methods
NA.

Main results and the role of chance
The main regulations for clinical AI in the European Union, namely those regulating medical devices, provide no clear answers for reconciling the need for more diverse genetic data with regulatory standards. In addition, some of the data protection requirements arising from the GDPR may further hinder the collection of sensitive attributes, such as race/ancestry, which might be needed to bring diversity to databases.
Several impending regulations addressing AI and algorithms, such as the EU's AI Act, may help in some ways, but they still fail to grapple with the fundamental obstacles to producing the diverse genomic data needed to achieve regulatory objectives. In particular, the proposed requirements on data quality and transparency need to be accompanied by further guidance on how to define representativeness and how to address long-term fairness concerns in the context of medical AI, when the developed tools perform sub-optimally for populations not of European ancestry.

Limitations, reasons for caution
NA.

Wider implications of the findings
The legal analysis in this paper offers insights for broader types of AI tools that may be implemented in the context of Assisted Reproductive Technologies and beyond.

Trial registration number
NA