Abstract

Background and Aims
Genetic insights are becoming increasingly influential in the understanding and treatment of various kidney diseases (KD). Hundreds of genes associated with monogenic kidney disease have been identified, providing valuable insights into diagnosis, management, and monitoring. However, the lack of a unified and standardized database of genes assigned to kidney diseases has led to diagnostic blind spots and comparability issues among current studies of kidney genetics. To address this gap, we created “Kidney-Genetics”, a regularly updated, automated, and publicly accessible database that aims to provide a comprehensive list of all relevant genes associated with kidney disease.

Methods
Kidney disease-associated gene information was compiled from five sources: (1) Genomics England and Australia PanelApp [1], (2) a comprehensive literature review of published gene lists, (3) clinical diagnostic panels for kidney disease, (4) a Human Phenotype Ontology (HPO)-based [2] search in rare disease databases (OMIM, Orphanet), and (5) a PubTator [3] API-based automated literature extraction from PubMed. An evidence-scoring system was developed to distinguish high-evidence genes from low-evidence genes and candidate genes. High-evidence genes were defined as those present in two or more of the five resources. They were manually curated based on predetermined criteria; where a ClinGen [4] curation already existed, its data and scores were used instead. Genes with a score of one or less were accordingly classified as candidate genes. Additionally, genes were grouped into different categories for later genotype-phenotype correlation matching (see Fig. 1).

Results
The Kidney-Genetics database currently contains information on 3025 kidney-associated genes, with detailed annotations on gene expression, kidney phenotype, inheritance mode, disease onset, possible syndromic disease mode, and genetic variation.
The numbers of genes extracted from the five analyzed sources of information are: (1) 534, (2) 822, (3) 956, (4) 789, and (5) 2158 (see Fig. 2). Notably, 598 (19.8%) of the 3025 genes are present in two or more of the analyzed sources and thus meet our evidence criteria, indicating high confidence and potential for diagnostic use. Of these high-evidence genes, 526 (88.0%) are present in at least one of the 10 comprehensive diagnostic laboratory panels, and 56 (9.4%) are present in all 10. To ensure currency, Kidney-Genetics will be updated automatically every month. We will also provide phenotypic and functional clustering results to facilitate gene grouping.

Conclusion
By utilizing Kidney-Genetics, clinicians, geneticists, and researchers can examine genomic data and improve their understanding of the genetic components of diverse KDs. The code and results are freely available on GitHub (kidney-genetics.org) [5]. A standardized pipeline and automated updates keep the database current with kidney research and diagnostics. Screening efforts toward manual curation (such as through the ClinGen initiative) and the assignment of diagnostic genes to kidney disease groups (e.g., syndromic vs. isolated; adult vs. pediatric; cystic, nephrotic, etc.) are currently under development.
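
The evidence-scoring rule described in Methods (one point per source that lists a gene, with two or more points counting as high evidence) can be illustrated with a minimal Python sketch. This is not the Kidney-Genetics pipeline itself: the source keys, the gene identifiers (GENE_A to GENE_D), and the function names are hypothetical placeholders chosen for illustration, and the toy gene sets are made up. Only the threshold of two sources and the counts passed to `percent` are taken from the abstract.

```python
from collections import Counter


def evidence_scores(sources):
    """Return, for each gene, the number of sources that list it."""
    counts = Counter()
    for genes in sources.values():
        # set() guards against a gene being listed twice in one source
        counts.update(set(genes))
    return dict(counts)


def classify(scores, threshold=2):
    """Label genes found in at least `threshold` sources as high-evidence."""
    return {
        gene: "high-evidence" if n >= threshold else "candidate"
        for gene, n in scores.items()
    }


def percent(part, total):
    """Share of `part` in `total`, in percent, rounded to one decimal."""
    return round(100 * part / total, 1)


# Toy input: hypothetical gene identifiers, one set per source (assumption).
sources = {
    "panelapp": {"GENE_A", "GENE_B"},
    "literature": {"GENE_A", "GENE_C"},
    "diagnostic_panels": {"GENE_A", "GENE_B"},
    "hpo": {"GENE_A"},
    "pubtator": {"GENE_A", "GENE_C", "GENE_D"},
}

scores = evidence_scores(sources)
labels = classify(scores)

for gene in sorted(scores):
    print(gene, scores[gene], labels[gene])

# Shares reported in Results, recomputed from the stated counts.
print(percent(598, 3025))  # 19.8
print(percent(526, 598))   # 88.0
print(percent(56, 598))    # 9.4
```

In this toy input, GENE_A appears in all five sources and GENE_D in only one, so the former is labeled high-evidence and the latter a candidate gene.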