Schema Mapping Research Articles (Page 1)

Overview
253 Articles

Published in last 50 years

Related Topics

  • Relational Database Schema
  • Database Schema
  • Target Schema
  • Global Schema
  • XML Database

Articles published on Schema Mapping

246 Search results
Sorted by recency
  • Research Article
  • 10.52783/jisem.v10i58s.12669
Adaptive AI-Driven Enterprise Integration Framework: Intelligent Schema Mapping and Predictive Quality Management Flow
  • Aug 20, 2025
  • Journal of Information Systems Engineering and Management
  • Ashutosh Rana

This article introduces an AI-driven enterprise integration framework that addresses critical challenges in modern integration environments by combining intelligent schema mapping with predictive quality management. The framework represents a significant advancement over traditional integration approaches that suffer from mapping brittleness, high maintenance costs, and reactive quality control. Through a three-layer mapping methodology incorporating syntactic pre-matching, semantic embedding alignment, and ontology-based reasoning, the system achieves superior mapping accuracy while dramatically reducing manual effort. The predictive quality management component utilizes machine learning to forecast potential integration failures before they occur, implementing risk-based transaction handling through a Quality Risk Score calculation that enables preemptive interventions. Comprehensive assessment across a wide range of enterprise environments shows dramatic reductions in development and maintenance effort as well as failure rates. Although challenges remain in knowledge graph bootstrapping, model training, and change management, the framework demonstrates compelling technical and practical potential, pointing toward continued improvement through the unification of multi-agent, self-healing pipelines and federated learning paradigms.
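The paper does not publish an implementation, but the two cheaper layers of such a pipeline can be sketched as follows, with `difflib` standing in for syntactic pre-matching and a character-trigram bag standing in for learned semantic embeddings (all column names and the `syntactic_floor` threshold are illustrative):

```python
import math
from difflib import SequenceMatcher

def trigram_embed(name: str) -> dict:
    # Stand-in for a learned semantic embedding: a sparse bag of
    # character trigrams over the normalized column name.
    s = f"  {name.lower().replace('_', ' ')}  "
    vec: dict = {}
    for i in range(len(s) - 2):
        vec[s[i:i + 3]] = vec.get(s[i:i + 3], 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(v * b.get(g, 0) for g, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def match_columns(source, target, syntactic_floor=0.35):
    """Layer 1: cheap syntactic pre-matching prunes candidate pairs.
    Layer 2: embedding similarity picks the best surviving target."""
    mapping = {}
    for s in source:
        cands = [t for t in target
                 if SequenceMatcher(None, s.lower(), t.lower()).ratio() >= syntactic_floor]
        cands = cands or target  # fall back to all targets if pruning was too strict
        mapping[s] = max(cands, key=lambda t: cosine(trigram_embed(s), trigram_embed(t)))
    return mapping

print(match_columns(["cust_name", "order_dt"], ["customer_name", "order_date", "ship_city"]))
```

The third layer described in the abstract, ontology-based reasoning, would sit on top of this to veto or confirm matches that the embeddings alone cannot disambiguate.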

  • Research Article
  • 10.32996/jcsts.2025.7.7.97
Data Quality and Integration: The AI-Driven Evolution
  • Jul 22, 2025
  • Journal of Computer Science and Technology Studies
  • Soma Sundar Reddy Kancharla

This article examines the transformative impact of artificial intelligence on data quality and integration practices across modern enterprises. Traditional rule-based approaches to data validation and integration are increasingly insufficient for addressing the complexity, volume, and velocity of contemporary data ecosystems. The emergence of AI-driven techniques—including automated anomaly detection, intelligent data profiling, adaptive schema mapping, and natural language processing for metadata management—represents a paradigm shift in how organizations ensure data integrity and seamless information flow. The article demonstrates how machine learning approaches offer superior adaptability and accuracy compared to conventional methods. Industry case studies across healthcare, finance, and manufacturing illustrate the practical benefits of AI-enhanced data management, including reduced integration times, improved quality metrics, and enhanced decision support capabilities. The article identifies key challenges in semantic consistency, scalability across heterogeneous environments, and ethical governance of increasingly autonomous data systems. Looking forward, the potential for self-healing data frameworks and federated approaches to cross-organizational quality management suggests a future where data infrastructure becomes not merely a passive repository but an intelligent, adaptive foundation for organizational knowledge and decision-making.

  • Research Article
  • 10.1080/13658816.2025.2533322
Monkuu: a LLM-powered natural language interface for geospatial databases with dynamic schema mapping
  • Jul 17, 2025
  • International Journal of Geographical Information Science
  • Chenglong Yu + 10 more

Geospatial databases present significant accessibility challenges due to the complexity of structured query languages. To enable intuitive human-system interactions via natural language, this paper presents Monkuu, a novel natural language-to-SQL interface specifically designed for geospatial databases. Monkuu integrates a dynamic context-aware schema mapping mechanism to align database schemas, effectively overcoming information truncation issues common in traditional Retrieval-Augmented Generation methods. Additionally, a human-in-the-loop geographic disambiguation workflow is introduced to resolve complex place names by combining multi-source geographic data. Monkuu achieves 56.2% execution accuracy on the KaggleDBQA benchmark, improving upon the leading ZeroNL2SQL model by 13.8 percentage points, alongside an 82.4% recall in geographic ambiguity resolution on the GeoQueryJP dataset. The system’s primary contribution lies in its robust database access capabilities with clean data interfaces for downstream spatial analysis tools while maintaining focus on accurate query translation. Case studies demonstrate its effectiveness in processing queries like ‘Show me the boundary of Kashiwa’ into executable SQL, significantly lowering technical barriers for non-expert users. This work advances equitable and accessible geographic information services.
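Monkuu's mapping mechanism is not published as code; the core idea of context-aware schema selection for an NL-to-SQL prompt can be sketched with a simple lexical relevance score (the table and column names below are hypothetical, and a real system would rank semantically, not lexically):

```python
def relevance(question: str, table: str, columns) -> int:
    # Crude lexical proxy for context-aware relevance ranking:
    # count question words that appear inside the table or column names.
    words = set(question.lower().replace("?", "").split())
    names = [table, *columns]
    return sum(1 for n in names for w in words if w in n.lower())

def schema_context(question, schema, k=2):
    """Keep only the k most relevant tables in the prompt, instead of
    pasting the full schema and risking context-window truncation."""
    ranked = sorted(schema, key=lambda t: relevance(question, t, schema[t]), reverse=True)
    return "\n".join(f"TABLE {t}({', '.join(schema[t])})" for t in ranked[:k])

schema = {
    "city_boundary": ["city_name", "geom"],
    "rail_station": ["station_name", "line", "geom"],
    "land_price": ["parcel_id", "price", "year"],
}
print(schema_context("Show me the boundary of Kashiwa city", schema, k=1))
```

The trimmed `TABLE …(…)` text would then be prepended to the user's question in the LLM prompt, which is what lets the model emit SQL against only the relevant tables.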

  • Research Article
  • 10.55041/ijsrem51315
Intelligent Database Migration Using DB Genie: A Machine Learning-Driven Approach
  • Jul 11, 2025
  • INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
  • Aman Patnayak + 1 more

The rapid evolution of Artificial Intelligence (AI) has transformed data management in modern databases, offering innovative solutions to longstanding challenges. This thesis explores AI’s role in enhancing data organization, storage, retrieval, and analysis. Traditional database systems face limitations in handling the growing volume, velocity, and variety of data, often resulting in performance and scalability issues. AI techniques—such as machine learning, natural language processing, and neural networks—address these problems by automating complex processes, optimizing query execution, and enabling predictive analytics. This study investigates AI’s impact on data indexing, query optimization, and real-time processing, which reduces latency and enhances system responsiveness. It also examines AI applications in data cleaning and deduplication, which contribute to improved data quality and consistency. By integrating AI into database operations, organizations can gain actionable insights from streaming data and support data-driven decision-making. Moreover, the thesis considers ethical and privacy concerns in AI-based data systems, emphasizing the importance of robust governance and transparency. Through theoretical insights and practical case studies, this research shows that AI significantly boosts operational efficiency and unlocks new possibilities in big data environments. The synergy between AI and database technologies is essential for the future of scalable, intelligent data management.
Keywords: Database migration, schema mapping, SQL dialect translation, query optimization, artificial intelligence, DB Genie

  • Research Article
  • 10.5194/isprs-archives-xlviii-4-w13-2025-157-2025
Open Technologies Supporting Linked Open Data Publishing: Croatian Population Census Case Study
  • Jul 11, 2025
  • The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
  • Karlo Kević + 1 more

Abstract. Population census data in Croatia is provided in spreadsheets and without an explicit geometric representation of its spatial units, which makes it not directly usable in spatial analyses and data-based decision-making. This poses challenges for data interoperability and limits the practical usefulness of census data. To overcome these limitations, this paper proposes an ontology mapping schema to structure population contingents by age and sex in RDF and publish them as Linked Open Data using the open-source data wrangling tool OpenRefine. In line with Linked Data best practices, population data and spatial units’ geometries were modelled separately and linked through spatial units’ URIs. To ensure interoperability and enable broader integration, the schema reuses semantics from the RDF Data Cube Vocabulary and GeoSPARQL, while the use of open technologies ensures that the resulting RDF triples are reproducible. The proposed ontology mapping schema represents a foundational step towards the publication of linked open census data in Croatia, paving the way for improved integration and reuse in the future.
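The modelling choice described above, statistical observations linked to spatial units by URI rather than with geometry embedded in them, can be illustrated with a hand-rolled Turtle serializer; the prefixes, unit codes, and counts below are invented for illustration:

```python
# Each observation links to its spatial unit by URI; the unit's geometry
# is published separately under that same URI, as in the paper's design.
rows = [("HR041", "F", "0-4", 8123), ("HR041", "M", "0-4", 8540)]

def to_turtle(unit, sex, age, count):
    obs = f"ex:obs-{unit}-{sex}-{age}"
    return (f"{obs} a qb:Observation ;\n"
            f"    ex:spatialUnit <https://example.org/unit/{unit}> ;\n"
            f'    ex:sex "{sex}" ; ex:ageBand "{age}" ;\n'
            f"    ex:population {count} .")

print("\n\n".join(to_turtle(*row) for row in rows))
```

In practice the paper generates such triples with OpenRefine's RDF extension rather than by hand, but the structure of each observation is the same.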

  • Research Article
  • 10.32996/jcsts.2025.7.7.14
Revolutionizing Data Warehouse Migration with Multi-Cloud Computing
  • Jul 2, 2025
  • Journal of Computer Science and Technology Studies
  • Achyut Kumar Sharma Tandra

This article explores the transformative potential of Multi-Cloud Computing (MCC) in revolutionizing data warehouse migration strategies across heterogeneous environments. MCC architectures enable seamless data movement from diverse source systems including relational databases, NoSQL repositories, and streaming platforms into modern cloud data warehouses while optimizing resources across multiple cloud providers. The article examines comprehensive aspects of MCC implementation, from initial data source integration through destination warehouse optimization, AI-driven workload management, and sophisticated governance frameworks. By abstracting underlying infrastructure differences between cloud platforms, MCC creates unified control planes that intelligently route data processing based on performance, cost, and compliance requirements rather than provider limitations. The article demonstrates how distributed processing engines working across cloud boundaries achieve substantial improvements in migration performance while reducing source system impact and operational costs. Advanced capabilities, including automated schema mapping, intelligent transformation optimization, comprehensive lineage tracking, and cross-cloud security frameworks, collectively address traditional migration challenges that have historically imposed significant risk and cost on organizations. This article establishes MCC as not merely a technical architecture but a strategic approach enabling data agility in increasingly complex multi-cloud environments, ultimately transforming how organizations conceptualize and implement data warehouse migrations.

  • Research Article
  • 10.30574/wjaets.2025.15.3.1103
The transformation of ETL processes through Artificial Intelligence
  • Jun 30, 2025
  • World Journal of Advanced Engineering Technology and Sciences
  • Murali Krishna Santhuluri Venkata

Artificial intelligence has fundamentally transformed Extract, Transform, Load (ETL) processes across enterprise environments, revolutionizing traditional data integration practices. Conventional ETL methodologies have historically suffered from labor-intensive manual coding, complex data mapping requirements, and inflexible rule-based architectures, creating bottlenecks in terms of scalability, efficiency, and adaptability. The emergence of AI-enhanced ETL technologies represents a paradigm shift, introducing unprecedented levels of automation and intelligence throughout the data integration lifecycle. Key capabilities include automated schema mapping through semantic analysis and pattern recognition algorithms, intelligent data quality management with real-time anomaly detection, cognitive data classification for sensitive information, and natural language interfaces democratizing access to ETL functionality. Implementation examples across Microsoft Azure environments demonstrate substantial improvements in all ETL phases, while applications in financial services, healthcare, and retail illustrate tangible business value. Looking forward, emerging trends such as autonomous self-configuring pipelines, explainable AI mechanisms, edge-based processing architectures, federated learning frameworks, and quantum-enhanced transformations promise to further revolutionize data integration practices. This technological evolution enables organizations to process increasingly complex data landscapes with enhanced efficiency, accuracy, and agility while reducing operational overhead.

  • Research Article
  • 10.30574/wjaets.2025.15.3.1061
Accelerating digital transformation: AI-driven frameworks for legacy-to-cloud data modernization
  • Jun 30, 2025
  • World Journal of Advanced Engineering Technology and Sciences
  • Rakshit Khare

This article presents a comprehensive framework for automating the migration of legacy data systems to cloud platforms through an AI-driven approach. It addresses the critical balance between risk mitigation, cost management, and operational continuity throughout the modernization journey. By leveraging advanced machine learning algorithms for schema discovery, automated code generation, performance optimization, and continuous validation, organizations can significantly reduce manual efforts while accelerating migration timelines. The framework incorporates intelligent scanning of diverse source systems, automated schema mapping to cloud warehouses, machine learning-based performance tuning, robust validation mechanisms, and infrastructure provisioning through Infrastructure as Code. This systematic approach enables enterprises to confidently transition from legacy platforms to cloud-native analytics ecosystems while maintaining data fidelity and minimizing business disruption.

  • Research Article
  • 10.1038/s41598-025-06447-2
Evaluating language model embeddings for Parkinson’s disease cohort harmonization using a novel manually curated variable mapping schema
  • Jun 20, 2025
  • Scientific Reports
  • Yasamin Salimi + 5 more

Data Harmonization is an important yet time-consuming process. With the recent popularity of applications using Language Models (LMs) due to their high capabilities in text understanding, we investigated whether LMs could facilitate data harmonization for clinical use cases. To evaluate this, we created PASSIONATE, a novel Parkinson’s disease (PD) variable mapping schema as a ground truth source for pairwise cohort harmonization using LMs. Additionally, we extended our investigation using an existing Alzheimer’s disease (AD) CDM. We computed text embeddings based on two language models to perform automated cohort harmonization for both AD and PD. We additionally compared the results to a baseline method using fuzzy string matching to determine the degree to which the semantic capabilities of language models can be utilized for automated cohort harmonization. We found that mappings based on text embeddings performed significantly better than those generated by fuzzy string matching, reaching an average accuracy of over 80% for almost all tested PD cohorts. When extended to a further neighborhood of possible matches, the accuracy could be improved to up to 96%. Our results suggest that language models can be used for automated harmonization with a high accuracy that can potentially be improved in the future by applying domain-trained models.
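The comparison the paper runs can be sketched in miniature: a fuzzy-string baseline versus nearest-neighbor matching over embeddings. The variable names are invented, and the 2-d vectors are toy stand-ins for real LM text embeddings:

```python
import math
from difflib import get_close_matches

source_vars = ["patient_sex", "updrs_total"]
target_vars = ["gender", "updrs_iii_total", "moca_score"]

# Baseline: map each source variable to its closest target by fuzzy
# string similarity, which cannot bridge synonyms like sex/gender.
fuzzy_map = {s: get_close_matches(s, target_vars, n=1, cutoff=0.0)[0]
             for s in source_vars}

# Embedding-based mapping: toy 2-d vectors stand in for LM embeddings
# (a real run would embed the variable labels with a language model).
emb = {
    "patient_sex": (0.90, 0.10), "gender": (0.85, 0.15),
    "updrs_total": (0.10, 0.90), "updrs_iii_total": (0.12, 0.88),
    "moca_score": (0.50, 0.50),
}

def cos(a, b):
    return (a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))

emb_map = {s: max(target_vars, key=lambda t: cos(emb[s], emb[t])) for s in source_vars}
print(fuzzy_map)
print(emb_map)  # embeddings recover the synonym pair patient_sex -> gender
```

The "further neighborhood" result in the abstract corresponds to keeping the top few embedding neighbors per variable instead of only the single best match.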

  • Research Article
  • 10.3390/fi17060245
Semantic Fusion of Health Data: Implementing a Federated Virtualized Knowledge Graph Framework Leveraging Ontop System
  • May 30, 2025
  • Future Internet
  • Abid Ali Fareedi + 3 more

Data integration (DI) and semantic interoperability (SI) are critical in healthcare, enabling seamless, patient-centric data sharing across systems to meet the demand for instant, unambiguous access to health information. Federated information systems (FIS) face significant obstacles to seamless DI and SI stemming from diverse data sources and models. We present a hybrid ontology-based design science research engineering (ODSRE) methodology that combines design science activities with ontology engineering principles to address these issues. ODSRE constructs a systematic mechanism leveraging the Ontop virtual paradigm to establish a federated virtual knowledge graph framework (FVKG) built on a virtualized knowledge graph approach that effectively mitigates these challenges. The proposed FVKG constructs a virtualized data federation over the Ontop semantic query engine that resolves data bottlenecks. Through virtualization, the FVKG reduces data migration, ensures low latency and dynamic freshness, and facilitates real-time access while upholding integrity and coherence throughout the federation. As a result, we propose a customized framework for constructing monolithic ontological semantic artifacts, especially in FIS. The FVKG incorporates ontology-based data access (OBDA) to build a monolithic virtualized repository that integrates various ontology-driven artifacts and ensures semantic alignment using schema mapping techniques.

  • Research Article
  • 10.14434/ijes.v7i1.41098
Quaternary Geology of the Indiana Portion of the Southern Half of the Kankakee 30- x 60-minute Quadrangle
  • May 30, 2025
  • Indiana Journal of Earth Sciences
  • Henry Munro Loope + 1 more

The map of the Quaternary Geology of the Indiana Portion of the Southern Half of the Kankakee 30- x 60-minute Quadrangle displays unconsolidated Pleistocene glacial sediments associated with the Lake Michigan Lobe and Huron-Erie Lobe of the Laurentide Ice Sheet and post-glacial sediments deposited by eolian, fluvial, and lacustrine processes in northwestern Indiana. Glacial and proglacial deposits include diamicton, glaciofluvial, and glaciolacustrine sediments deposited during the Wisconsin Episode glaciation. Non-glacial deposits include eolian, alluvial, paludal, and lacustrine sediments deposited during the late Wisconsin Episode and Holocene. Silurian and Devonian bedrock directly underlie the late Wisconsin Episode glacial sediments and non-glacial sediments deposited during the Holocene. Unconsolidated deposits were characterized through field observations; new and archived borehole data; lithologic information from the Indiana Department of Natural Resources water well database; and soils data from the U.S. Department of Agriculture, Natural Resource Conservation Service, Soil Survey Geographic (SSURGO) database. A light detection and ranging (LiDAR)-based digital elevation model was used in combination with geologic data to identify landforms and infer contacts between unconsolidated units. Summary descriptions of mapped units are listed on the map sheet with detailed descriptions in the accompanying pamphlet. In addition to the map and pamphlet, a composite spatial data set that conforms to the standardized database schema known as GeMS (Geologic Map Schema) is also available for download. Metadata records associated with each element within the spatial data set contain detailed descriptions of their purpose, constituent entities, and attributes. This geologic map was funded in part through the U.S. Geological Survey Great Lakes Geologic Mapping Coalition program under Cooperative Agreement No. G22AC00550.

  • Research Article
  • 10.30574/wjaets.2025.15.1.0494
Data normalization and synchronization challenges in multi-cloud ERP systems
  • Apr 30, 2025
  • World Journal of Advanced Engineering Technology and Sciences
  • Srinivasan Pakkirisamy

Enterprise Resource Planning (ERP) systems have evolved from monolithic architectures toward highly distributed, cloud-native implementations. Organizations increasingly adopt multi-cloud strategies that distribute business processes across specialized platforms: Oracle Cloud ERP for financials, Workday for human capital management, Salesforce for customer relationship management, and Blue Yonder for supply chain optimization. This strategic approach delivers superior domain-specific functionality but introduces significant integration challenges. Divergent data models, inconsistent APIs, and varying transaction semantics across these platforms create substantial barriers to seamless information exchange. Data normalization demands the reconciliation of fundamentally different architectural philosophies, from Oracle's complex financial structures to Workday's comprehensive employee objects. Synchronization faces critical obstacles, including transaction propagation delays, eventual consistency models, and high-volume event streams. This article explores these challenges and presents a comprehensive framework for addressing them through structured approaches to schema mapping, semantic reconciliation, conflict resolution, and security enforcement. By implementing these strategies, enterprises can achieve cohesive operations while maintaining the specialized capabilities of their cloud platform ecosystem.

  • Research Article
  • 10.71097/ijsat.v16.i1.2517
The Transformative Impact of AI on Enterprise Cloud Integrations and Automation
  • Mar 28, 2025
  • International Journal on Science and Technology
  • Mahesh Kolli

The transformative impact of artificial intelligence on enterprise cloud integrations and automation is reshaping how organizations manage data, workflows, and security in distributed environments. AI-driven solutions are evolving traditional integration approaches into intelligent, adaptive frameworks that deliver significant advantages across multiple dimensions. These advances facilitate intelligent data integration with automated schema mapping and predictive quality management, enable workflow optimization through self-optimizing processes and digital twins, strengthen security and compliance with behavioral analytics and continuous monitoring, and support specialized infrastructure requirements for compute, storage, and networking. Enterprises implementing these technologies report substantial operational efficiency improvements, reduced costs, accelerated time-to-market, and enhanced customer experiences. Despite implementation challenges related to data quality, legacy systems, skills gaps, and change management, organizations following established best practices can achieve remarkable business transformation while positioning themselves for sustainable competitive advantage in an evolving technological landscape.

  • Open Access
  • Research Article
  • 10.14434/ijes.v7i1.36655
Quaternary Geology of the Indiana Portions of the Chicago and the Northern Half of the Kankakee 30- x 60-minute Quadrangles
  • Mar 19, 2025
  • Indiana Journal of Earth Sciences
  • Henry Munro Loope + 1 more

The map of the Quaternary Geology of the Indiana portions of the Chicago 30- x 60-minute quadrangle and the northern half of the Kankakee 30- x 60-minute quadrangle displays unconsolidated Pleistocene glacial sediments deposited by the Lake Michigan Lobe of the Laurentide Ice Sheet and Holocene post-glacial sediments deposited by coastal, eolian, and fluvial processes in northwestern Indiana. Unconsolidated deposits were characterized through field observations; new and archived borehole data; lithologic information from the Indiana Department of Natural Resources water well database; and soils data from the U.S. Department of Agriculture, Natural Resource Conservation Service, Soil Survey Geographic (SSURGO) database. A light detection and ranging (LiDAR)-based digital elevation model was used in combination with geologic data to identify landforms and infer contacts between unconsolidated units. Glacial and proglacial deposits include diamicton, glaciofluvial, glaciolacustrine, and eolian sediments of Wisconsin Age which directly overlie Devonian and Silurian bedrock. Summary descriptions of mapped units are listed on the map sheet with detailed descriptions in the accompanying pamphlet. In addition to the map and pamphlet, a composite spatial data set that conforms to the standardized database schema known as GeMS (Geologic Map Schema) is also available for download. Metadata records associated with each element within the spatial data set contain detailed descriptions of their purpose, constituent entities, and attributes. This geologic map was funded in part through the U.S. Geological Survey Great Lakes Geologic Mapping Coalition program under Cooperative Agreement No. G21AC10762.

  • Open Access
  • Research Article
  • Citations: 1
  • 10.3390/math13040607
SNMatch: An Unsupervised Method for Column Semantic-Type Detection Based on Siamese Network
  • Feb 13, 2025
  • Mathematics
  • Tiezheng Nie + 5 more

Column semantic-type detection is a crucial task for data integration and schema matching, particularly when dealing with large volumes of unlabeled tabular data. Existing methods often rely on supervised learning models, which require extensive labeled data. In this paper, we propose SNMatch, an unsupervised approach based on a Siamese network for detecting column semantic types without labeled training data. The novelty of SNMatch lies in its ability to generate the semantic embeddings of columns by considering both format and semantic features and clustering them into semantic types. Unlike traditional methods, which typically rely on keyword matching or supervised classification, SNMatch leverages unsupervised learning to tackle the challenges of column semantic detection in massive datasets with limited labeled examples. We demonstrate that SNMatch significantly outperforms current state-of-the-art techniques in terms of clustering accuracy, especially in handling complex and nested semantic types. Extensive experiments on the MACST and VizNet-Manyeyes datasets validate its effectiveness, achieving superior performance in column semantic-type detection compared to methods such as TF-IDF, FastText, and BERT. The proposed method shows great promise for practical applications in data integration, data cleaning, and automated schema mapping, particularly in scenarios where labeled data are scarce or unavailable. Furthermore, our work builds upon recent advances in neural network-based embeddings and unsupervised learning, contributing to the growing body of research in automatic schema matching and tabular data understanding.
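SNMatch itself learns embeddings with a Siamese network; the underlying idea, grouping unlabeled columns whose format features agree, can be sketched without any training. The column names and values below are invented, and a char-class pattern stands in for the learned format features:

```python
from collections import Counter

def format_signature(values):
    """Reduce each cell to a char-class pattern (A=letter, 9=digit,
    punctuation kept), then keep the column's most common pattern."""
    def pat(v):
        return "".join("9" if c.isdigit() else "A" if c.isalpha() else c for c in v)
    return Counter(pat(v) for v in values).most_common(1)[0][0]

def cluster_columns(columns):
    # Columns sharing a format signature fall into the same candidate
    # semantic-type cluster, with no labeled training data required.
    clusters = {}
    for name, values in columns.items():
        clusters.setdefault(format_signature(values), []).append(name)
    return clusters

cols = {
    "dob":      ["1990-04-02", "1985-11-30"],
    "hired_on": ["2021-06-01", "2019-02-14"],
    "zip":      ["60601", "46202"],
}
print(cluster_columns(cols))
```

SNMatch combines such format features with semantic features in one learned embedding, which is what lets it separate, say, zip codes from years even though both are all-digit columns.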

  • Open Access
  • Research Article
  • 10.32628/cseit251112148
Using AI to Transform Modern Data Platforms: Bridging the Gap between Data and Business Users
  • Jan 31, 2025
  • International Journal of Scientific Research in Computer Science, Engineering and Information Technology
  • Ashrith Reddy Mekala

The integration of artificial intelligence in modern data platforms has fundamentally transformed how organizations interact with their data assets. This transformation encompasses several key innovations: natural language interfaces that enable direct SQL query generation, AI-powered business catalogs that automate metadata management, and conversational analytics systems that facilitate intuitive data exploration. These advancements have democratized data access across organizational hierarchies, reducing dependency on specialized technical teams while enhancing operational efficiency. The evolution from traditional rule-based systems to sophisticated neural network architectures has enabled more accurate query processing, improved schema mapping, and context-aware interactions. Additionally, the implementation of active metadata management and automated governance frameworks has strengthened data quality and compliance measures. As these technologies continue to mature, organizations face both opportunities and challenges in scaling their AI implementations while maintaining security, privacy, and model explainability.

  • Open Access
  • Research Article
  • 10.1093/jamiaopen/ooae157
Evaluating dimensionality reduction of comorbidities for predictive modeling in individuals with neurofibromatosis type 1.
  • Dec 26, 2024
  • JAMIA open
  • Aditi Gupta + 7 more

Dimensionality reduction techniques aim to enhance the performance of machine learning (ML) models by reducing noise and mitigating overfitting. We sought to compare the effect of different dimensionality reduction methods for comorbidity features extracted from electronic health records (EHRs) on the performance of ML models for predicting the development of various sub-phenotypes in children with Neurofibromatosis type 1 (NF1). EHR-derived data from pediatric subjects with a confirmed clinical diagnosis of NF1 were used to create 10 unique comorbidities code-derived feature sets by incorporating dimensionality reduction techniques using raw International Classification of Diseases codes, Clinical Classifications Software Refined, and Phecode mapping schemes. We compared the performance of logistic regression, XGBoost, and random forest models utilizing each feature set. XGBoost-based predictive models were most successful at predicting NF1 sub-phenotypes. Overall, features based on domain knowledge-informed mapping schema performed better than unsupervised feature reduction methods. High-level features exhibited the worst performance across models and outcomes, suggesting excessive information loss with over-aggregation of features. Model performance is significantly impacted by dimensionality reduction techniques and varies by specific ML algorithm and outcome being predicted. Automated methods using existing knowledge and ontology databases can effectively aggregate features extracted from EHRs. Dimensionality reduction through feature aggregation can enhance the performance of ML models, particularly in high-dimensional datasets with small sample sizes, as is common in EHR-based health applications. However, if not carefully optimized, it can lead to information loss and data oversimplification, potentially adversely affecting model performance.
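The knowledge-informed aggregation the paper favors amounts to rolling raw diagnosis codes up through a mapping table. A minimal sketch, with a toy mapping table standing in for the real Phecode/CCSR maps and invented ICD-10 codes:

```python
from collections import Counter

# Toy mapping table; real Phecode/CCSR maps cover the full ICD vocabulary.
CODE_MAP = {
    "G40.909": "epilepsy",
    "G40.101": "epilepsy",
    "F90.0":   "adhd",
    "F90.1":   "adhd",
    "M41.9":   "scoliosis",
}

def aggregate(patient_codes):
    """Collapse a patient's raw codes into counts over mapped categories;
    unmapped codes are kept under 'other' rather than dropped."""
    return Counter(CODE_MAP.get(c, "other") for c in patient_codes)

print(aggregate(["G40.909", "G40.101", "F90.0", "R51"]))
```

The resulting category counts replace thousands of sparse raw-code columns as model features; the paper's "high-level features" result is a caution that collapsing categories too aggressively discards signal.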

  • Open Access
  • Research Article
  • 10.14434/ijes.v6i1.38013
Quaternary Geology of the Bloomington 30- x 60-minute Quadrangle, Indiana
  • Aug 29, 2024
  • Indiana Journal of Earth Sciences
  • Henry Munro Loope + 4 more

The Quaternary Geology of the Bloomington 30- x 60-minute Quadrangle displays unconsolidated sediments deposited near the southern limit of multiple glaciations at the northern end of the Crawford Upland, Mitchell Plateau, and Norman Upland physiographic provinces in south-central Indiana. Glacial and proglacial deposits (diamicton, glaciofluvial, glaciolacustrine, and eolian sediments) of varying thickness associated with the Wisconsin, Illinois, and pre-Illinois Age glaciations overlie Paleozoic bedrock. Unconsolidated deposits were characterized through field observations, new and archived borehole data, lithologic information from the Indiana Department of Natural Resources water well database, and soils data from the U.S. Department of Agriculture, Natural Resource Conservation Service, Soil Survey Geographic (SSURGO) database. A light detection and ranging-based digital elevation model was used in combination with geologic data to identify landforms and infer contacts between unconsolidated units. Summary descriptions of mapped units are listed on the map sheet with more detailed descriptions in the accompanying pamphlet. Geochronologic (luminescence and radiocarbon ages) and borehole data are also found in the pamphlet. In addition to the map and pamphlet, a composite spatial data set that conforms to the standardized database schema known as GeMS (Geologic Map Schema) is also available for download. Metadata records associated with each element within the spatial data set contain detailed descriptions of their purpose, constituent entities, and attributes. This geologic map was funded in part through the STATEMAP program supported by the U.S. Geological Survey under Cooperative Agreement No. G22AC00424-00.

  • Research Article
  • Cite Count Icon 2
  • 10.56294/dm2024219
Overview on Data Ingestion and Schema Matching
  • Aug 2, 2024
  • Data and Metadata
  • Oumaima El Haddadi + 5 more

This overview traced the evolution of data management from traditional ETL processes to contemporary Big Data challenges, with a particular emphasis on data ingestion and schema matching. It explored the classification of data ingestion into batch, real-time, and hybrid processing, underscoring the challenges associated with data quality and heterogeneity. Central to the discussion was the role of schema mapping in data alignment, which is indispensable for linking diverse data sources. Recent advancements, notably the adoption of machine learning techniques, are significantly reshaping the landscape. The paper also addressed current challenges, including the integration of new technologies and the necessity for effective schema matching solutions, highlighting the continuously evolving nature of schema matching in the context of Big Data.
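The simplest form of the schema matching discussed here is name-based matching between source and target columns. A minimal sketch under that assumption follows (real matchers, including the ML-based ones the overview surveys, also exploit data types, instance values, and constraints):

```python
# Minimal name-based schema matching sketch: pair each source column with
# the most string-similar target column. Illustrative only; production
# matchers combine name, type, and instance-level evidence.
from difflib import SequenceMatcher

def match_schemas(source_cols, target_cols, threshold=0.6):
    """Map each source column to its best target match above a similarity threshold."""
    mapping = {}
    for s in source_cols:
        best, score = None, 0.0
        for t in target_cols:
            ratio = SequenceMatcher(None, s.lower(), t.lower()).ratio()
            if ratio > score:
                best, score = t, ratio
        if score >= threshold:           # below threshold: leave unmatched
            mapping[s] = best
    return mapping

m = match_schemas(["cust_name", "order_dt"], ["customer_name", "order_date"])
# → {'cust_name': 'customer_name', 'order_dt': 'order_date'}
```

The threshold controls the precision/recall trade-off that makes schema matching hard in heterogeneous Big Data settings: a strict threshold misses valid pairs with dissimilar names, while a loose one produces spurious alignments.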

  • Open Access Icon
  • Research Article
  • 10.14434/ijes.v6i1.37575
Geologic Map of the Indiana Portions of the 30- by 60-minute Jasper and Tell City Quadrangles
  • Jul 24, 2024
  • Indiana Journal of Earth Sciences
  • Don Tripp + 5 more

The Geologic Map of the Indiana Portions of the 30- x 60-minute Jasper and Tell City Quadrangles displays the Mississippian and Pennsylvanian bedrock and the overlying Quaternary units distributed over five physiographic provinces and eight counties within south-central Indiana. Bedrock units and unconsolidated deposits were characterized by outcrop field observations and new and archived borehole data. Bedrock data was consolidated into a single mapping database, which was then processed using gridding and contouring tools in ArcGIS Pro, resulting in rock formation contact lines, mappable at 1:100,000 scale. Additional information used to delineate contacts for unconsolidated deposits was based on the Indiana Department of Natural Resources water well database and soils data from the Soil Survey Geographic (SSURGO) database. Light detection and ranging (LiDAR) digital elevation models were also used to identify landforms and refine contacts between surficial units. The mapped bedrock units, from east to west, range in age from the Borden Group (lower Mississippian) to the Raccoon Creek Group (Lower Pennsylvanian) with the strike oriented generally from northwest to southeast. Quaternary sediments were mostly deposited during the last two major advances of the Laurentide Ice Sheet into south-central Indiana and are mostly found along valleys in major rivers and streams. Summary descriptions of mapped units are listed on the map sheet with more detailed descriptions in an accompanying pamphlet. Survey drill hole data points are also displayed on the map with corresponding links to expanded information for each of the drill holes listed in the pamphlet. In addition to the map and pamphlet, a composite spatial data set that conforms to the standardized database schema known as GeMS (Geologic Map Schema) is also available for download. Both ESRI-licensed geodatabase and open shapefile versions of the spatial data set are available. Metadata records associated with each element within the spatial data set contain detailed descriptions of their purpose, constituent entities, and attributes. This geologic map was funded in part through the STATEMAP program supported by the U.S. Geological Survey under Cooperative Agreement No. G22AC00424-00.

