
Related Topics

  • Scientific Data Processing

Articles published on Data Science

37259 Search results
  • New
  • Research Article
  • 10.1038/s41597-026-06553-4
Data science academic programs in the pre-ChatGPT era in the Midwestern United States: a curated dataset.
  • Jan 17, 2026
  • Scientific data
  • Danielle Blackford + 1 more

This dataset documents and classifies academic programs in data science offered by higher education institutions in the Midwestern United States prior to widespread adoption of generative artificial intelligence tools (up to 2023). Detailed program-level metadata, including program names and types, institutional types, degree levels, and state-by-state information are systematically compiled using a reproducible typology. The resource enables investigation of regional trends, curriculum benchmarking, and comparative analysis of data science education in the pre-generative-AI era. By providing a standardized and transparent overview of academic offerings, the dataset supports research, educational planning, and policy decisions related to workforce development and the evolving landscape of data science education.

  • New
  • Research Article
  • 10.1016/j.jss.2025.12.037
Expanding the Methodology Toolkit for Surgical Research in a Data-Rich Era.
  • Jan 17, 2026
  • The Journal of surgical research
  • Kanhua Yin + 2 more


  • New
  • Research Article
  • 10.1177/23998083251415040
Spanishoddata: A package for accessing and working with Spanish Open Mobility Big Data
  • Jan 17, 2026
  • Environment and Planning B: Urban Analytics and City Science
  • Egor Kotov + 6 more

We present spanishoddata, an R package that enables fast and efficient access to Spain’s open, high-resolution origin-destination human mobility datasets, derived from anonymised mobile-phone records and released by the Ministry of Transport and Sustainable Mobility. The package directly addresses challenges of data accessibility, reproducibility, and efficient processing identified in prior studies. spanishoddata automates retrieval from the official source, performs file and schema validation, and converts the data to efficient, analysis-ready formats (DuckDB and Parquet) that enable multi-month and multi-year analysis on consumer-grade hardware. The interface handles complexities associated with these datasets, enabling a wide range of people – from data science beginners to experienced practitioners with domain expertise – to start using the data with just a few lines of code. We demonstrate the utility of the package with example applications in urban transport planning, such as assessing cycling potential or understanding mobility patterns by activity type. By simplifying data access and promoting reproducible workflows, spanishoddata lowers the barrier to entry for researchers, policymakers, transport planners or anyone seeking to leverage mobility datasets.

  • New
  • Research Article
  • 10.1108/ils-04-2025-0066
Why data matters to me: exploring personal data relevance in a data-art inquiry program
  • Jan 16, 2026
  • Information and Learning Sciences
  • Yilang Zhao + 1 more

Purpose: This study explores how youth establish personal relevance with data within a data-art inquiry program. Specifically, it uses the Personal Data Relevance (PDR) framework, adapted from Priniski et al.’s (2018) personally meaningful learning framework, to examine how youth’s engagement with data topics, data sets and data products contributes to meaningful data science learning. Design/methodology/approach: The PDR framework is the main framework for understanding personal data relevance in this study. The authors implemented a data-art inquiry program with 16 high school participants in a rural high school setting in East Tennessee. The data included youth-generated data visualizations and transcripts from group interviews, and the analysis involved a qualitative approach combining deductive and inductive coding. Findings: Youth form personal connections with data through three key dimensions of the PDR framework: Personal Data Association, Personal Data Usefulness and Personal Data Identification. The findings reveal that participants were most engaged with data topics reflecting personal experiences, that they were able to develop situational interest in data in the program, and that their final data visualizations provided a medium for expressing social values and identity. Research limitations/implications: This study’s findings may be context-specific due to the structured sequential nature of the data-art inquiry program. Future research could explore the PDR framework’s applicability in varied instructional designs and investigate interactions among the PDR dimensions more deeply. Practical implications: Educators designing data science curricula should explicitly incorporate opportunities for youth to select personally relevant data topics, actively engage in data set exploration and reflect on the social implications of their data products to enhance data engagement and data science learning.
Social implications: Encouraging youth to find personal relevance in data can foster deeper engagement with data-based societal issues, which can promote informed and active participation in public discourse through data literacy. Originality/value: This study introduces the PDR framework, providing a structured approach for analyzing and designing data science programs that emphasize youth’s personally meaningful connections with data. This study contributes uniquely to the field by explicitly linking personal relevance to interdisciplinary data-art inquiry contexts.

  • New
  • Research Article
  • 10.3389/fdata.2025.1723155
Big data approaches to bovine bioacoustics: a FAIR-compliant dataset and scalable ML framework for precision livestock welfare
  • Jan 16, 2026
  • Frontiers in Big Data
  • Mayuri Kate + 1 more

The convergence of IoT sensing, edge computing, and machine learning is revolutionizing precision livestock farming. Yet bioacoustic data streams remain underexploited due to computational-complexity and ecological-validity challenges. We present one of the most comprehensive bovine vocalization datasets to date: 569 expertly curated clips spanning 48 behavioral classes, recorded across three commercial dairy farms using multi-microphone arrays and expanded to 2,900 samples through domain-informed data augmentation. This FAIR-compliant resource addresses key Big Data challenges: volume (90 h of raw recordings, 65.6 GB), variety (multi-farm, multi-zone acoustic environments), velocity (real-time processing requirements), and veracity (noise-robust feature-extraction pipelines). A modular data-processing workflow combines denoising (implemented both in iZotope RX 11 for quality control and in an equivalent open-source Python pipeline using noisereduce), multi-modal synchronization (audio-video alignment), and standardized feature engineering (24 acoustic descriptors via Praat, librosa, and openSMILE) to enable scalable welfare monitoring. Preliminary machine-learning benchmarks reveal distinct class-wise acoustic signatures across estrus detection, distress classification, and maternal-communication recognition. The dataset's ecological realism (embracing authentic barn acoustics rather than controlled conditions) ensures deployment-ready model development. This work establishes the foundation for animal-centered AI, where bioacoustic streams enable continuous, non-invasive welfare assessment at industrial scale. By releasing a Zenodo-hosted, FAIR-compliant dataset (restricted access) and an open-source preprocessing pipeline on GitHub, together with comprehensive metadata schemas, we advance reproducible research at the intersection of Big Data analytics, sustainable agriculture, and precision livestock management.
The framework directly supports UN SDG 9, demonstrating how data science can transform traditional farming into intelligent, welfare-optimized production systems capable of meeting global food demands while maintaining ethical animal-care standards.

  • New
  • Research Article
  • 10.1007/s13197-025-06528-0
Automated monitoring of alcoholic fermentation: trends and challenges
  • Jan 16, 2026
  • Journal of Food Science and Technology
  • Tomáš Horváth + 3 more

The progress of fermentation, an important step in spirit production, needs to be monitored regularly to detect possible faults. Automated monitoring of fermentation, however, is often limited to only a few parameters of the mash, mainly its temperature. With the advance of sensor technology and data analytics, various solutions to automated fermentation monitoring have emerged, mainly for the beer and wine industry; however, these have not yet been critically evaluated and compared. Thus, scientific articles on automated monitoring of alcoholic fermentation are reviewed and evaluated here according to the type of sensors used, the type of fermented material, and the reproducibility and feasibility of the presented solutions. Possible data analytics methods are introduced and their pros and cons are discussed. A critical evaluation from scientific and industrial perspectives is provided with prospects for the distilling industry where mashes of various states of matter, inhomogeneity and viscosity can appear. Key findings and conclusions of this review are: (1) Electronic nose and electronic tongue biosensors are a promising direction in the area. (2) A publicly available database of recorded data from e-nose, e-tongue and other sensors for fermentation monitoring is needed but still missing. (3) Current solutions for automated fermentation monitoring are rather isolated studies, conducted in laboratories, yet to be evaluated and tested in industrial environments. (4) The use of machine learning techniques in these studies, in general, does not comply with the well-established standards in data science and artificial intelligence.

  • New
  • Research Article
  • 10.3390/ani16020282
The Spatial and Temporal Distribution of Bigeye Tuna and Yellowfin Tuna in the Northwest Indian Ocean and Their Relationship with Environmental Factors
  • Jan 16, 2026
  • Animals
  • Guoqing Zhao + 9 more

The Northwestern Indian Ocean (NWIO) serves as a primary fishing ground for tuna longline fisheries, with bigeye tuna (Thunnus obesus) and yellowfin tuna (Thunnus albacares) constituting the main target species. Investigating their spatiotemporal distribution and relationship with environmental factors is important for fishery management and fishing operations. This study analyzed and compared the distribution patterns and environmental preferences of these two species across different depth layers, based on fisheries scientific survey data collected during the 2023/2024 and 2024/2025 fishing seasons. Key findings include: The hook rate in 2023/2024 was higher than in 2024/2025, and the hook rate for T. obesus exceeded that of T. albacares. T. obesus were predominantly concentrated within 63° E–69° E and 7° N–9° N, while T. albacares exhibited a broader yet more dispersed distribution range. T. obesus primarily occupied depth layers of 130–140 m (12.20%), 180–190 m (9.76%), and 270–280 m (9.76%). T. albacares were mainly found at 110–120 m (15%), 140–150 m (15%), and 200–210 m (15%). Both species exhibit distinct spatial clustering patterns, and their hotspot distribution areas are, respectively, 63° E–69° E, 5° N–10° N and 64° E–68° E, 0° N–4° N. Correlation analysis revealed significant relationships between T. obesus distribution and latitude, zooplankton abundance, water temperature at various depths, and chlorophyll a concentration. Our research provides a reference for understanding the distribution of T. obesus and T. albacares across different water layers and their habitat preferences, laying a scientific foundation for achieving sustainable utilization of both species.

  • New
  • Research Article
  • 10.70609/g-tech.v10i1.8832
Analysis of the Accuracy and Completeness of SINTA Author Data Extraction
  • Jan 16, 2026
  • G-Tech: Jurnal Teknologi Terapan
  • Muhammad Arfah Asis + 2 more

The advancement of information technology has increased the use of web scraping for scientific data collection, including from the SINTA (Science and Technology Index) platform, which provides researcher profiles, affiliations, publications, and citation data. However, scraping SINTA poses challenges, particularly when multiple authors share identical scores that trigger changes in display order. This instability can lead to duplicated or missing entries when using a single-pass scraping approach. This study evaluates the accuracy and completeness of SINTA author data collection by implementing repeated scraping as a strategy to handle dynamic data ordering. Experiments were conducted on the Universitas Muslim Indonesia (UMI) affiliation, targeting 915 active authors. The methodology involved page-structure analysis, spider development using Python and Scrapy, sequential scraping through pagination, and validation of data completeness and uniqueness. A three-second delay between requests was applied to maintain responsible scraping practices. The results show that a single scraping attempt failed to retrieve all authors, capturing an average of only 877.2 authors (95.86%). Due to unstable ordering, repeated iterations were required. Through 4–8 scraping cycles per trial, all 915 authors were successfully collected without duplication. These findings indicate that for platforms with dynamic data structures such as SINTA, repeated scraping provides a more reliable method for ensuring data completeness and accuracy, supporting the development of stable and responsible publication-data automation systems.
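
The repeated-scraping strategy described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' Scrapy spider: `collect_until_complete`, `make_unstable_pass`, the `(author_id, name)` record shape, and the tie-group and page sizes are all hypothetical, and the "listing" is a local simulation rather than real requests to SINTA. The idea shown is merging passes by a unique author ID until the expected number of unique authors is reached.

```python
import random
from typing import Callable, Dict, Iterable, Iterator, Tuple

Author = Tuple[str, str]  # (author_id, name); hypothetical record shape


def collect_until_complete(
    scrape_pass: Callable[[], Iterable[Author]],
    expected_total: int,
    max_cycles: int = 10,
) -> Tuple[Dict[str, str], int]:
    """Merge repeated passes, keyed by author ID, until expected_total
    unique authors are seen or max_cycles passes have been made."""
    seen: Dict[str, str] = {}
    cycles = 0
    while len(seen) < expected_total and cycles < max_cycles:
        cycles += 1
        for author_id, name in scrape_pass():
            seen.setdefault(author_id, name)  # duplicates are ignored
    return seen, cycles


def make_unstable_pass(
    n_authors: int, page_size: int, tie_size: int, rng: random.Random
) -> Callable[[], Iterator[Author]]:
    """Toy stand-in for a paginated listing whose order among tied scores
    changes between page requests (assumed behaviour, simulated locally)."""
    roster = [(f"A{i:04d}", f"Author {i}") for i in range(n_authors)]

    def scrape_pass() -> Iterator[Author]:
        for start in range(0, n_authors, page_size):
            # Every page request sees a fresh ordering: authors with the
            # same score (i // tie_size) are shuffled among themselves.
            keyed = [(i // tie_size, rng.random(), rec)
                     for i, rec in enumerate(roster)]
            keyed.sort(key=lambda k: (k[0], k[1]))
            yield from (rec for _, _, rec in keyed[start:start + page_size])

    return scrape_pass


if __name__ == "__main__":
    scrape = make_unstable_pass(915, 10, 20, random.Random(0))
    authors, cycles = collect_until_complete(scrape, 915, max_cycles=50)
    print(f"{len(authors)} unique authors after {cycles} cycle(s)")
```

In a real crawler, a delay between requests (the abstract mentions three seconds) would sit inside the pass itself; the cycle counts produced by this toy simulation are not comparable to the 4–8 cycles reported in the study.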

  • New
  • Research Article
  • 10.1080/26939169.2026.2618201
Data Science Education in U.S. Informal Learning Environments: A Review of the Literature
  • Jan 16, 2026
  • Journal of Statistics and Data Science Education
  • Marc T Sager + 2 more

This systematic review examines data science education in U.S. informal learning environments through analysis of 20 studies. Our analysis reveals the landscape of this emerging field. Our findings highlight three critical dimensions shaping informal DSE: the methodological and theoretical diversity of the field; the interplay of people, practices, and places; and emerging design principles and pedagogical approaches. Effective programs integrate technical skills with critical perspectives, connect to personally meaningful contexts, and position learners as knowledge producers rather than consumers. We identify several promising approaches, including critical data literacies, personal data exploration, data storytelling, embodied learning, and youth positioning as data agents. Yet, implementation challenges persist: literacy barriers, data complexity, equity gaps between intentions and practice, and limited assessment frameworks constrain the field's ability to scale these innovations. Despite intentional efforts, notable gaps remain in rural, early childhood, and disability contexts. Critical approaches examining power and representation show promise for marginalized communities. Beyond technical recommendations, we argue for reconceptualizing data literacy toward collective sovereignty and assessment frameworks valuing transformative outcomes. Informal learning environments can serve not merely as preparation for existing data systems but as spaces for imagining and enacting more just alternatives that challenge power structures in data science education.

  • New
  • Research Article
  • 10.5194/essd-18-443-2026
Energy-conservation datasets of global land surface radiation and heat fluxes from 2000–2020 generated by CoSEB
  • Jan 16, 2026
  • Earth System Science Data
  • Junrui Wang + 3 more

Abstract. Accurately estimating global land surface radiation [including downward shortwave radiation (SWIN), downward longwave radiation (LWIN), upward shortwave radiation (SWOUT), upward longwave radiation (LWOUT) and net radiation (Rn)] and heat fluxes [including latent heat flux (LE), soil heat flux (G) and sensible heat flux (H)] is essential for quantifying the exchange of radiation, heat and water between the land and atmosphere under global climate change. This study presents the first data-driven energy-conservation datasets of global land surface radiation and heat fluxes from 2000 to 2020, generated by our model of Coordinated estimates of land Surface Energy Balance components (CoSEB). The model integrates GLASS and MODIS remote sensing data, ERA5-Land reanalysis datasets, topographic data, CO2 concentration data as independent variables and in situ radiation and heat flux observations at 258 eddy covariance sites worldwide as dependent variables within a multivariate random forest technique to effectively learn the physics of energy conservation. The developed CoSEB-based datasets are strikingly advantageous in that [1] they are the first data-driven global datasets that satisfy both surface radiation balance and heat balance among the eight fluxes, as demonstrated by both the radiation imbalance ratio [RIR, defined as 100%×(SWIN-SWOUT+LWIN-LWOUT-Rn)/Rn] and energy imbalance ratio [EIR, defined as 100%×(Rn-G-LE-H)/Rn] of 0, [2] the radiation and heat fluxes are characterized by high accuracies, where (1) the RMSEs (R2) for daily estimates of SWIN, SWOUT, LWIN, LWOUT, Rn, LE, H and G from the CoSEB-based datasets at 44 independent test sites were 37.52 W m−2 (0.81), 14.20 W m−2 (0.42), 22.47 W m−2 (0.90), 13.78 W m−2 (0.95), 29.66 W m−2 (0.77), 30.87 W m−2 (0.60), 29.75 W m−2 (0.44) and 5.69 W m−2 (0.44), respectively, (2) the CoSEB-based datasets, in comparison to the mainstream products/datasets (i.e. 
GLASS, BESS-Rad, BESSV2.0, FLUXCOM, MOD16A2, PML_V2 and ETMonitor) that generally separately estimated subsets of the eight flux components, better agreed with the in situ observations. Our developed datasets hold significant potential for application across diverse fields such as agriculture, forestry, hydrology, meteorology, ecology, and environmental science, which can facilitate comprehensive studies on the variability, impacts, responses, adaptation strategies, and mitigation measures of global and regional land surface radiation and heat fluxes under the influences of climate change and human activities. The CoSEB-based datasets are open access and available through the National Tibetan Plateau Data Center (TPDC) at https://doi.org/10.11888/Terre.tpdc.302559 (Tang et al., 2025a) and through the Science Data Bank (ScienceDB) at https://doi.org/10.57760/sciencedb.27228 (Tang et al., 2025b).
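
The two closure metrics defined in this abstract, RIR and EIR, can be written out directly. The sketch below only restates those two formulas; the function names and the flux values in the example are made up for illustration and are not taken from the CoSEB datasets.

```python
def radiation_imbalance_ratio(sw_in: float, sw_out: float,
                              lw_in: float, lw_out: float,
                              rn: float) -> float:
    """RIR (%) = 100 * (SWIN - SWOUT + LWIN - LWOUT - Rn) / Rn."""
    return 100.0 * (sw_in - sw_out + lw_in - lw_out - rn) / rn


def energy_imbalance_ratio(rn: float, g: float, le: float, h: float) -> float:
    """EIR (%) = 100 * (Rn - G - LE - H) / Rn."""
    return 100.0 * (rn - g - le - h) / rn


if __name__ == "__main__":
    # Illustrative fluxes in W m^-2 (invented values, not observations).
    rir = radiation_imbalance_ratio(sw_in=200.0, sw_out=40.0,
                                    lw_in=350.0, lw_out=400.0, rn=110.0)
    eir_closed = energy_imbalance_ratio(rn=110.0, g=10.0, le=60.0, h=40.0)
    eir_open = energy_imbalance_ratio(rn=110.0, g=10.0, le=60.0, h=35.0)
    print(rir, eir_closed, eir_open)
```

Both ratios are undefined when Rn is zero, so a caller working with real data would need to mask or skip such cases; a value of 0 for both means the eight fluxes close the radiation and heat balances exactly.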

  • New
  • Research Article
  • 10.2105/ajph.2025.308351
Protecting Low-Wage Workers From Exploitation: A Mapping Study of Wage Theft Laws in the 40 Largest US Cities.
  • Jan 15, 2026
  • American journal of public health
  • Jennifer J Lee + 2 more

Objectives. We measured key features of local and state wage theft laws in the 40 largest US cities to assess the added value of local legislation and to create scientific legal data for use in evaluating the health impact of wage theft laws. Methods. We adapted standard policy surveillance methods to collect and code local and state minimum wage and nonpayment of wages theft laws from January 1, 2010, to April 15, 2023. Results. Compared with state laws, local wage theft legislation was proportionally more likely to contain features that facilitated worker complaints and to provide flexible enforcement tools. Only 4 of the 40 largest cities were totally preempted from enacting local wage theft legislation. Conclusions. Local wage theft laws provide an opportunity for innovative mechanisms to support complaint filing and enforcement. More cities could enact wage theft laws without preemption concerns. Public Health Implications. Ensuring that low-wage workers are fairly paid is important to health and health equity. Our research provides scientific legal data for use in evaluating the health effects of these widely applied protections. (Am J Public Health. Published online ahead of print January 15, 2026:e1-e10. https://doi.org/10.2105/AJPH.2025.308351).

  • New
  • Research Article
  • 10.1186/s12889-025-26155-w
Redefining social support: a scoping review of the effects of digital technologies on the social support of older workers.
  • Jan 14, 2026
  • BMC public health
  • Cristina Maria Tofan + 11 more

The rapid digitalisation of workplaces presents both challenges and opportunities for older workers. This scoping review examines how digital technologies impact social support for older workers, focusing on emotional, informational, and instrumental support within professional environments. While social support is critical for well-being and productivity in ageing workforces, the effects of digitalisation on social support dynamics remain insufficiently understood. Following Joanna Briggs Institute and PRISMA-ScR guidelines, a comprehensive search was conducted across databases including ERIH, Web of Science, Scopus, and PubMed, with no start-date limit and up to 2023, to identify peer-reviewed studies involving digital technologies used by older workers, generally considered as workers aged 50 years or older. Covidence software facilitated the screening of over 5000 scientific papers, study selection, and data extraction, and the Mixed Methods Appraisal Tool (MMAT) was used to assess quality. Findings were synthesized through descriptive statistics and narrative analysis. Forty-three studies met inclusion criteria. Digital technologies were found to enhance various forms of social support: remote work tools, messaging apps, and telemedicine platforms facilitated emotional connection and informational exchange. However, digitalisation also introduced barriers: some older workers reported isolation, reduced informal contact, and technostress, underscoring disparities in digital literacy and adaptation. Digitalisation exerts a dual impact on social support for older workers: it can strengthen professional connectedness yet also heighten vulnerability to stress and exclusion. Targeted digital literacy initiatives and sustained managerial engagement are crucial to ensure that technology enhances, rather than undermines, well-being and productivity among ageing employees.

  • New
  • Research Article
  • 10.1038/s44220-025-00574-5
How data science competitions accelerate brain health discovery
  • Jan 14, 2026
  • Nature Mental Health
  • Arianna Zuanazzi + 2 more


  • New
  • Research Article
  • 10.2196/73041
Machine Learning Ensemble Investigates Age in the Transcriptomic Response to Spaceflight in Murine Mammary Tissue: Observational Study
  • Jan 14, 2026
  • JMIRx Bio
  • James A Casaletto + 12 more

Background: Spaceflight presents unique environmental stressors, such as microgravity and radiation, that significantly affect biological systems at the molecular, cellular, and organismal levels. Astronauts face an increased risk of developing cancer due to exposure to ionizing radiation and other spaceflight-related factors. Age plays a crucial role in the body’s response to the cellular stresses that lead to cancer, with younger organisms generally exhibiting more efficient response mechanisms than older ones. The vast majority of research investigating breast cancer risk from spaceflight uses cell lines exposed to simulated radiation and microgravity, but cell lines cannot capture the combinatorial response expressed across tissues, organs, and systems to real radiation and microgravity in space. Objective: The primary objective of this in silico observational study is to characterize the molecular response to spaceflight of in vivo murine mammary tissue. We use an ensemble of linear binary classifiers to identify the molecular biomarkers enriched in this response using mice flown on the International Space Station. The secondary objective is to determine if age plays a role in this response. Methods: The National Aeronautics and Space Administration (NASA) Open Science Data Repository has curated transcriptomic data obtained from 10 BALB/cAnNTac female mice flown on the International Space Station and 33 control mice kept on Earth (OSD-511). In this observational study focused on two age groups (old/young), we used an ensemble of 4 machine learning binary classifiers with linear decision boundaries (logistic regression, support vector machine, stochastic gradient descent, and single-layer perceptron) to analyze gene expression profiles to predict age (old vs young) and condition (spaceflight vs ground control).
Using the genes our ensemble identified as most predictive, we performed pathway enrichment analysis to investigate the molecular pathways involved in spaceflight-related health risks, particularly in the context of breast cancer. Results: The pathway enrichment analyses revealed age-differentiated responses to spaceflight (false discovery rate–adjusted q values<.05). Among the 10 mice flown in space, younger mice exhibited significantly enriched pathways related to lipid metabolism and inflammatory stress signaling. All space-flown mice demonstrated evidence of adaptation in retinoid metabolism and peroxisome proliferator-activated receptor signaling in response to microgravity and radiation relative to their 33 ground control counterparts. Conclusions: Spaceflight-induced breast cancer risk manifests through distinct age-specific mechanisms: younger individuals face risk through maladaptive metabolic hyperactivity and oxidative cycling, while older individuals are vulnerable due to impaired stress responses and accumulated metabolic dysfunction. Both age groups ultimately face elevated carcinogenic potential through different but converging pathways. These findings highlight the critical role of age in modulating the response to spaceflight-induced stress and suggest that these molecular pathways may contribute to differential outcomes in tissue homeostasis, metabolic disorders, and breast cancer susceptibility.

  • New
  • Research Article
  • 10.1038/s41698-025-01212-0
ShinyEvents: harmonizing longitudinal data for real-world survival estimation.
  • Jan 13, 2026
  • NPJ precision oncology
  • Alyssa Obermayer + 21 more

Longitudinal data analysis of the patient's treatment course is critical to uncovering variables that influence outcomes. However, existing tools have significant limitations in integrating multilayered time-series data, particularly in linking treatment events with survival outcomes. Here, we developed ShinyEvents, a web-based framework for complex longitudinal data analysis. ShinyEvents allows users to upload data and generate interactive timelines of clinical events, enabling cohort-level analyses such as treatment clustering and endpoint assignment. It also provides informative cohort visualizations, such as a Sankey diagram of the treatment line and a Swimmer diagram of the clinical course. Finally, our tool can infer real-world progression-free survival (rwPFS) based on user-defined endpoints and perform Kaplan-Meier and Cox proportional hazards regression analysis. With these features, the tool can then associate treatment lines with clinical outcomes. As a case study, we analyzed Moffitt patients with muscle-invasive bladder cancer treated with neoadjuvant chemotherapy followed by surgery. Patients treated with cisplatin and gemcitabine exhibited more favorable rwPFS and overall survival, which is consistent with prior reports. Altogether, ShinyEvents provides a unified framework for integrating longitudinal real-world data with survival analytics, fostering transparent and reproducible collaboration between clinicians and data scientists. A live demo is available at https://shawlab-moffitt.shinyapps.io/shinyevents/.
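
For readers unfamiliar with the Kaplan-Meier analysis mentioned in this abstract, the product-limit estimator can be sketched in a few lines. This is a generic textbook version, not ShinyEvents code (the tool's internals are not described in the listing); the function name `kaplan_meier` and the follow-up times in the example are invented for illustration.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct event time, where
    S(t) is the running product of (1 - d_i / n_i): d_i events at time
    t_i among the n_i subjects still at risk just before t_i."""
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)  # events and censorings both leave the risk set

    at_risk = len(durations)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Subjects censored at t count as at risk at t, so remove them afterwards.
        at_risk -= exit_counts[t]
    return curve


if __name__ == "__main__":
    # Invented follow-up times (months); False marks a censored patient.
    months = [5, 8, 8, 12, 15]
    progressed = [True, True, False, True, False]
    for t, s in kaplan_meier(months, progressed):
        print(f"t={t:>2}  S(t)={s:.3f}")
```

With these made-up inputs the curve steps down at months 5, 8 and 12; censored patients lower the size of the risk set without producing a step of their own.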

  • New
  • Research Article
  • 10.36001/phmap.2025.v5i1.4486
Cleaning Maintenance Logs with LLM Agents for Improved Predictive Maintenance
  • Jan 13, 2026
  • PHM Society Asia-Pacific Conference
  • Valeriu Ionut Dimidov + 3 more

Maintenance logs serve as the backbone of data-driven Predictive Maintenance (PdM) systems by providing information that can be used to create and label datasets for training survival analysis and machine learning (ML) models. However, due to personnel manually entering information into maintenance logs and the various levels of flexibility that maintenance tracking systems allow, service records often contain errors. Currently, the cleaning of equipment maintenance records is performed manually by experts such as data scientists or reliability engineers. Nevertheless, this task is time-consuming and often does not entirely eliminate noise from the data. In this paper, we propose using large language model (LLM)-based agents to automate the cleaning of maintenance logs. We provide an implementation that allows the agents to perform data cleaning as well as metrics to assess agents' performance. Finally, we compare the performance of several LLMs on this task. Our empirical results indicate that LLM-based agents are a promising solution for improving the quality of the datasets used in PdM systems and ultimately developing predictive maintenance models that are more reliable and useful.

  • New
  • Research Article
  • 10.71204/k816a227
Unveiling the Inaugural Issue of Digital-Intelligent Economy and Scientific Management
  • Jan 13, 2026
  • Digital-Intelligent Economy and Scientific Management
  • Sokolov B.I + 1 more

With immense pride and a profound sense of mission, we are delighted to present the inaugural issue of Digital-Intelligent Economy and Scientific Management (DIESM). This launch marks not only the birth of a new academic journal, but also an important milestone in advancing the integration of economics and management in the digital-intelligence era. DIESM is an international academic journal that rigorously adheres to a double-blind peer review process. It is dedicated to exploring how digital and intelligent technologies are reshaping economic systems, financial decision-making, and organizational management. The journal pays particular attention to the integration of cutting-edge technologies such as artificial intelligence, big data, and machine learning with financial systems, corporate governance, organizational practices, and public policy, endeavouring to bridge the gap between theoretical exploration and practical application. DIESM features a diverse collection of articles spanning monetary policy, capital markets, financial risk management, corporate governance, financial accounting and auditing, ESG and sustainable development, and data science and intelligent decision-making. Together, these contributions reflect an interdisciplinary and cross-sectoral academic perspective. We anticipate that these high-quality research outputs will provide fresh theoretical insights for the academic community, whilst also offering a valuable reference for policymakers and corporate practitioners. At a time when technological innovation and economic transformation are accelerating worldwide, we believe that the exchange of ideas and dissemination of research findings are essential for the advancement of economics and management. DIESM aspires to be a leading international platform where scholars, researchers, policymakers, and practitioners from around the world come together to examine the opportunities and challenges of the digital-intelligence era.
In today's world, where global technology evolves rapidly and economic management models undergo constant innovation, we firmly believe that the exchange of ideas and the dissemination of academic achievements are crucial to the development of economics and management as a discipline. DIESM is dedicated to building a high-level international exchange platform, bringing together scholars, researchers, and practical experts from around the world to jointly explore the opportunities and challenges presented by the digital-intelligent wave. We are committed to upholding academic integrity, prioritizing quality, and pursuing excellence at all times. We sincerely welcome scholars worldwide to submit their manuscripts and work with us to advance the theoretical innovation and practical development of economics and management in the digital-intelligent era. Welcome to Digital-Intelligent Economy and Scientific Management. We hope this inaugural issue inspires your research and practice, and we look forward to your long-term attention and support.

  • New
  • Research Article
  • 10.3390/signals7010008
Firebug Swarm Optimization Algorithm: An Overview and Applications
  • Jan 13, 2026
  • Signals
  • Faroq Awin + 2 more

This survey delves into the Firebug Swarm Optimization (FSO) algorithm, an advanced global optimization algorithm that plays a pivotal role in modern swarm intelligence optimization techniques. It explores the core principles of the FSO algorithm and examines the various hybrid variants developed to address complex optimization challenges. This survey also traces the evolution of swarm optimization methods, shedding light onto the natural phenomena and biological processes that have inspired these algorithms. Furthermore, it highlights the diverse real-world applications of the FSO algorithm, showcasing its effectiveness in fields such as engineering, data science, and artificial intelligence. To provide a comprehensive comparison, the survey includes a case study that evaluates the FSO algorithm’s performance against other existing algorithms. Lastly, the survey identifies key open research questions and suggests potential future directions for advancing the FSO algorithm and other nature-inspired optimization techniques, aiming to overcome current limitations and unlock new possibilities.

  • New
  • Research Article
  • 10.21294/1814-4861-2025-24-6-108-126
Innovative genomic technologies for non-invasive cancer screening
  • Jan 13, 2026
  • Siberian journal of oncology
  • L N Lyubchenko + 11 more

Cancer is a leading cause of mortality worldwide and the focus of priority programs and strategies for scientific and technological development in public health and health promotion. Modern screening technologies are aimed at early cancer diagnosis to improve treatment outcomes; however, most of them are characterized by invasiveness and low patient compliance. Therefore, non-invasive cancer diagnosis is a promising field in molecular biology and oncology. Advances in molecular genetics and bioinformatics have enabled the identification of a wide range of diagnostic, prognostic, and predictive biomarkers, which can be analyzed not only in tumor tissue samples but also in peripheral blood. The purpose of the study was to analyze and summarize current scientific and practical data in the field of non-invasive cancer screening using molecular biological analysis, the development of innovative test systems and diagnostic kits, as well as issues of legal regulation and integration into medical and social insurance programs. Material and Methods. The study was based on Russian and international scientific databases, including the National Library of Medicine using the PubMed electronic resource, Elibrary, and Google Scholar search results. Open internet resources were also searched using the keywords: cancer; malignant neoplasm; tumor; diagnostic; non-invasive; early; blood; Blood-based tests; test system; screening; pancancer; multi-cancer; sequencing; PCR; marker; DNA; cfDNA; multi-cancer early detection; MCED. The analytical review included clinical trial reports, meta-analyses, systematic reviews, and cohort randomized trials for the period 2008–2025. Results. There is a steady trend worldwide towards the widespread adoption of universal, non-invasive methods for early cancer diagnosis. Retrospective and prospective multicenter studies and meta-analyses conducted over the past 15 years have demonstrated advances in interdisciplinary multimodal analysis of diverse patient data (clinical, genomic, transcriptomic, epigenomic, etc.), emphasizing the cost-effectiveness of these methods. Conclusion. Currently, large-scale population-based studies considering race and ethnicity are vital for validating methodological approaches and evaluating the effectiveness of non-invasive cancer screening methods, especially in diverse nations like Russia.

  • New
  • Research Article
  • 10.1162/99608f92.c8780a45
Prompting the Professoriate: A Qualitative Study of Instructor Perspectives on LLMs in Data Science Education
  • Jan 13, 2026
  • Harvard Data Science Review
  • Ana Elisa Lopez-Miranda + 2 more
