Articles published on Detection Bias
- New
- Research Article
- 10.1093/ehjdh/ztaf131
- Nov 7, 2025
- European Heart Journal - Digital Health
- Katherine Krieger + 11 more
Abstract
Aims: Large language models (LLMs) such as GPT are increasingly used to generate clinical teaching cases and support diagnostic reasoning. However, biases in their training data may skew the portrayal and interpretation of cardiovascular symptoms in women, potentially leading to delayed or inaccurate diagnoses. We assessed GPT-4o’s and GPT-4’s gender representation in simulated cardiovascular cases and GPT-4o’s diagnostic performance across genders using real patient notes.
Methods and Results: First, GPT-4o and GPT-4 were each prompted to generate 15,000 simulated cases spanning 15 cardiovascular conditions with known gender prevalence differences. The models’ gender distributions were compared to U.S. prevalence data from CDC/STS datasets using FDR-corrected χ² tests, revealing a significant deviation (p < 0.0001). In 14 GPT-4-generated conditions (93%), male patients were overrepresented relative to females by a mean of 30% (SD 8.6%). Second, fifty de-identified cardiovascular patient notes were extracted from the MIMIC-IV-Note database. Patient gender was systematically swapped in each note, and GPT-4 was asked to produce differential diagnoses for each version (10,000 total prompts). Diagnostic accuracy across genders was determined by comparing model outputs to actual discharge diagnoses via FDR-corrected Mann-Whitney U tests, revealing significant diagnostic accuracy differences in 11 cases (22%). Female patients received lower accuracy scores than males for key conditions like coronary artery disease (p < 0.01), abdominal aortic aneurysm (p < 1.0×10⁻⁹), and atrial fibrillation (p < 0.01).
Conclusions: GPT-4o underrepresented women in simulated cardiovascular scenarios and less accurately diagnosed female patients with critical conditions. These biases risk reinforcing historical disparities in cardiovascular care. Future efforts should focus on bias detection and mitigation.
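The per-condition comparison described above (gender is binary, so each condition contributes a one-degree-of-freedom χ² test, followed by Benjamini-Hochberg FDR correction) can be sketched in plain Python. The counts below are hypothetical and this is an illustration of the statistical procedure, not the authors' pipeline:

```python
import math

def chi2_stat(observed, expected):
    """Pearson chi-square statistic for observed vs. expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_p_df1(stat):
    """Survival function of the chi-square distribution with 1 df,
    via the complementary error function (exact for df = 1)."""
    return math.erfc(math.sqrt(stat / 2.0))

def bh_adjust(pvals):
    """Benjamini-Hochberg FDR-adjusted p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for k, idx in enumerate(reversed(order)):  # largest p-value first
        rank = m - k                           # 1-based rank of pvals[idx]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical (male, female) counts: model-generated vs. prevalence-expected
observed = [(700, 300), (620, 380), (510, 490)]
expected = [(500, 500), (550, 450), (500, 500)]
pvals = [chi2_p_df1(chi2_stat(o, e)) for o, e in zip(observed, expected)]
adjusted = bh_adjust(pvals)  # compare against the chosen FDR threshold
```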
- New
- Research Article
- 10.3390/app152111832
- Nov 6, 2025
- Applied Sciences
- Kaicheng Xu + 2 more
The rapid proliferation of multimodal misinformation across diverse news categories poses unprecedented challenges to digital ecosystems, where existing detection systems exhibit critical limitations in domain adaptation and fairness. Current methods suffer from two fundamental flaws: (1) severe performance variance (>35% accuracy drop in education/science categories) due to category-specific semantic shifts; (2) systemic real/fake detection bias causing up to 68.3% false positives in legitimate content—risking suppression of factual reporting especially in high-stakes domains like public health discourse. To address these dual challenges, this paper proposes the DATTAMM (Domain-Adaptive Tensorized Multimodal Model), a novel framework integrating category-aware attention mechanisms and adversarial debiasing modules. Our approach dynamically aligns textual–visual features while suppressing domain-irrelevant noise through the following: (a) semantic disentanglement layers extracting category-invariant patterns; (b) cross-modal verification units resolving inter-modal conflicts; (c) real/fake gradient alignment regularizers. Extensive experiments on nine news categories demonstrate that the DATTAMM achieves an average F1-score of 0.854, outperforming state-of-the-art baselines by 32.7%. The model maintains consistent performance with less than 5.4% variance across categories, significantly reducing accuracy drops in education and science content where baselines degrade by over 35%. Crucially, the DATTAMM narrows the real/fake F1 gap to merely 0.017, compared to 0.243–0.547 in baseline models, while cutting false positives in high-stakes domains like health news to 5.8% versus the 38.2% baseline average. These advances lower societal costs of misclassification by 79.7%, establishing a new paradigm for robust and equitable misinformation detection in evolving information ecosystems.
- New
- Research Article
- 10.18060/28625
- Nov 5, 2025
- Advances in Social Work
- Alicia Tetteh + 1 more
This paper examines the critical need for enhancing diversity, equity, and inclusion (DEI) within the social work licensing and credentialing process. Recent analyses have revealed significant disparities in pass rates for licensing exams among underrepresented groups, including African American, Latinx, and older social work candidates. These inequities raise ethical concerns and challenge the foundational values of the social work profession, which is rooted in principles of social justice and empowerment. By analyzing structural barriers and biases within the licensing process, this paper identifies key areas for reform, including the need for inclusive exam content, cultural competence training for test developers, and comprehensive support systems for non-traditional candidates. Furthermore, the paper explores how technology and artificial intelligence can play a transformative role in addressing DEI issues, such as bias detection, personalized learning experiences, and improved accessibility for diverse candidates. Ultimately, the paper argues that a commitment to DEI in the licensing process is not only a moral imperative but also essential for fostering a social work profession that truly reflects and serves the diverse communities it aims to support. By implementing targeted reforms and leveraging technological innovations, the profession can advance its ethical mission and promote equity in the path to licensure.
- New
- Research Article
- 10.1051/0004-6361/202554997
- Nov 5, 2025
- Astronomy & Astrophysics
- Aku Venhola + 1 more
Constraining the properties, spatial distribution, and luminosity function of dwarf galaxies in different galactic environments is crucial for understanding dwarf galaxy formation and evolution. Large surveys such as the Kilo Degree Survey (KiDS) provide useful publicly available datasets that can be used to identify dwarf galaxy candidates in a range of galactic neighborhoods. The resulting catalogs are useful for constraining the abundance of dwarfs in different environments and also provide useful galaxy samples for future follow-up studies. Ultimately this analysis of low-mass galaxies also provides constraints on our cosmological galaxy formation models. We generated a dwarf galaxy candidate catalog based on the KiDS images. KiDS data covers a 1004 deg^2 area in u', g', r', and i' filters that is centered on two horizontal stripes at the equator and in the southern hemisphere. In our catalog we provide the locations, photometric properties, and visual classifications of dwarf galaxy candidates within 60 Mpc in all different environments covered by the KiDS. We also use the catalog to analyze dwarf galaxy numbers and distributions in groups as a function of the groups' virial mass. We used Max-Tree Objects (MTO) to identify sources from the KiDS data. We then selected objects based on their detection sizes and surface brightness. We used an automated photometric pipeline to run GALFIT on the images in order to measure the structure, brightness, and color of the objects. We then used size, surface brightness, and color cuts to exclude likely background galaxies and classify the likelihood of the remaining objects being dwarf galaxies based on their visual appearance. We also probed the completeness limits and detection biases of our detection procedure by embedding simulated galaxies into the KiDS images.
Our catalog contains galaxies that have R_e larger than 3 arcsec and reaches the 50% completeness limit at the r'-band mean effective surface brightness of 26 mag arcsec^-2. Near the completeness limit there is a slight selection bias toward detecting round and centrally peaked objects more effectively than more elongated and centrally flat ones. Altogether we identified $4 objects from the KiDS data. After applying the size, color, and surface brightness cuts, we were left with 6230 objects for which we performed photometry and visual classifications. We ranked those objects into five classes based on their likelihood of being a dwarf. We identified 763 galaxies as clear dwarfs, 793 as likely dwarfs, and 933 as possible dwarfs. The remaining objects are likely not dwarfs. Based on the distances of the groups that the dwarfs are likely to be associated with, the dwarfs are expected to lie at distances between 14 Mpc and 60 Mpc. The majority of dwarfs in the sample have magnitudes of 14 mag < m_r < 20 mag, effective radii of 1 arcsec < R_e < 30 arcsec, and mean effective surface brightnesses of 21 mag arcsec^-2 < μ̄_e < 25 mag arcsec^-2. We compare the measured properties of the galaxies in our catalog with values from the literature and find mostly good agreement, given the differences in data quality. The only exceptions are the effective radii, which are systematically smaller in our catalog, due to the background subtraction method used in the KiDS data reduction. We also identify the most likely associations with groups and clusters for all the dwarfs in our catalog. Additionally we compare the number of dwarfs and their distribution within the groups with similar dwarfs found in the IllustrisTNG simulations. We find no statistically significant tension in dwarf numbers and distributions between the observations and the simulations.
Our catalog contains locations, colors, structural parameters, and likely group memberships for 2489 dwarf galaxy candidates. All the measurements are publicly available. The catalog can be used to study properties of dwarfs in a range of environments and it provides a good dataset for follow-up studies.
- New
- Research Article
- 10.1108/el-05-2025-0166
- Nov 4, 2025
- The Electronic Library
- Akinade Adebowale Adewojo
Purpose The increasing use of artificial intelligence (AI)-based bibliometric tools in academic libraries raises significant ethical concerns. Existing research has largely overlooked how librarians, particularly in the Global South, experience and address these issues in practice. This study investigates the perspectives and strategies of academic librarians in Nigerian federal universities, with the aim of developing a grounded theoretical model that explains their approaches to balancing technological benefits with ethical responsibilities in research evaluation. Design/methodology/approach Classic grounded theory was used to guide semi-structured interviews with 18 participants (13 research support librarians and five research office staff) from six Nigerian federal universities. Interviews were conducted via Zoom between 6 January and 3 February 2025 and analysed using iterative open, axial and selective coding. Findings A core category, Balancing Accuracy with Accountability, emerged to explain how librarians reconcile the benefits of AI-powered bibliometric tools with ethical concerns. Four interrelated subcategories support this core concern: bias detection, transparency demands, consent and data ownership, and trust-building practices. Together, these form the basis of the proposed mid-range theory of Contextual Accountability in Ethical AI Adoption, which emphasises librarians’ active role in shaping ethical AI integration based on institutional context, professional values and local constraints. Originality/value To the best of the author’s knowledge, this study is among the first to develop a grounded theory of ethical AI adoption in bibliometrics from the perspective of academic librarians in a Global South context. It contributes original insights into how librarians mediate the ethical implications of AI tools and offers a context-sensitive theoretical model that can inform institutional policy, tool design and professional development.
The study challenges universalist assumptions about AI ethics by centring the situated knowledge and agency of librarians working in underrepresented regions.
- New
- Research Article
- 10.1177/14727978251393473
- Nov 4, 2025
- Journal of Computational Methods in Sciences and Engineering
- Yanyan Tian
American literature has long served as a mirror, reflecting the diverse cultural, social, and political landscapes of the United States. This research investigates the representation of social groups in American literature by employing advanced natural language processing techniques. Specifically, it utilizes contextualized word embedding models to analyze how characters from diverse social identities, particularly in terms of gender, race, and class, are portrayed across a curated corpus of canonical and contemporary American literary texts. The dataset is compiled and preprocessed through tokenization and normalization to prepare the texts for contextual embedding extraction and bias analysis. Bias detection is conducted using a Bidirectional Encoder Representations-based Weighted Support Vector Machine (BERWSVM) model designed to classify complex social representations. The Contextualized Embedding Association Test (CEAT) is employed to statistically evaluate the strength of association between social groups and character traits by computing cosine distances between contextual embeddings. Bidirectional Encoder Representations from Transformers (BERT) is used to extract rich semantic representations from the texts, capturing character descriptions, group identity references, and associated traits. The WSVM component classifies intersectional group embeddings, enabling the assessment of representational patterns that extend beyond single-identity categorizations. Implemented in Python, the BERWSVM approach outperforms multimodal baseline architectures, with accuracy, F1-score, recall, and precision ranging from 90% to 95%, and achieves high accuracy in distinguishing characters belonging to intersectional groups, significantly outperforming traditional baseline models.
The study demonstrates the effectiveness of integrating computational bias detection algorithms with literary interpretation in analyzing social ideologies, representation, diversity, and fairness in narrative structures.
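The CEAT step described above reduces to comparing cosine similarities between a trait embedding and two groups of contextual embeddings. A simplified effect-size sketch follows; the vectors are toy inputs and this illustrates the general association-test idea, not the paper's BERWSVM implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u)) *
           math.sqrt(sum(b * b for b in v)))
    return num / den

def ceat_effect_size(trait, group_a, group_b):
    """Standardized difference in mean cosine similarity of a trait
    embedding with two groups of contextual embeddings (CEAT-style)."""
    sims_a = [cosine(trait, e) for e in group_a]
    sims_b = [cosine(trait, e) for e in group_b]
    pooled = sims_a + sims_b
    mean = sum(pooled) / len(pooled)
    sd = math.sqrt(sum((s - mean) ** 2 for s in pooled) / (len(pooled) - 1))
    return (sum(sims_a) / len(sims_a) - sum(sims_b) / len(sims_b)) / sd

# Toy 2-d "embeddings": group_a points toward the trait, group_b away
effect = ceat_effect_size([1.0, 0.0],
                          [[1.0, 0.0], [1.0, 0.1]],
                          [[0.0, 1.0], [0.1, 1.0]])
```

A positive effect size indicates the trait is more strongly associated with group A than group B; significance would then be assessed over many resampled sentence contexts.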
- New
- Research Article
- 10.1016/j.ipm.2025.104244
- Nov 1, 2025
- Information Processing & Management
- Timo Spinde + 5 more
Enhancing media literacy: The effectiveness of (Human) annotations and bias visualizations on bias detection
- New
- Research Article
- 10.1109/tpami.2025.3592901
- Nov 1, 2025
- IEEE transactions on pattern analysis and machine intelligence
- Moreno D'Inca + 5 more
Recent progress in Text-to-Image (T2I) generative models has enabled high-quality image generation. As performance and accessibility increase, these models are gaining significant traction and popularity: ensuring their fairness and safety is a priority to prevent the dissemination and perpetuation of biases. However, existing studies in bias detection focus on closed sets of predefined biases (e.g., gender, ethnicity). In this paper, we propose a general framework to identify, quantify, and explain biases in an open-set setting, i.e., without requiring a predefined set. This pipeline leverages a Large Language Model (LLM) to propose biases starting from a set of captions. Next, these captions are used by the target generative model for generating a set of images. Finally, Vision Question Answering (VQA) is leveraged for bias evaluation. We show two variations of this framework: OpenBias and GradBias. OpenBias detects and quantifies biases, while GradBias determines the contribution of individual prompt words on biases. OpenBias effectively detects both well-known and novel biases related to people, objects, and animals and highly aligns with existing closed-set bias detection methods and human judgment. GradBias shows that neutral words can significantly influence biases and it outperforms several baselines, including state-of-the-art foundation models.
- New
- Research Article
- 10.1016/j.jmpt.2025.10.023
- Nov 1, 2025
- Journal of manipulative and physiological therapeutics
- Cristiano Carvalho + 5 more
Effect of Aerobic Exercise on Fatigue and Quality of Life in Individuals With Fibromyalgia: A Systematic Review of Randomized Controlled Trials.
- New
- Research Article
- 10.1016/j.dld.2025.08.008
- Nov 1, 2025
- Digestive and liver disease : official journal of the Italian Society of Gastroenterology and the Italian Association for the Study of the Liver
- Alessandro Vitale + 6 more
Liver transplantation for intestinal malignancies.
- New
- Research Article
- 10.1111/ene.70396
- Nov 1, 2025
- European journal of neurology
- Ioannis N Petropoulos + 6 more
Peripheral neuropathy (PN) may be diagnosed late or may remain undiagnosed. Studies have shown that measurement of corneal nerve fiber length (CNFL) using corneal confocal microscopy (CCM) may have diagnostic utility in diabetic and other peripheral neuropathies. The main databases [CENTRAL, Embase (Ovid), and PubMed] were searched for peer-reviewed literature. Gray literature searching was undertaken using the ProQuest Dissertations & Theses database. The updated Preferred Reporting Items for Systematic Reviews and Meta-Analysis guidelines were used by two authors independently for screening and data extraction. The primary outcome of CNFL was represented by the standardized mean difference and 95% confidence interval, and differences between healthy controls (HC), patients with sub-clinical (PN-), and clinical (PN+) peripheral neuropathy were assessed. Sensitivity analysis was performed to assess the risk of bias. We identified n = 52 eligible studies (n = 2995 participants) reporting CNFL across 34 different conditions associated with PN. CNFL was significantly lower in PN+ patients compared to HC (standardized mean difference -1.12, 95% confidence interval -1.32 to -0.33, p < 0.00001), in PN- patients compared to HC (-0.78, -0.99 to -0.57, p < 0.00001), and in PN+ compared to PN- patients (-0.93, -1.53 to -0.33, p = 0.002). The results remained significant following sensitivity analysis to adjust for the risk of potential detection bias. The results remained significant independent of the choice of a random or fixed effects model. This systematic review and meta-analysis shows that CNFL has utility in the diagnosis of peripheral neuropathies.
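Each study's standardized mean difference pooled above is a Cohen's-d-style contrast of CNFL between groups. A generic sketch with a normal-approximation 95% confidence interval follows; the data are toy values and this illustrates the SMD construction, not the review's meta-analysis code:

```python
import math

def standardized_mean_difference(group1, group2):
    """Cohen's d with pooled SD, plus a normal-approximation 95% CI."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    # Standard error of d (large-sample approximation)
    se = math.sqrt((n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2)))
    return d, (d - 1.96 * se, d + 1.96 * se)

# Toy CNFL values (mm/mm^2): patients vs. healthy controls
d, ci = standardized_mean_difference([14.1, 12.8, 13.5, 11.9],
                                     [18.2, 17.5, 19.0, 18.8])
```

A negative d here matches the review's direction of effect (lower CNFL in neuropathy); per-study SMDs would then be combined under a fixed- or random-effects model.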
- New
- Research Article
- 10.1016/j.ajog.2025.04.052
- Nov 1, 2025
- American journal of obstetrics and gynecology
- Balázs Vida + 8 more
Assessing the comparative efficacy of sentinel lymph node detection techniques in vulvar cancer: a systematic review and meta-analysis.
- New
- Research Article
- 10.1177/15473287251388411
- Oct 30, 2025
- Stem cells and development
- Christophe Desterke + 5 more
Mesenchymal stromal cells (MSCs) are currently used in clinical practice as a therapeutic agent for immunomodulation and tissue repair. They are found in all supporting tissues, including perinatal tissues such as umbilical cord and amniotic membranes (amnion and chorion). Perinatal tissues have attracted interest due to their availability, minimal ethical and legal concerns, and high banking potential for allogeneic applications. Many studies have compared the efficacy of MSCs from different sources, without reaching a consensus on the most effective to use in a given clinical situation. This study compared the transcriptomic signatures of MSCs derived from adult bone marrow (BM-MSCs)-the reference source most widely used in clinical trials-with those of perinatal MSCs (P-MSCs). Our data were analyzed jointly with three independent transcriptome datasets. Unsupervised principal component analysis revealed a major stratification according to tissue origin, accounting for 16.6% of the total transcriptomic variance, without any detectable bias from batch effects or cell culture procedures. Supervised differentially expressed gene analysis between BM and perinatal samples revealed 819 differentially expressed genes. Gene Set Enrichment Analysis highlighted that adult BM-MSCs are implicated in adipogenesis and osteoblast differentiation, whereas P-MSCs upregulated gene sets implicated in cell cycle regulation, functions classically described in the literature. Among the different sources of variability, we showed that perinatal tissues have a strongly distinct transcriptional signature compared with adult BM, independent of the production center or the culture conditions used. The in-depth study of transcript profiles therefore seems to remain a valuable and robust characterization tool for cell therapy banking.
- New
- Research Article
- 10.1017/s1047951125110147
- Oct 29, 2025
- Cardiology in the young
- Anna M Dehn + 9 more
Atrial septal defect is commonly considered a minor CHD, but morbidity and mortality are higher compared to the background population. Maternal pre-eclampsia is associated with CHD in the offspring in large registry-based studies. However, the association between pre-eclampsia and atrial septal defects might be subject to detection bias, as many atrial septal defects are asymptomatic or might remain undiagnosed until late in life. We investigated the association between maternal pre-eclampsia and the risk of atrial septal defects in a population-based cohort of neonates examined with echocardiography. Neonates included in the Copenhagen Baby Heart Study, who were examined using transthoracic echocardiography within 30 days of birth, were systematically assessed for atrial septal defects and patent foramen ovale using a standardised algorithm. Using log-linear binomial regression and polytomous logistic regression, we compared the risk of atrial septal defects in neonates exposed to maternal pre-eclampsia with the risk in neonates not exposed to pre-eclampsia. Our study cohort included 12,354 neonates (mean age, 11 days), including 462 exposed to maternal pre-eclampsia. Atrial septal defect was found in 5.9% (n = 732) of the study cohort and compared with unexposed neonates, neonates exposed to maternal pre-eclampsia had a modestly increased risk of atrial septal defects (adjusted risk ratio 1.19, 95% confidence interval 0.83-1.64). Estimates were robust to various exclusions in sensitivity analyses. There appears to be an association between maternal pre-eclampsia and atrial septal defect in the neonate in a population-based cohort of neonates.
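The adjusted risk ratio above comes from log-linear binomial regression; its unadjusted analogue can be sketched from a 2×2 table with a log-normal confidence interval. The counts below are hypothetical and this is a simplified illustration, not the study's regression model:

```python
import math

def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Unadjusted risk ratio with a 95% CI on the log scale
    (standard large-sample approximation)."""
    r1 = exposed_cases / exposed_total
    r0 = unexposed_cases / unexposed_total
    rr = r1 / r0
    se_log = math.sqrt(1 / exposed_cases - 1 / exposed_total
                       + 1 / unexposed_cases - 1 / unexposed_total)
    lo = math.exp(math.log(rr) - 1.96 * se_log)
    hi = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lo, hi

# Hypothetical cohort: 30/462 outcomes among exposed, 702/11892 among unexposed
rr, lo, hi = risk_ratio(30, 462, 702, 11892)
```

When the interval spans 1 (as in the abstract's 0.83-1.64), the point estimate suggests an association but it is not statistically significant at the 5% level.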
- New
- Research Article
- 10.3390/forensicsci5040053
- Oct 27, 2025
- Forensic Sciences
- Ido Hefetz
Background: Artificial intelligence is transforming forensic fingerprint analysis by introducing probabilistic demographic inference alongside traditional pattern matching. This study explores how AI integration reshapes the role of forensic experts from interpreters of physical traces to epistemic corridors who validate algorithmic outputs and translate them into legally admissible evidence. Methods: A conceptual proof-of-concept exercise compares traditional AFIS-based workflows with AI-enhanced predictive models in a simulated burglary scenario involving partial latent fingermarks. The hypothetical design, which does not rely on empirical validation, illustrates the methodological contrasts between physical and algorithmic inference. Results: The comparison demonstrates how AI-based demographic classification can generate investigative leads when conventional matching fails. It also highlights the evolving responsibilities of forensic experts, who must acquire competencies in statistical validation, bias detection, and explainability while preserving traditional pattern-recognition expertise. Conclusions: AI should augment rather than replace expert judgment. Forensic practitioners must act as critical mediators between computational inference and courtroom testimony, ensuring that algorithmic evidence meets legal standards of transparency, contestability, and scientific rigor. The paper concludes with recommendations for validation protocols, cross-laboratory benchmarking, and structured training curricula to prepare experts for this transformed epistemic landscape.
- New
- Research Article
- 10.54536/ajise.v4i3.5041
- Oct 25, 2025
- American Journal of Innovation in Science and Engineering
- Ifeoma Eleweke + 5 more
Cloud computing has become a cornerstone of modern IT infrastructure, offering scalability and efficiency but also exposing organizations to evolving cyber threats such as data breaches, insider threats, and advanced persistent threats (APTs). Traditional security mechanisms struggle to address these dynamic challenges, necessitating the integration of AI-driven threat detection and prevention strategies. This conceptual paper explores the comparative effectiveness of supervised learning, unsupervised learning, reinforcement learning, and hybrid AI models in cloud security. Supervised learning excels in identifying known attack patterns, while unsupervised learning is crucial for detecting zero-day threats and anomalies. Reinforcement learning enables self-adaptive security measures, and hybrid models offer a comprehensive, multi-layered approach to cloud security. However, AI-driven cybersecurity faces significant challenges, including data privacy risks, bias in threat detection, adversarial AI attacks, and lack of model interpretability. Emerging AI trends such as federated learning, quantum security, and explainable AI (XAI) are shaping the future of cloud security, while regulatory frameworks like GDPR, NIST AI Risk Management, and the EU AI Act play a crucial role in standardizing ethical AI use. This study provides insights into the strengths, weaknesses, and future directions of AI-driven cloud security, offering recommendations for researchers, policymakers, and cybersecurity practitioners to enhance AI resilience against emerging threats.
- New
- Research Article
- 10.3390/rs17213534
- Oct 25, 2025
- Remote Sensing
- Chenxi Zhao + 7 more
This research evaluates the performance of the Final Run remote sensing precipitation products from the Integrated Multi-satellite Retrievals for GPM (IMERG-F) in complex terrain river basins (2014–2023). Utilizing decade-long daily precipitation data from 2415 manned national-level ground stations, the evaluation employs eight statistical metrics—probability of detection, false alarm ratio, accuracy, critical success index, Pearson correlation coefficient (PCC), root mean square difference, mean difference, and relative difference—to analyze detection accuracy, correlation, and bias on daily, monthly, and annual scales. The main findings include the following: (1) IMERG-F’s daily precipitation detection capability follows a three-tier spatial pattern (northwest to southeast), aligning with the stepped terrain of China. (2) Stronger correlations (PCC = 0.7–0.9) with gauge data emerge in southeastern regions despite higher biases, while northwestern areas show weaker correlations but fewer deviations. (3) IMERG-F overestimates annual rainy days, but slightly underestimates precipitation intensity compared with ground observations. (4) Annual precipitation estimates exceed gauge measurements, particularly in the Songhua and Liao River Basins (18–20% overestimation). Monthly analysis shows fewer errors during rainy seasons versus winter dry periods, with pronounced seasonal variations in northwestern basins. These findings emphasize the need for terrain-aware calibration to improve satellite precipitation monitoring in hydrologically diverse basins, particularly addressing seasonal and spatial error patterns in water resource management applications in northern China.
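The detection metrics above derive from a daily rain/no-rain contingency table of satellite versus gauge observations. A minimal sketch of the standard definitions follows (counts hypothetical, not the study's data):

```python
def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """Standard categorical verification scores from a 2x2
    rain/no-rain contingency table (satellite vs. gauge)."""
    pod = hits / (hits + misses)                    # probability of detection
    far = false_alarms / (hits + false_alarms)      # false alarm ratio
    csi = hits / (hits + misses + false_alarms)     # critical success index
    acc = ((hits + correct_negatives) /
           (hits + false_alarms + misses + correct_negatives))  # accuracy
    return pod, far, csi, acc

# Hypothetical decade of daily comparisons at one gauge
pod, far, csi, acc = categorical_scores(hits=900, false_alarms=250,
                                        misses=180, correct_negatives=2320)
```

POD and CSI improve toward 1 while FAR improves toward 0, which is why the three are reported together: a product that overestimates rainy days can score a high POD while its FAR reveals the cost in false alarms.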
- New
- Research Article
- 10.1126/sciadv.adz7312
- Oct 24, 2025
- Science Advances
- Tommaso Alfonsi + 3 more
Spillovers of zoonotic Influenza A viruses (IAVs) into farmed animals and humans have the potential to trigger epidemics or even global pandemics. We introduce FluWarning, a highly efficient and elegant computational method based on anomaly detection of codon bias and dinucleotide composition for early identification of divergent viral HA segments. We applied FluWarning to the 2009 influenza pandemic as a test case. FluWarning successfully identified the emergence of pdm09, the virus that caused the pandemic, with warnings preceding the observed global spread. Applied to H5N1 specimens collected between 2019 and 2025, FluWarning flagged genotypes D1.1 and B3.13, both associated with recent spillovers in dairy cows in the United States. In summary, FluWarning is an effective, lightweight, multiscale warning system for IAVs, detecting spillovers with few available sequences.
- New
- Research Article
- 10.1080/19312458.2025.2575468
- Oct 24, 2025
- Communication Methods and Measures
- Valerie Hase + 2 more
ABSTRACT Computational Social Science (CSS) increasingly engages in critical discussions about bias in and through computational methods. Two developments drive this shift: first, the recognition of bias as a societal problem, as flawed CSS methods in socio-technical systems can perpetuate structural inequalities; and second, the field’s growing methodological resources, which create not only the opportunity but also the responsibility to confront bias. In this editorial to our Special Issue on CSS and bias, we introduce the contributions and outline a research agenda. In defining bias, we emphasize the importance of embracing epistemological pluralism while balancing the need for standardization with methodological diversity. Detecting bias requires stronger integration of bias detection into validation procedures and the establishment of shared metrics and thresholds across studies. Finally, addressing bias involves adapting established and emerging error-correction strategies from social science traditions to CSS, as well as leveraging bias as an analytical resource for revealing structural inequalities in society. Moving forward, progress in defining, detecting, and addressing bias will require both bottom-up engagement by researchers and top-down institutional support. This Special Issue positions bias as a central theme in CSS – one that the field now has both the tools and the obligation to address.
- New
- Research Article
- 10.3343/alm.2025.0003
- Oct 23, 2025
- Annals of laboratory medicine
- Hyojin Kim + 9 more
Alpha-fetoprotein (AFP) and its isoform AFP-L3 are well-established serum biomarkers for hepatocellular carcinoma (HCC), a common malignancy and a leading cause of cancer-related mortality worldwide. Current methods for measuring these biomarkers are primarily lectin-based assays including the liquid-phase binding assay (LiBA) and liquid chromatography-tandem mass spectrometry (LC-MS/MS), both of which have limitations in diagnostic sensitivity and clinical utility for samples with low AFP concentrations. We aimed to develop a lectin-independent LC-MS/MS method for quantifying fucosylated AFP proteins (AFP-Fuc%). We conducted analytical validation, including method comparisons, over 2 months. The analytical sensitivity and diagnostic performance of this method were evaluated using 525 human serum samples (235 from HCC patients and 290 from non-HCC individuals) and compared with those of LiBA, which measured AFP-L3 levels. The LC-MS/MS method demonstrated acceptable within-laboratory imprecision (CVs < 17.1%) without detectable bias, carryover, or matrix effects. Our method exhibited a broader linear dynamic range (spanning five orders of magnitude) and 10-fold higher analytical sensitivity than LiBA. The diagnostic performance of our method was significantly superior to that of LiBA, particularly in patients with low AFP concentrations (<7 ng/mL, P < 0.001), with improved accuracy, sensitivity, and precision at a specificity of 96.2%. The validated LC-MS/MS method demonstrated robust analytical performance and superior diagnostic accuracy over LiBA for HCC diagnosis while avoiding the inherent limitations of lectin-based assays. Our LC-MS/MS assay shows promise for early HCC detection and may contribute to enhanced patient care.