The presence of naturally occurring contaminants in groundwater is a public health concern in rural areas of northeastern North America, where public and private wells are important sources of drinking water. In southern Quebec (Canada), inorganic groundwater chemistry data have recently been collected following standard procedures in several regional hydrogeological projects implemented by the government of Quebec. In this study, a groundwater chemistry database was compiled from 16 regional projects that together cover an area of approximately 100,000 km². The database includes information on water supply infrastructure, geological settings, hydrogeological conditions and inorganic water chemistry for 2369 water samples. Samples were mostly collected from private domestic wells, and to a lesser extent from municipal and observation wells. The data revealed that fluoride, barium, manganese and arsenic are the elements that most commonly exceed Canadian drinking water guidelines. Exploratory data analysis techniques were applied to selected subsets of data to gain insight into the sources and distribution of these hazardous groundwater contaminants. These exploratory methods include graphical data analysis (maps, Piper diagrams and empirical cumulative distribution function plots), multivariate compositional data analysis (clustering and correlation analysis) and geochemical modeling (saturation index calculations). The results suggest that fluoride, barium, manganese and arsenic are all derived from natural sources. Elevated fluoride concentrations are mainly associated with dilute Ca–Na–HCO3 bedrock groundwaters from granitic areas (Grenville Province), and more geochemically evolved Na–HCO3 to Na–HCO3–Cl bedrock groundwaters from shale areas (St. Lawrence Platform). High-F groundwaters are generally characterized by low Ca concentrations (<30 mg/L) and alkaline pH (pH > 8), suggesting that F is mainly controlled by fluorite (CaF2) precipitation and anion exchange with OH−.
Barium is present at elevated concentrations in mineralized Ca–Na–HCO3, Na–HCO3 to Na–HCO3–Cl waters from bedrock aquifers of the St. Lawrence Lowlands. These groundwaters are mainly chemically evolved, strongly reducing waters occurring in confined aquifers and near major faults, which appear to correspond to discharge areas for deep regional flow. High Ba concentrations are generally associated with very low SO4 concentrations (<5 mg/L) resulting from sulfate reduction, suggesting a solubility control of Ba through barite (BaSO4) precipitation. Most high manganese concentrations occur in less chemically evolved, near-neutral Ca–HCO3 groundwaters from both granular and bedrock aquifers, particularly those associated with metasedimentary and metavolcanic lithologies (Superior Province, St. Lawrence Platform and Appalachian Province). The results suggest that dissolved Mn concentrations are limited by the precipitation of Mn carbonates under alkaline conditions, but increase under reducing conditions owing to the dissolution of Fe–Mn oxyhydroxides. Elevated arsenic concentrations were mostly found in Ca-(Na–Mg)–HCO3 bedrock groundwaters of the Superior and Appalachian Provinces. As-rich groundwaters are associated with the presence of As-bearing sulfides in weakly metamorphosed sedimentary rocks (shale, slate, phyllite) and hydrothermally altered rocks. Most high As concentrations do not appear to be directly derived from sulfide oxidation, but rather from secondary sources, in particular through the reductive dissolution of As-rich Fe–Mn oxyhydroxides. This work shows that the combination of graphical, multivariate statistical and geochemical modeling techniques is a powerful approach to explore large hydrochemical datasets. It also reveals the benefits of using multivariate compositional data analysis instead of classical approaches based on raw or log-transformed data for groundwater chemistry data.
Compositional data analysis techniques improved the detection of multivariate outliers and the reliability of the clusters obtained by cluster analysis, and they removed the spurious positive correlation between hydrochemical variables associated with total mineralization.
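
As an illustration of why a log-ratio representation is insensitive to total mineralization, the following minimal sketch applies a centered log-ratio (clr) transformation to one sample and to a tenfold dilution of it. The choice of the clr transformation, the function name `clr` and the concentration values are assumptions made for illustration only; the text above does not specify which log-ratio transformation was used.

```python
import math

def clr(composition):
    """Centered log-ratio transform of one sample.

    Each part is expressed as the log of its ratio to the geometric
    mean of all parts. All parts must be strictly positive.
    """
    if any(x <= 0 for x in composition):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

# Hypothetical concentrations in mg/L (illustrative values, not from the study),
# ordered for example as Ca, Na, HCO3, Cl.
sample = [40.0, 10.0, 120.0, 5.0]
# The same water diluted tenfold: same proportions, lower total mineralization.
diluted = [x / 10.0 for x in sample]

clr_sample = clr(sample)
clr_diluted = clr(diluted)

print([round(v, 3) for v in clr_sample])
print([round(v, 3) for v in clr_diluted])
```

Both samples yield the same clr coordinates because only the ratios between parts are retained. In raw concentrations, by contrast, all parts of the diluted sample are lower together, which is the kind of shared dependence on total mineralization that can produce spurious positive correlations.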