Abstract

Introduction and purpose
Atrial fibrillation (AF) is the most common arrhythmia worldwide, with considerable prevalence, high morbidity and mortality, and a high financial cost in Europe. To optimise the quality of care received by patients with AF, we need an accurate picture of their demographic and clinical typology and of the actual patient journey, which requires reviewing a large amount of data from a large number of patients. The CHA2DS2-VASc score classifies the risk of stroke, one of the most critical complications of this arrhythmia, in patients with AF and assists decision-making. As part of Big Data, data mining, process mining and business intelligence techniques can analyse a high volume of clinical and non-clinical data.

Methods
Big Data pre-processing, data mining, process mining and business intelligence techniques were applied. Databases storing clinical and administrative information related to hospital discharges, highly complex procedures, emergency care and specialised practices from 2016 to 2020 were used. Patients with a principal or secondary diagnosis of AF were selected. Data sources included the Basic Minimum Set of Data (BMSD), containing administrative and clinical information at hospital discharge and resources at the Cardiology outpatient clinic. Once noise, inconsistencies, anomalies and duplicates had been removed from the databases, they were simultaneously reduced into smaller datasets. The CHA2DS2-VASc score at the time of the patient's first contact with the system was calculated using BMSD information. Data integration techniques were applied to combine the extracted datasets into a single data source. Data transformation and reduction techniques were then used to generate a global dataset for exploitation with process mining and business intelligence tools.

Results
After analysing 10,942 individual BMSD records, the CHA2DS2-VASc score was calculated for 6,870 unique patients from 2016 to 2020.
The most prevalent score was 5 (36.07%) (Figure 1); 4.19% of patients had a score of 1, and the first quartile was a score of 4. Nearly half of the patients (49.6%) were women. A large proportion of patients (69.46%) were aged ≥75 years (Figure 2). Diabetes mellitus was present in 25.21% of patients, high blood pressure in 64.76% and previous stroke in 8.59%; none had a history of arterial disease.

Conclusions
Patients with atrial fibrillation treated in a tertiary university institution show a very high risk of stroke according to the CHA2DS2-VASc score. Almost 70% of them are aged 75 years or over. Data mining and business intelligence make it possible to analyse a large amount of data from a high volume of patients.

Funding Acknowledgement
Type of funding sources: Private grant(s) and/or Sponsorship. Main funding source(s): BMS-PfizerBoehringer
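
The score calculation referred to in the Methods can be illustrated with a minimal sketch. It applies the standard CHA2DS2-VASc point weights (1 point each for congestive heart failure, hypertension, diabetes, vascular disease, age 65 to 74 and female sex; 2 points each for age ≥75 and prior stroke/TIA/thromboembolism). The `PatientRecord` class, its field names and the `cha2ds2_vasc` function are hypothetical and are not taken from the study; how each flag would be derived from BMSD diagnosis codes is not shown and is not described in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical per-patient summary at first contact (field names are assumptions)."""
    age: int
    female: bool
    heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    prior_stroke_tia: bool = False
    vascular_disease: bool = False


def cha2ds2_vasc(p: PatientRecord) -> int:
    """Return the CHA2DS2-VASc score (0 to 9) using the standard point weights."""
    score = 0
    score += 1 if p.heart_failure else 0      # C: congestive heart failure
    score += 1 if p.hypertension else 0       # H: hypertension
    score += 1 if p.diabetes else 0           # D: diabetes mellitus
    score += 2 if p.prior_stroke_tia else 0   # S2: prior stroke/TIA/thromboembolism
    score += 1 if p.vascular_disease else 0   # V: vascular disease
    score += 1 if p.female else 0             # Sc: sex category (female)
    # Age bands are mutually exclusive: A2 (>=75) or A (65 to 74).
    if p.age >= 75:
        score += 2
    elif p.age >= 65:
        score += 1
    return score


if __name__ == "__main__":
    example = PatientRecord(age=80, female=True, hypertension=True)
    print(cha2ds2_vasc(example))
```

In a pipeline like the one described, a function of this kind would be applied once per unique patient after deduplication, using the record from the patient's first contact with the system.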