When used with computer-aided arrhythmia diagnosis tools, the electrocardiogram (ECG) signal plays a vital role in lowering the fatality rate associated with cardiovascular diseases (CVDs) and in informing the specialist about the patient’s cardiac health. Recent deep-learning approaches to multivariate time series analysis, such as ECG classification, include LSTM, Bi-LSTM, CNN with Bi-LSTM, and other sequential networks. However, these networks often struggle to capture long-range dependencies among data instances and can suffer from vanishing or exploding gradients on longer sequences. To address these shortcomings of sequential models, a hybrid arrhythmia classification system that combines recurrence with a self-attention mechanism is developed. The system uses convolutional layers for representation learning, designed to capture the salient features of raw ECG data. The resulting latent embedding is then fed to a self-attention-assisted transformer encoder. Because ECG data are strongly influenced by the absolute order, position, and proximity of time steps, owing to interdependent relationships among immediate neighbors, a recurrent component based on Bi-LSTM is added to the encoder to account for this characteristic. The model achieved a classification accuracy and an F1-score of 99.2%. This indicates that combining recurrence with a self-attention-assisted architecture improves the classification of arrhythmia from raw ECG signals compared with state-of-the-art models.
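
To illustrate why self-attention helps with long-range dependencies, and why it still benefits from an order-aware recurrent component, the following is a minimal pure-Python sketch of scaled dot-product self-attention. It is not the model described above: it assumes queries, keys, and values are all equal to the input (no learned projections, no multiple heads), and the toy `embedded` sequence and the function names are hypothetical stand-ins for the convolutional embedding of an ECG segment.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    Assumption for illustration only: a real transformer encoder applies
    learned linear projections to obtain Q, K, and V.
    """
    d = len(x[0])
    weights = []
    for q in x:
        # Every time step is scored against every other time step,
        # regardless of how far apart they are in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights.append(softmax(scores))
    # Each output step is a weighted average of all input steps.
    out = [[sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
           for row in weights]
    return out, weights

# Hypothetical embedded sequence: 4 time steps, 2 features each.
embedded = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
output, attn = self_attention(embedded)
```

In this toy input, time steps 0 and 2 carry identical features, so step 0 assigns them identical attention weights even though they sit at different positions. Plain self-attention therefore relates distant steps directly but is blind to their order, which is consistent with the motivation given above for adding a Bi-LSTM component to the encoder.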