Single-cell RNA sequencing (scRNA-seq) analysis can reveal cell heterogeneity and diversity, making biological research at the level of individual cells possible. Clustering cells by type is one of the main goals of scRNA-seq analysis. However, existing methods lack a flexible aggregation mechanism that adaptively fuses node attribute features with graph structural features. They also ignore the multi-scale information embedded in different network layers. Here, we propose AGAC (Attention-driven Graph Attentional Clustering), a unified computational framework that applies hierarchical feature aggregation with mixed attention mechanisms to scRNA-seq data clustering. First, AGAC learns the attribute features of cells with a graph attention autoencoder and the structural features of the graph among cells with a graph neural network (GNN). Second, AGAC uses a heterogeneous intelligent fusion module to fuse these two kinds of features dynamically, and a multi-scale intelligent fusion module to adaptively aggregate the multi-scale information embedded in different layers of the GNN. In addition, AGAC adopts a dual self-supervision module for end-to-end training and synchronous optimization. Experimental results on 13 real scRNA-seq datasets demonstrate that AGAC is more effective than existing baseline methods because it comprehensively exploits the discriminative information in the network and generates clustering results directly. The source code of AGAC is available at https://github.com/zzyqh/AGAC.
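
To make the idea of dynamically fusing two kinds of features concrete, the following is a minimal NumPy sketch of one common form of attention-based fusion. It is not taken from the AGAC source code: the function name attention_fuse, the projection matrix w, the LeakyReLU slope of 0.2, and the random inputs are all assumptions made for illustration. For each cell, two attention weights that sum to one are computed from the concatenated attribute and structural features, and the fused representation is their weighted sum.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def attention_fuse(h_attr, h_struct, w, slope=0.2):
    """Fuse two (n_cells, d) feature matrices with per-cell attention weights.

    h_attr   : attribute features (hypothetical autoencoder output)
    h_struct : structural features (hypothetical GNN output)
    w        : (2 * d, 2) projection; learnable in a real model, random here
    """
    concat = np.concatenate([h_attr, h_struct], axis=1)  # (n_cells, 2d)
    scores = concat @ w                                   # (n_cells, 2)
    scores = np.where(scores > 0, scores, slope * scores)  # LeakyReLU
    alpha = softmax(scores, axis=1)                       # each row sums to 1
    fused = alpha[:, :1] * h_attr + alpha[:, 1:] * h_struct
    return fused, alpha


# Toy inputs (assumed sizes): 5 cells, 4-dimensional embeddings.
rng = np.random.default_rng(0)
n_cells, d = 5, 4
h_attr = rng.normal(size=(n_cells, d))
h_struct = rng.normal(size=(n_cells, d))
w = rng.normal(size=(2 * d, 2))

fused, alpha = attention_fuse(h_attr, h_struct, w)
```

Because the two weights per cell are non-negative and sum to one, every entry of the fused matrix lies between the corresponding entries of the two inputs. In a trained model the projection would be optimized jointly with the rest of the network, so cells could lean more heavily on whichever feature type is more informative for them.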