Proteins are responsible for most biological functions, many of which require the interaction of more than one protein molecule. However, accurately predicting protein-protein interaction (PPI) sites, the interfacial residues of a protein that interact with other protein molecules, remains a challenge. The growing demand for reliable identification of PPI sites, and the cost of identifying them with conventional experimental methods, call for computational tools for automated prediction and understanding of PPIs. We present Pair-EGRET, an edge-aggregated graph attention network that leverages features extracted from pre-trained transformer-like models to accurately predict PPI sites. Pair-EGRET operates on a k-nearest neighbor graph that represents the three-dimensional structure of a protein, and uses a cross-attention mechanism to identify the interfacial residues of a pair of proteins. Through an extensive evaluation using a diverse array of experimental data, evaluation metrics, and case studies on representative protein sequences, we demonstrate that Pair-EGRET achieves remarkable performance in predicting PPI sites. Moreover, the cross-attention matrix learned by Pair-EGRET provides interpretable insights. Pair-EGRET is freely available as open source on GitHub at https://github.com/1705004/Pair-EGRET. Supplementary data are available at Bioinformatics online.
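The two structural ideas named above, a k-nearest neighbor graph over residues and cross-attention between a pair of proteins, can be illustrated with a minimal sketch. This is not the Pair-EGRET implementation: the function names (`knn_graph`, `cross_attention_weights`), the toy coordinates, and the toy feature vectors are all hypothetical, and the attention shown is plain scaled dot-product attention without the learned projections or edge aggregation a trained model would use.

```python
import math


def knn_graph(coords, k):
    """Map each residue index to the indices of its k nearest residues.

    coords: list of (x, y, z) tuples, one per residue (assumed input format).
    Ties in distance are broken by the lower residue index.
    """
    neighbors = {}
    for i, ci in enumerate(coords):
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        neighbors[i] = [j for _, j in dists[:k]]
    return neighbors


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_weights(feats_a, feats_b):
    """Scaled dot-product attention weights from protein A to protein B.

    Row i holds how strongly residue i of A attends to each residue of B.
    Raw features serve as queries and keys here; a real model would first
    apply learned linear projections (omitted in this sketch).
    """
    scale = math.sqrt(len(feats_a[0]))
    return [
        softmax([sum(q * kx for q, kx in zip(qv, kv)) / scale for kv in feats_b])
        for qv in feats_a
    ]


# Toy data (hypothetical): four residues on a line, and 2-d residue features.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
toy_graph = knn_graph(toy_coords, k=2)

toy_feats_a = [[1.0, 0.0], [0.0, 1.0]]
toy_feats_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_weights = cross_attention_weights(toy_feats_a, toy_feats_b)
```

In this sketch `toy_weights` has one row per residue of the first protein and one column per residue of the second, which is the shape of matrix one would inspect when looking for residue pairs with high mutual attention.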