The temporal credit assignment (TCA) problem, which aims to detect predictive features hidden in distracting background streams, remains a core challenge in biological and machine learning. Aggregate-label (AL) learning is proposed by researchers to resolve this problem by matching spikes with delayed feedback. However, the existing AL learning algorithms only consider the information of a single timestep, which is inconsistent with the real situation. Meanwhile, there is no quantitative evaluation method for TCA problems. To address these limitations, we propose a novel attention-based TCA (ATCA) algorithm and a minimum editing distance (MED)-based quantitative evaluation method. Specifically, we define a loss function based on the attention mechanism to deal with the information contained within the spike clusters and use MED to evaluate the similarity between the spike train and the target clue flow. Experimental results on musical instrument recognition (MedleyDB), speech recognition (TIDIGITS), and gesture recognition (DVS128-Gesture) show that the ATCA algorithm can reach the state-of-the-art (SOTA) level compared with other AL learning algorithms.