Accurate traffic prediction is crucial to alleviating traffic congestion in cities. Existing physical sensor-based traffic data acquisition methods have high transmission costs, serious traffic information redundancy, and large calculation volumes for spatiotemporal data processing, thus making it difficult to ensure accuracy and real-time traffic prediction. With the increasing resolution of UAV imagery, the use of unmanned aerial vehicles (UAV) imagery to obtain traffic information has become a hot spot. Still, analyzing and predicting traffic status after extracting traffic information is neglected. We develop a framework for traffic speed extraction and prediction based on UAV imagery processing, which consists of two parts: a traffic information extraction module based on UAV imagery recognition and a traffic speed prediction module based on deep learning. First, we use deep learning methods to automate the extraction of road information, implement vehicle recognition using convolutional neural networks and calculate the average speed of road sections based on panchromatic and multispectral image matching to construct a traffic prediction dataset. Then, we propose an attention-enhanced traffic speed prediction module that considers the spatiotemporal characteristics of traffic data and increases the weights of key roads by extracting important fine-grained spatiotemporal features twice to improve the prediction accuracy of the target roads. Finally, we validate the effectiveness of the proposed method on real data. Compared with the baseline algorithm, our algorithm achieves the best prediction performance regarding accuracy and stability.