Automatic pollen detection from light microscope (LM) images helps prevent pollinosis symptoms. Recently, many deep learning methods have been proposed to identify pollen grains using multi-scale feature fusion mechanisms. However, real scenarios pose two main challenges: (1) complex pollen characteristics and (2) interference from irrelevant objects. Pollen detection therefore requires not only learning the relationships among multi-scale features but also refining the feature representation. To this end, this paper proposes an attention-based multi-scale feature fusion network (AMFF-Net) for automatic pollen detection on real-world LM images. The proposed AMFF-Net comprises three modules. The feature extraction module uses series-connection attention to capture the spatial and channel dependencies of feature maps at different levels (addressing challenge 1). In the feature fusion module, parallel-connection attention learns a more discriminative feature representation under bidirectional pathway guidance (addressing challenge 2). Both attention mechanisms are jointly adopted to enhance the representational capacity of the final results in the pollen prediction module. Extensive experiments on the real-world RPD dataset show that AMFF-Net achieves the best performance (83.9% mean average precision) compared with other state-of-the-art methods. We believe that this work can serve as an important reference for the development of pollen monitoring systems in real scenarios.
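To make the distinction between series-connection and parallel-connection attention concrete, the following minimal sketch contrasts the two wirings on a tiny feature map stored as nested lists (channels x height x width). It is an illustration under stated assumptions, not the AMFF-Net implementation: the abstract does not specify the internal form of either attention, so the gates here are simple parameter-free pooling followed by a sigmoid, the parallel branches are merged by summation, and the bidirectional pathway guidance is not modelled. All function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_gate(fmap):
    # Assumed gate: one weight per channel from global average pooling
    # (a real network would add learned layers here).
    return [sigmoid(sum(sum(row) for row in ch) / (len(ch) * len(ch[0])))
            for ch in fmap]

def spatial_gate(fmap):
    # Assumed gate: one weight per pixel from the mean across channels.
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]

def apply_channel(fmap, gate):
    # Scale every value in a channel by that channel's weight.
    return [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, gate)]

def apply_spatial(fmap, gate):
    # Scale every value by the weight of its pixel position.
    return [[[v * gate[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(ch)] for ch in fmap]

def series_attention(fmap):
    # Series wiring: channel gate first, then a spatial gate computed
    # on the already re-weighted map.
    x = apply_channel(fmap, channel_gate(fmap))
    return apply_spatial(x, spatial_gate(x))

def parallel_attention(fmap):
    # Parallel wiring: both gates are computed from the same input and the
    # two branch outputs are summed (the merge rule is an assumption).
    a = apply_channel(fmap, channel_gate(fmap))
    b = apply_spatial(fmap, spatial_gate(fmap))
    return [[[u + v for u, v in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

# Toy 2-channel, 2x2 feature map.
feature_map = [[[1.0, 1.0], [1.0, 1.0]],
               [[1.0, 1.0], [1.0, 1.0]]]
series_out = series_attention(feature_map)
parallel_out = parallel_attention(feature_map)
```

In the series wiring the spatial gate sees the channel-refined features, so the two steps compound; in the parallel wiring each branch sees the raw input, so neither gate depends on the other.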