The unprecedented availability of Satellite Image Time Series (SITS) enables crop phenology classification and, with it, the observation and monitoring of agricultural parcels for applications of both economic and ecological importance. Classifying agricultural parcels into individual crop types relies on state-of-the-art deep learning models. However, most existing approaches are complex attention-based models that are inefficient and struggle to identify the bands that are most useful for achieving higher accuracy. We propose a Multi-Fast Channel Attention module for deep CNNs based on a Spatial Encoder (SE-MFCA) that requires only a few parameters while improving the performance-complexity trade-off. We use a spatial encoder module to process each image as an unordered set of pixels, which enhances features at coarse spatial resolution. We empirically show that appropriate parameter sharing in the cross-channel interaction can preserve performance while significantly reducing model complexity. The proposed multi-channel attention module can be implemented efficiently via an encoder-decoder network to prevent the loss of detailed spatial information. In addition, we distribute the input channels in parallel across multiple heads to recover specialized input features, which are concatenated with the residual to form a single rich feature representation. Extensive experiments on a publicly available dataset of Sentinel-2 images of agricultural parcels show that SE-MFCA is efficient and effective compared with the previous state-of-the-art time series classification algorithm. SE-MFCA achieves the highest overall accuracy (94.50%) and the highest mean intersection over union (51.92%), with the fewest trainable parameters (131 K) and fewer floating point operations (0.16 M).
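To make the parameter-sharing idea concrete, the following is a minimal sketch of one possible reading of multi-head fast channel attention, not the SE-MFCA implementation itself. It assumes an ECA-style gate (one small 1D kernel slid across the per-channel means), an equal split of channels across heads, and a literal concatenation of the attended channels with the unchanged input as the residual. The function names, the kernel size, and the list-of-pixel-sets data layout are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fast_channel_attention(features, kernel):
    """Gate channels with one 1D kernel shared by all channels (assumed ECA-style).

    features: list of C channels, each a list of pixel values (an unordered pixel set).
    kernel:   odd-length list of k weights; these are the only learnable parameters.
    """
    pad = len(kernel) // 2
    # Squeeze: one descriptor per channel (mean over its pixel set).
    pooled = [sum(ch) / len(ch) for ch in features]
    # Zero-pad so every channel has k neighbours.
    padded = [0.0] * pad + pooled + [0.0] * pad
    # Cross-channel interaction: the same k weights slide over all channels.
    gates = [
        sigmoid(sum(w * padded[c + j] for j, w in enumerate(kernel)))
        for c in range(len(pooled))
    ]
    # Excite: rescale each channel by its gate.
    return [[g * v for v in ch] for g, ch in zip(gates, features)]


def multi_head_channel_attention(features, kernels):
    """Split channels into len(kernels) heads, attend per head, then concatenate.

    Assumption: the residual is the unchanged input, appended after the
    attended channels, so the output has 2 * C channels.
    """
    n_heads = len(kernels)
    if len(features) % n_heads != 0:
        raise ValueError("channel count must be divisible by the number of heads")
    size = len(features) // n_heads
    attended = []
    for h, kernel in enumerate(kernels):
        head = features[h * size:(h + 1) * size]
        attended.extend(fast_channel_attention(head, kernel))
    return attended + features


if __name__ == "__main__":
    # Four channels, two pixels each; two heads with a 3-tap kernel per head.
    feats = [[1.0, 3.0], [2.0, 2.0], [0.0, 4.0], [1.0, 1.0]]
    kernels = [[0.1, 0.5, 0.1], [0.2, 0.3, 0.2]]
    out = multi_head_channel_attention(feats, kernels)
    print(len(out), "output channels from", sum(len(k) for k in kernels), "attention weights")
```

In this sketch the attention weights number k per head regardless of how many channels each head covers, which illustrates why sharing parameters in the cross-channel interaction keeps the module small.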