A mask-based lensless camera optically encodes the scene with a thin mask and reconstructs the image afterward. The improvement of image reconstruction is one of the most important subjects in lensless imaging. Conventional model-based reconstruction approaches, which leverage knowledge of the physical system, are susceptible to imperfect system modeling. Reconstruction with a pure data-driven deep neural network (DNN) avoids this limitation, thereby having potential to provide a better reconstruction quality. However, existing pure DNN reconstruction approaches for lensless imaging do not provide a better result than model-based approaches. We reveal that the multiplexing property in lensless optics makes global features essential in understanding the optically encoded pattern. Additionally, all existing DNN reconstruction approaches apply fully convolutional networks (FCNs) which are not efficient in global feature reasoning. With this analysis, for the first time to the best of our knowledge, a fully connected neural network with a transformer for image reconstruction is proposed. The proposed architecture is better in global feature reasoning, and hence enhances the reconstruction. The superiority of the proposed architecture is verified by comparing with the model-based and FCN-based approaches in an optical experiment.
Read full abstract