Fluorescence microscopy is essential for studying biological structures and dynamics. However, existing systems face a trade-off among field of view (FOV), resolution, and system complexity, and thus cannot meet the emerging need for miniaturized platforms that provide micron-scale resolution across centimeter-scale FOVs. To overcome this challenge, we developed a computational miniature mesoscope (CM2) that exploits a computational imaging strategy to enable single-shot, 3D high-resolution imaging across a wide FOV in a miniaturized platform. Here, we present CM2 V2, which significantly advances both the hardware and the computation. We complement the 3 × 3 microlens array with a hybrid emission filter that improves the imaging contrast by 5×, and we design a 3D-printed free-form collimator for the LED illuminator that improves the excitation efficiency by 3×. To enable high-resolution reconstruction across a large volume, we develop an accurate and efficient 3D linear shift-variant (LSV) model to characterize spatially varying aberrations. We then train a multimodule deep learning model, called CM2Net, using only the 3D-LSV simulator. We quantify the detection performance and localization accuracy of CM2Net in reconstructing fluorescent emitters under different conditions in simulation. We then show that CM2Net generalizes well to experiments and achieves accurate 3D reconstruction across a ~7-mm FOV and 800-μm depth, with ~6-μm lateral and ~25-μm axial resolution. Compared with the previous model-based algorithm, this corresponds to ~8× better axial resolution and ~1400× higher speed. We anticipate that this simple, low-cost computational miniature imaging system will be useful for many large-scale 3D fluorescence imaging applications.
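
To make the idea of a linear shift-variant forward model concrete, the following is a minimal sketch of one common approximation, in which the spatially varying blur at each depth is written as a weighted sum of a few shift-invariant convolutions. It is an illustration only and is not claimed to be the formulation used for CM2 V2: the function names `lsv_forward` and `lsv_forward_3d`, the low-rank decomposition into `weights` and `psfs`, and the use of circular (FFT-based) convolution are all assumptions made here for brevity.

```python
import numpy as np

def lsv_forward(obj, weights, psfs):
    """Approximate shift-variant blur of one 2D depth slice (hypothetical helper).

    obj:     (H, W) fluorescence distribution at one depth
    weights: (K, H, W) spatially varying coefficient maps (assumed given)
    psfs:    (K, H, W) basis PSFs, each centered at index (H//2, W//2)
    """
    out = np.zeros(obj.shape, dtype=float)
    for w_k, h_k in zip(weights, psfs):
        # Move the PSF center to the array origin, then convolve via FFT.
        # Note: this is a circular convolution; a real system would need padding.
        otf = np.fft.fft2(np.fft.ifftshift(h_k))
        out += np.real(np.fft.ifft2(np.fft.fft2(w_k * obj) * otf))
    return out

def lsv_forward_3d(volume, weights, psfs):
    """Sum the blurred depth slices into a single 2D measurement.

    volume:  (Z, H, W); weights and psfs: (Z, K, H, W)
    """
    return sum(lsv_forward(volume[z], weights[z], psfs[z])
               for z in range(volume.shape[0]))
```

With a single basis PSF per depth and uniform weights, this sketch reduces to an ordinary shift-invariant model; using more basis terms lets the effective PSF vary across the FOV at the cost of additional FFTs per depth.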