Robust 3D reconstruction depends critically on the accuracy of depth maps, and Multi-View Stereo (MVS) is a key technique for multi-view depth estimation. Nevertheless, challenges remain, particularly in accurately estimating depth in low-textured regions and at occlusion edges. We introduce MFA-MVSNet, a novel Multi-Feature Aggregation Multi-View Stereo Network that aggregates diverse features to address these issues. MFA-MVSNet incorporates a Local Feature Extractor (LFE) that adaptively captures local features across multiple receptive fields, and a Multi-Feature Fusion Module (MFM) that combines these local features with global features extracted by the Global Feature Extractor (GFE). To address depth errors in low-textured regions and at occlusion edges, a CNN-based Depth Refinement Module (DRM) exploits edge information to iteratively filter the depth map. To further improve robustness and training efficiency, we also introduce a Region Perception Loss (RPL) that reduces the influence of both easily matched and mismatched regions. Experimental results on the DTU and BlendedMVS datasets demonstrate that MFA-MVSNet outperforms recent state-of-the-art methods in depth map quality. Additionally, we integrate MFA-MVSNet into a multi-view 3D reconstruction pipeline, using its depth maps to enhance downstream tasks such as point cloud reconstruction and neural radiance fields (NeRF), leading to improved 3D reconstruction performance.
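The abstract does not give implementation details, so the following is a minimal PyTorch sketch of the general idea behind the LFE and MFM: parallel convolutions over several receptive fields whose outputs are adaptively gated, followed by a simple fusion of local and global feature maps. All module names, kernel sizes, the gating scheme, and the concatenation-based fusion are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class LocalBranch(nn.Module):
    """Hypothetical LFE-style branch: parallel convolutions with different
    receptive fields, adaptively weighted by input-dependent gates."""

    def __init__(self, channels: int):
        super().__init__()
        # Assumed receptive-field sizes; the paper does not specify them.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=k, padding=k // 2)
            for k in (1, 3, 5)
        ])
        # Predict one softmax weight per branch from globally pooled input.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, len(self.branches), kernel_size=1),
            nn.Softmax(dim=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.gate(x)  # shape (B, num_branches, 1, 1)
        outs = [branch(x) for branch in self.branches]
        # Weighted sum over branches, broadcast across channels and space.
        return sum(w[:, i : i + 1] * o for i, o in enumerate(outs))


class FeatureFusion(nn.Module):
    """Hypothetical MFM-style fusion: concatenate local and global feature
    maps and mix them with a 1x1 convolution."""

    def __init__(self, channels: int):
        super().__init__()
        self.local = LocalBranch(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([self.local(x), global_feat], dim=1))


# Toy usage: fuse 32-channel local and global features on a 64x80 map.
feat = torch.randn(1, 32, 64, 80)
global_feat = torch.randn(1, 32, 64, 80)
fused = FeatureFusion(32)(feat, global_feat)
print(fused.shape)  # torch.Size([1, 32, 64, 80])
```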