Acquiring high-resolution light fields (LFs) is expensive. LF angular superresolution aims to synthesize the required number of views from a given sparse set of spatially high-resolution images. Existing methods struggle with sparsely sampled LFs captured with large baselines. Some rely on depth estimation and view reprojection, and are therefore sensitive to textureless and occluded regions. Other, non-depth-based methods suffer from aliasing or blurring artifacts under large disparities. In addition, most methods require a separate model for each interpolation rate, which limits their flexibility in practice. In this paper, we propose a learning framework that overcomes these challenges by exploiting the global and local structures of LFs. Our framework aggregates information across both the angular and spatial dimensions to fully exploit the input data, and includes a novel bilateral upsampling module that upsamples each epipolar plane image (EPI) while better preserving its local parallax structure. Furthermore, our method predicts the weights of the interpolation filters from both the subpixel offset and the range difference, enabling angular superresolution at different rates with a single model. We show that our non-depth-based method outperforms state-of-the-art methods in handling large disparities and in flexibility, on both real-world and synthetic LF images.
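To make the bilateral weighting concrete, the sketch below illustrates the general principle on a single EPI: each output angular sample is a weighted blend of nearby input views, with the weight of each source pixel given by a subpixel-offset term multiplied by a range (intensity-difference) term. This is not the authors' implementation: in the paper the weights are predicted by a learned module, whereas here both terms are fixed Gaussians standing in for the learned prediction, and all names (`upsample_epi_bilateral`, `sigma_offset`, `sigma_range`, `support`) are hypothetical.

```python
# Minimal sketch (not the authors' implementation) of bilateral interpolation
# along the angular axis of an epipolar plane image (EPI). The learned weight
# prediction of the paper is replaced by hand-set Gaussian kernels over the
# subpixel offset and the range (intensity) difference.
import numpy as np

def upsample_epi_bilateral(epi, rate, sigma_offset=0.5, sigma_range=0.1, support=2):
    """Upsample an EPI along the angular axis.

    epi     : (A, S) array; A input views, S spatial samples, values in [0, 1]
    rate    : angular upsampling factor (a plain argument, so one routine
              serves all rates, mirroring the single-model flexibility claim)
    support : number of input views on each side used per output sample
    """
    A, S = epi.shape
    A_out = (A - 1) * rate + 1
    out = np.zeros((A_out, S), dtype=epi.dtype)

    for t in range(A_out):
        a = t / rate                                  # continuous angular coordinate
        a0 = int(np.floor(a))
        # candidate input views around the target angular position
        idx = np.clip(np.arange(a0 - support + 1, a0 + support + 1), 0, A - 1)
        offsets = idx - a                             # subpixel offset of each candidate
        ref = epi[int(round(a))]                      # range reference: nearest input view

        # weight = offset term * range term (both Gaussian in this stand-in)
        w_off = np.exp(-0.5 * (offsets / sigma_offset) ** 2)[:, None]  # (K, 1)
        w_rng = np.exp(-0.5 * ((epi[idx] - ref) / sigma_range) ** 2)   # (K, S)
        w = w_off * w_rng
        w /= w.sum(axis=0, keepdims=True) + 1e-8      # normalize per spatial sample

        out[t] = (w * epi[idx]).sum(axis=0)
    return out
```

Under these assumptions, the range term down-weights source pixels that differ strongly in intensity from the reference, so the blend avoids averaging across the sloped line structures that encode parallax in an EPI; the offset term plays the role of an ordinary interpolation kernel. Because the rate enters only through the continuous angular coordinate, the same routine handles any integer upsampling factor, which is the mechanism behind the single-model flexibility the abstract claims.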