This letter considers a reconfigurable intelligent surface (RIS)-aided multi-user multiple-input single-output (MISO) downlink system, in which the transmit beamforming and the phase shifts of the RIS reflecting elements are jointly designed to maximize the system sum rate. However, the unit-modulus constraint on the RIS phase shifts and the coupling between active and passive beamforming make the optimal design challenging. Most prior works adopt iterative optimization algorithms to obtain suboptimal solutions; these algorithms suffer from high computational complexity and are hence not applicable to practical scenarios. In response, this letter proposes a deep learning-based approach for joint active and passive beamforming design. Specifically, a two-stage neural network is trained offline in an unsupervised manner and then deployed online for real-time prediction. Simulation results indicate that the proposed approach significantly reduces computational complexity while achieving satisfactory performance compared with conventional iterative optimization algorithms.
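To make the design objective concrete, the following is a minimal sketch of how the sum rate could be evaluated for a given pair of active beamformers and RIS phase shifts. The signal model is not stated in this abstract, so the sketch assumes the commonly used narrowband model in which user k sees the effective channel h_{r,k}^H diag(e^{j theta}) G + h_{d,k}^H. The function name `sum_rate` and all argument names are hypothetical and chosen for illustration only.

```python
import numpy as np

def sum_rate(G, Hr, Hd, W, theta, noise_power=1.0):
    """Sum rate (bit/s/Hz) of a RIS-aided multi-user MISO downlink.

    Assumed shapes (illustrative, not taken from the letter):
      G     : (N, M) base station -> RIS channel
      Hr    : (K, N) RIS -> user channels, row k is h_{r,k}^H
      Hd    : (K, M) direct base station -> user channels, row k is h_{d,k}^H
      W     : (M, K) active beamforming matrix, column k serves user k
      theta : (N,)   RIS phase shifts in radians
    """
    # Parameterizing by the phase angle enforces the unit-modulus constraint.
    phi = np.exp(1j * np.asarray(theta, dtype=float))
    # Effective channel of every user: reflected path plus direct path.
    H_eff = Hr @ (phi[:, None] * G) + Hd          # shape (K, M)
    # P[k, j] is the power received at user k from the beam meant for user j.
    P = np.abs(H_eff @ W) ** 2                    # shape (K, K)
    signal = np.diag(P)
    interference = P.sum(axis=1) - signal
    sinr = signal / (interference + noise_power)
    return float(np.sum(np.log2(1.0 + sinr)))
```

Under these assumptions, an unsupervised training scheme could use the negative of this quantity as the loss, so that no labels from an iterative optimizer are needed. A practical implementation would express the same computation in an automatic-differentiation framework and normalize `W` to satisfy the transmit power budget.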