Successful and efficient pest management is key to sustainable horticultural food production. While greenhouses already allow digital monitoring and control of their climate parameters, a lack of digital pest sensors hinders the advent of digital pest management systems. To close the control loop, digital systems must be able to directly assess the state of different insect populations in a greenhouse. This article investigates the feasibility of acoustic sensors for insect detection in greenhouses. The study is based on an extensive dataset of acoustic insect recordings made with an array of high-quality microphones under noise-shielded conditions. By mixing these noise-free laboratory recordings with environmental sounds recorded with the same equipment in a greenhouse, different signal-to-noise ratios (SNR) are simulated. To explore the possibilities of this novel dataset, two deep-learning models are trained on the simulated data. A simple spectrogram-based model serves as the baseline for comparison with a model capable of processing multi-channel raw audio data. Exploiting the availability of both clean and noisy data, the models are pre-trained on clean data and fine-tuned on noisy data. Under lab conditions, the results show that both models can make use of not just insect flight sounds but also the much quieter sounds of insect movements. First attempts under simulated real-world conditions show the challenging nature of this task and the potential of spatial filtering. The analysis enabled by the proposed training and evaluation methods provides valuable insights that should be considered in future work.
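The SNR simulation described above can be illustrated with a minimal sketch. It assumes a single-channel signal stored as a plain list of floats and defines SNR as the ratio of mean signal power to mean noise power in decibels. The function names (`mix_at_snr`, `measure_snr_db`), the synthetic test signals, and the chosen SNR value are hypothetical and are not taken from the study, which works with multi-channel recordings and may define or apply the mixing differently.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a mono signal (list of floats)."""
    return sum(x * x for x in signal) / len(signal)


def measure_snr_db(clean, noise):
    """SNR in dB, defined here as 10 * log10(P_clean / P_noise)."""
    return 10.0 * math.log10(power(clean) / power(noise))


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals `snr_db`,
    then add it to `clean`. Returns (mixture, scaled_noise)."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean = power(clean)
    p_noise = power(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("clean and noise must both be non-silent")
    # Noise power required to reach the requested SNR.
    target_noise_power = p_clean / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / p_noise)
    scaled = [gain * n for n in noise]
    mixture = [c + n for c, n in zip(clean, scaled)]
    return mixture, scaled


# Hypothetical stand-ins for a lab recording and greenhouse background noise.
rng = random.Random(0)
sample_rate = 16000
clean = [0.01 * math.sin(2 * math.pi * 200 * t / sample_rate)
         for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

mixture, scaled_noise = mix_at_snr(clean, noise, snr_db=-5.0)
print(round(measure_snr_db(clean, scaled_noise), 3))
```

Scaling the noise rather than the insect signal keeps the clean recording untouched, so the same clean clip can be reused across several SNR levels; whether the study scales the signal or the noise is not stated in the abstract.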