Clinical sleep analysis requires manual analysis of sleep patterns for correct diagnosis of sleep disorders. However, several studies have shown significant variability in manual scoring of clinically relevant discrete sleep events, such as arousals, leg movements, and sleep-disordered breathing (apneas and hypopneas). We investigated whether an automatic method could be used for event detection and whether a model trained on all events (joint model) performed better than corresponding event-specific models (single-event models). We trained a deep neural network event detection model on 1653 individual recordings and tested the optimized model on 1000 separate hold-out recordings. F1 scores for the optimized joint detection model were 0.70, 0.63, and 0.62 for arousals, leg movements, and sleep-disordered breathing, respectively, compared to 0.65, 0.61, and 0.60 for the optimized single-event models. Index values computed from detected events correlated positively with those computed from manual annotations (r² = 0.73, r² = 0.77, and r² = 0.78, respectively). We furthermore quantified model accuracy using temporal difference metrics, which improved overall with the joint model compared to the single-event models. Our automatic model thus jointly detects arousals, leg movements, and sleep-disordered breathing events with high correlation with human annotations. Finally, we benchmarked against previous state-of-the-art multi-event detection models and found an overall increase in F1 score with our proposed model despite a 97.5% reduction in model size. Source code for training and inference is available at https://github.com/neergaard/msed.git.
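
To make the reported quantities concrete, the following minimal sketch shows one way an event-level F1 score and a per-hour event index can be computed from lists of detected and annotated events. It is an illustration under stated assumptions, not the evaluation code from the repository: the overlap-based matching rule, the `min_iou` threshold of 0.3, and the names `iou`, `event_f1`, and `events_per_hour` are all hypothetical, and the index is assumed to follow the usual convention of events per hour of sleep.

```python
from typing import List, Tuple

# An event is an (onset, offset) pair in seconds. This representation is an assumption.
Event = Tuple[float, float]


def iou(a: Event, b: Event) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def event_f1(predicted: List[Event], annotated: List[Event], min_iou: float = 0.3) -> float:
    """Event-level F1 with greedy one-to-one matching on interval overlap.

    The matching rule and threshold are illustrative assumptions.
    """
    unmatched = list(annotated)
    true_positives = 0
    for p in predicted:
        # Pick the still-unmatched annotation that overlaps this prediction the most.
        best = max(unmatched, key=lambda a: iou(p, a), default=None)
        if best is not None and iou(p, best) >= min_iou:
            unmatched.remove(best)
            true_positives += 1
    false_positives = len(predicted) - true_positives
    false_negatives = len(annotated) - true_positives
    denom = 2 * true_positives + false_positives + false_negatives
    return 2 * true_positives / denom if denom else 0.0


def events_per_hour(events: List[Event], total_sleep_time_s: float) -> float:
    """Index value: number of events per hour of sleep (assumed convention)."""
    return len(events) / (total_sleep_time_s / 3600.0)


predicted = [(10.0, 20.0), (50.0, 60.0), (100.0, 110.0)]
annotated = [(12.0, 22.0), (200.0, 210.0)]
f1 = event_f1(predicted, annotated)
index = events_per_hour(predicted, total_sleep_time_s=7200.0)
```

In this toy example one of three predictions matches one of two annotations, giving an F1 of 0.4, and three detected events over two hours of sleep give an index of 1.5 events per hour. Correlating such index values between model and human scorers across recordings yields r² statistics of the kind reported above.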