Fitting a hidden Markov model (HMM) to neural data is a powerful method for segmenting a spatiotemporal stream of neural activity into sequences of discrete hidden states. HMM analyses have uncovered hidden states and signatures of neural dynamics that appear relevant for sensory and cognitive processes, especially in datasets comprising ensembles of simultaneously recorded cortical spike trains. However, the HMM analysis of spike data is involved and requires careful model selection. Two main issues are: (i) the cross-validated likelihood typically increases with the number of hidden states; (ii) decoding the data with an HMM can produce very rapid state switching due to fast oscillations in state probabilities. The first problem is related to the phenomenon of over-segmentation and leads to overfitting. The second problem is at odds with the empirical observation that hidden states in cortex tend to last from hundreds of milliseconds to seconds. Here, we show that both problems can be alleviated by regularizing a Poisson-HMM during training so as to enforce large self-transition probabilities. We call this algorithm the 'sticky Poisson-HMM' (sPHMM). When used together with the Bayesian Information Criterion for model selection, the sPHMM successfully eliminates rapid state switching, outperforming an alternative strategy based on an HMM with a large prior on the self-transition probabilities. The sPHMM also captures the ground truth in surrogate datasets built to resemble the statistical properties of the experimental data.
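As an illustration of the core idea, the sketch below shows one plausible way to build a 'sticky' constraint into the M-step of a Baum-Welch fit and to score candidate models with the BIC; this is a minimal sketch under stated assumptions, not the paper's exact procedure. The function names, the `self_min` threshold, and the projection scheme are illustrative assumptions.

```python
import numpy as np

def sticky_transition_mstep(xi_sum, self_min=0.95):
    """Re-estimate the HMM transition matrix, then project it so every
    self-transition probability is at least `self_min`.

    xi_sum   : (K, K) expected transition counts from the E-step,
               xi_sum[i, j] = sum_t P(z_t = i, z_{t+1} = j | data).
    self_min : illustrative lower bound on the diagonal; larger values
               make the model "stickier" (states persist longer).
    """
    A = xi_sum / xi_sum.sum(axis=1, keepdims=True)  # standard M-step
    K = A.shape[0]
    for i in range(K):
        if A[i, i] < self_min:
            off = np.arange(K) != i
            # Shrink the off-diagonal mass so the row still sums to 1.
            A[i, off] *= (1.0 - self_min) / (1.0 - A[i, i])
            A[i, i] = self_min
    return A

def bic(log_likelihood, n_states, n_neurons, n_obs):
    """Bayesian Information Criterion for a K-state Poisson-HMM over N
    neurons (lower is better). Free parameters: K*N Poisson rates,
    K*(K-1) transition probabilities, and K-1 initial probabilities."""
    n_params = (n_states * n_neurons
                + n_states * (n_states - 1)
                + (n_states - 1))
    return -2.0 * log_likelihood + n_params * np.log(n_obs)
```

Projecting the diagonal after the standard M-step, rather than placing a prior on the transition matrix, directly enforces the large self-transition probabilities that suppress rapid state switching during decoding.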