Discovering customer intents from text or speech data plays a vital role in text mining and automated dialogue response, but processing thousands of customer interactions daily is challenging. Deep Embedded Clustering (DEC) and Improved DEC (IDEC), which are trained with a Kullback–Leibler (KL) divergence loss, handle large volumes of data inefficiently because of the asymmetric nature of that loss. To address this challenge, this paper proposes an unsupervised learning approach that discovers intents and automatically produces labels from a collection of unlabeled utterances in the banking domain. The approach first improves both the DEC and IDEC architectures by combining the Jensen–Shannon (JS) divergence, used to simultaneously learn feature representations and cluster assignments, with the Second-order Clipped Stochastic Optimization (Sophia) optimizer. In the second stage, a dependency parser generates a set of intent labels for each cluster. Experimental results show that the proposed approach generates meaningful intent labels and achieves high performance on short text clustering.
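To make the contrast between the two losses concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard DEC formulation (Student's t soft assignments Q and a sharpened target distribution P) and simply compares the asymmetric KL divergence with the symmetric JS divergence on random embeddings. The function names (`soft_assign`, `target_distribution`, `kl`, `js`), the toy data, and the way the divergences are averaged over samples are illustrative assumptions; the abstract does not specify how JS is wired into the training loss, and the Sophia optimizer is not shown.

```python
import numpy as np


def soft_assign(z, centroids, alpha=1.0):
    """DEC-style soft assignment q_ij using a Student's t kernel."""
    # Squared Euclidean distance between every embedding and every centroid.
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    # Normalize each row so that it is a distribution over clusters.
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """DEC-style sharpened target: p_ij proportional to q_ij^2 / sum_i q_ij."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)


def kl(p, q, eps=1e-12):
    """Mean per-sample KL(p || q); swapping the arguments changes the value."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.mean(np.sum(p * np.log(p / q), axis=1)))


def js(p, q):
    """Mean per-sample JS divergence; symmetric and at most ln 2."""
    m = 0.5 * (p + q)  # mixture of the two distributions
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy data (assumption): 6 embeddings of dimension 4, 3 cluster centroids.
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 4))
centroids = rng.normal(size=(3, 4))

q = soft_assign(z, centroids)
p = target_distribution(q)

print("KL(P||Q) =", kl(p, q), " KL(Q||P) =", kl(q, p))
print("JS(P,Q)  =", js(p, q), " JS(Q,P)  =", js(q, p))
```

In this sketch the two KL values generally differ, while the two JS values coincide and stay bounded, which is the property the abstract appeals to when motivating the switch away from the KL loss.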