Discovering transcription factor binding sites (TFBS) is important both for understanding regulatory processes in biological systems and for developing techniques to study them. DNA sequence data sets are very large, so methods are needed that can analyze them quickly. Over the past decades, planted (l, d) motif discovery has been used to locate TFBS in genomic regions. This paper proposes a new approach to motif identification based on the planted (l, d) motif discovery formulation. The proposed algorithm, ESMD (Emerging Substring based Motif Detection), consists of two steps: mining emerging substrings and combining them. In the mining step, an array is first built from the suffix array (SA) and the longest common prefix (LCP) array. Because DNA sequence data are large, the mining of emerging substrings is carried out with the MapReduce programming model. The combining step then merges emerging substrings of different lengths. The resulting models are evaluated with two metrics, the Pearson Correlation Coefficient (PCC) and the Area Under Curve (AUC). On both metrics, ESMD produced much better results than existing methods.
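To make the two data structures named in the mining step concrete, the following is a minimal sketch of a suffix array and its LCP array for a single short sequence. It is illustrative only: the function names `build_suffix_array` and `build_lcp_array` are hypothetical, the suffix array is built naively by sorting, the LCP array uses the standard Kasai procedure, and neither the ESMD-specific array derived from them nor the MapReduce distribution is shown.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive construction by sorting slices; adequate for short illustrative inputs,
    not for genome-scale data.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, sa):
    """Kasai's algorithm.

    lcp[k] is the length of the longest common prefix of the suffixes
    starting at sa[k - 1] and sa[k]; lcp[0] is defined as 0.
    """
    n = len(text)
    # rank[i] = position of the suffix starting at i within the suffix array
    rank = [0] * n
    for k, start in enumerate(sa):
        rank[start] = k

    lcp = [0] * n
    h = 0  # current match length, carried over between iterations
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]  # suffix that precedes suffix i in sorted order
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1  # the next suffix shares at least h - 1 characters
        else:
            h = 0
    return lcp


if __name__ == "__main__":
    seq = "ACGACG"  # toy sequence, assumed for illustration
    sa = build_suffix_array(seq)
    lcp = build_lcp_array(seq, sa)
    print(sa)   # [3, 0, 4, 1, 5, 2]
    print(lcp)  # [0, 3, 0, 2, 0, 1]
```

Large LCP values mark substrings that repeat within the sequence (here `ACG` and `CG`), which is the kind of information a substring-mining step can draw on.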