This study proposes a joint denoising method for real-time lithology identification during drilling, based on drilling sound signals. The method first applies ensemble empirical mode decomposition (EEMD) for preliminary denoising of the sound signal and retains the effective intrinsic mode function (IMF) components. The retained components then undergo secondary denoising with an improved wavelet threshold function, and the sound signal is reconstructed from the denoised IMF components. Experimental results show that the proposed method achieves a higher signal-to-noise ratio (12.94) and a lower root mean square error (0.0442). The method performs well in denoising rock drilling sound signals and addresses three issues that arise in drilling environments: the susceptibility of sound signals to external interference, nonadaptive threshold selection in wavelet threshold denoising, and high reconstruction errors in EEMD. Simulation results show that the proposed method improves the accuracy of lithology recognition by 23%, to 72.58%.
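As an illustration of the two evaluation metrics named above, the following minimal sketch computes the signal-to-noise ratio and root mean square error of a denoised signal against a clean reference. It assumes the common definitions (SNR in decibels as the ratio of reference power to residual power); the abstract does not state the exact formulas or units used. The sketch also includes the classic soft-threshold rule as a stand-in only: the improved wavelet threshold function of this study is not specified here, and all function names are hypothetical.

```python
import math


def rmse(reference, denoised):
    """Root mean square error between a reference and a denoised signal."""
    if len(reference) != len(denoised) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    squared = sum((r - d) ** 2 for r, d in zip(reference, denoised))
    return math.sqrt(squared / len(reference))


def snr_db(reference, denoised):
    """SNR in dB, assuming SNR = 10 * log10(signal power / residual power)."""
    if len(reference) != len(denoised) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - d) ** 2 for r, d in zip(reference, denoised))
    if noise_power == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_power / noise_power)


def soft_threshold(coeffs, thr):
    """Classic soft thresholding of wavelet coefficients.

    Stand-in only: this is NOT the improved threshold function of the study.
    Coefficients with magnitude below thr are zeroed; the rest shrink by thr.
    """
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


if __name__ == "__main__":
    # Synthetic values chosen for illustration, not data from the study.
    reference = [1.0, -1.0, 1.0, -1.0]
    denoised = [0.9, -1.1, 1.1, -0.9]
    print("RMSE:", rmse(reference, denoised))
    print("SNR (dB):", snr_db(reference, denoised))
    print("Soft-thresholded:", soft_threshold([3.0, -0.5, -2.0], 1.0))
```

Under these definitions, a higher SNR and a lower RMSE both indicate that the denoised signal lies closer to the reference, which is the sense in which the two reported values are read together.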