SUMMARY Earthquake focal depth is essential for studies of seismogenic processes and seismic hazards. In regions with dense seismic networks, focal depth can be resolved precisely from P- and S-wave traveltimes, an approach that is less feasible for sparse networks. Surface waves, by contrast, are usually the strongest seismic phases at local and regional distances, and their excitation is sensitive to source depth, which makes them theoretically useful for estimating focal depth even with a limited number of seismic stations. In this study, we explore short-period (0.5–20 s) Rayleigh waves as a constraint on focal depth. We observe that the optimal period of Rayleigh waves (the period at which the amplitude spectrum reaches its maximum) at local distances (≤200 km) shows an almost linear correlation with focal depth. Based on this finding, we propose an automated method for resolving the focal depth of local earthquakes using the linear regression relation between the optimal period of Rayleigh wave amplitude spectra and focal depth. Synthetic tests indicate that the method is robust against variations in source parameters (focal mechanism, source duration and non-double-couple component) and crustal velocity structure. Although attenuation (Q factor) in the shallow crust can complicate the determination of focal depth, it can be estimated simultaneously if enough stations are available. We apply the proposed method to tens of small-to-moderate earthquakes (Mw 3.5–5.0) in diverse tectonic settings, including locations in the United States (Oklahoma, South Carolina, California, Utah, etc.) and China (Sichuan, Shandong). The results demonstrate that reliable focal depths, with uncertainties of 1–2 km, can be determined even with one or a few seismic stations. This highlights the applicability of the method to regions with sparse network coverage and to historical events.
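
As a minimal illustration of the workflow summarized above, the following Python sketch picks the optimal period from an amplitude spectrum and maps it to depth through a linear fit. It is not the authors' implementation: the function names, the Hann taper, the synthetic test signal and the calibration pairs are all assumptions made for illustration only. A real application would need a regression calibrated for the region of interest.

```python
import numpy as np


def optimal_period(trace, dt, t_min=0.5, t_max=20.0):
    """Return the period (s) of the largest spectral amplitude in [t_min, t_max].

    `trace` is assumed to be a windowed Rayleigh-wave segment sampled every `dt` s.
    """
    trace = np.asarray(trace, dtype=float)
    n = trace.size
    # Hann taper (an assumption of this sketch) to reduce spectral leakage.
    spectrum = np.abs(np.fft.rfft(trace * np.hanning(n)))
    freqs = np.fft.rfftfreq(n, d=dt)
    # Keep only frequencies that correspond to the 0.5-20 s period band.
    in_band = np.flatnonzero((freqs >= 1.0 / t_max) & (freqs <= 1.0 / t_min))
    if in_band.size == 0:
        raise ValueError("trace too short to resolve the requested period band")
    peak = in_band[np.argmax(spectrum[in_band])]
    return 1.0 / freqs[peak]


def fit_depth_relation(periods, depths):
    """Least-squares line depth = slope * period + intercept."""
    slope, intercept = np.polyfit(periods, depths, 1)
    return slope, intercept


def estimate_depth(period, slope, intercept):
    """Apply the fitted linear relation to one optimal period."""
    return slope * period + intercept


if __name__ == "__main__":
    # Hypothetical calibration pairs (period in s, depth in km); placeholders,
    # not values from the study.
    cal_periods = [1.0, 2.0, 3.0, 4.0]
    cal_depths = [3.0, 6.0, 9.0, 12.0]
    slope, intercept = fit_depth_relation(cal_periods, cal_depths)

    # Synthetic stand-in for a Rayleigh-wave window: a 2 s sinusoid.
    dt = 0.05
    t = np.arange(4000) * dt
    trace = np.sin(2.0 * np.pi * t / 2.0)

    period = optimal_period(trace, dt)
    print(f"optimal period: {period:.2f} s")
    print(f"estimated depth: {estimate_depth(period, slope, intercept):.2f} km")
```

With several stations, one could average the per-station depth estimates, or use their spread as a rough uncertainty measure. That choice is likewise an assumption of this sketch and not a step stated in the summary.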