Respiratory rate has been identified as a promising metric for field-based sports monitoring. While respiratory metrics such as minute ventilation (VE) are commonly used in lab-based metabolic tests, they have yet to be implemented in contact sports, where physical impacts complicate sensor deployment. This study proposes that breathing can be captured on-field via acoustic sensors embedded inside existing sports gear. Two suitable locations for such a system have been identified: a smart mouthguard and an instrumented headgear. This study compares the signal-to-noise ratio (SNR) at these two placements and their potential for respiratory rate estimation. Four participants were recruited, and respiratory data were captured in both indoor and outdoor settings. A fast Fourier transform (FFT)-based frequency-domain analysis was used to estimate respiratory rate and assess its accuracy. A Wilcoxon signed-rank test was carried out to compare the datasets from the two sensor placements. It was found that the SNR of the oral placement is significantly better than that of the head placement in both indoor and outdoor settings (indoor: <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$P = 5.73 \times 10^{-7}, z = 5$</tex-math></inline-formula> ; outdoor: <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$P = 2.12 \times 10^{-7}, z = 5.19$</tex-math></inline-formula> ). 
It was also found that the oral placement had a significantly smaller error than the head placement when estimating respiratory rate (indoor: <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$P = 1.72 \times 10^{-6}, z = -4.78$</tex-math></inline-formula> ; outdoor: <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$P = 1.06 \times 10^{-7}, z = 5.32$</tex-math></inline-formula> ). In addition, a convolutional neural network (CNN)-based classifier was trained to remove non-respiratory sounds from the recorded audio, achieving 90% test accuracy. This is a promising result, demonstrating the viability of a fully automated oral-based respiratory rate monitoring system.
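The FFT-based frequency-domain estimation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the breathing audio has already been reduced to a low-rate amplitude envelope, and the function name, sampling rate, and breathing band (0.1–1.5 Hz, i.e. 6–90 breaths/min) are illustrative choices.

```python
import numpy as np

def estimate_respiratory_rate(envelope, fs):
    """Estimate respiratory rate (breaths/min) as the dominant
    frequency of the signal's FFT within a plausible breathing band.

    envelope : 1-D array, breathing-signal amplitude envelope
    fs       : envelope sampling rate in Hz
    """
    x = envelope - np.mean(envelope)          # remove DC offset
    spectrum = np.abs(np.fft.rfft(x))         # one-sided magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    # Restrict the search to ~0.1-1.5 Hz (6-90 breaths/min)
    band = (freqs >= 0.1) & (freqs <= 1.5)
    dominant = freqs[band][np.argmax(spectrum[band])]
    return dominant * 60.0                    # convert Hz to breaths/min

# Synthetic check: a 0.3 Hz (18 breaths/min) tone sampled at 10 Hz for 60 s
fs = 10.0
t = np.arange(0, 60, 1.0 / fs)
envelope = np.sin(2 * np.pi * 0.3 * t)
print(estimate_respiratory_rate(envelope, fs))  # → 18.0
```

A 60 s window gives a frequency resolution of 1/60 Hz (1 breath/min), which bounds the estimator's precision; longer windows sharpen the estimate at the cost of responsiveness.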