Sports video analysis facilitates the discovery of semantic structures in sports broadcast videos and enables a wide spectrum of applications. For example, coaches can analyze the offensive and defensive plays performed during games to assess a team’s capabilities. In general, identifying shots of interest, e.g. pitch shots, in broadcast baseball videos requires considerable manual effort to browse through those videos. In this work, we propose a novel technique that automatically extracts pitch-by-pitch shots by detecting when the pitch speed reliably appears on the scoreboard, estimating when and where the pitcher is present, and identifying the pitch shots based on the degree of the pitcher’s motion. To validate the performance and accuracy of the proposed technique, we collected a dataset of baseball videos broadcast in various countries. The experimental results verify that the proposed technique successfully extracts the desired pitch-by-pitch videos. Furthermore, it outperforms the state-of-the-art approach in terms of both accuracy and time complexity.
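As a rough illustration of the last step, the following minimal sketch scores motion inside an assumed pitcher region and walks backwards from the frame at which the pitch speed appears to find where the pitch motion begins. The abstract does not define the motion degree, so the mean absolute frame difference used here is an assumption, and the names `motion_degree`, `find_pitch_start`, `roi`, and `threshold` are hypothetical rather than taken from the proposed technique.

```python
from typing import List, Sequence, Tuple

# A grayscale frame as rows of pixel intensities (assumed representation).
Frame = Sequence[Sequence[int]]
# Region of interest as (top, left, bottom, right), bottom/right exclusive.
Roi = Tuple[int, int, int, int]


def motion_degree(prev: Frame, curr: Frame, roi: Roi) -> float:
    """Mean absolute intensity difference between two frames inside roi.

    This is an assumed stand-in for the paper's motion degree.
    """
    top, left, bottom, right = roi
    total = 0
    count = 0
    for r in range(top, bottom):
        for c in range(left, right):
            total += abs(curr[r][c] - prev[r][c])
            count += 1
    return total / count if count else 0.0


def find_pitch_start(frames: List[Frame], roi: Roi,
                     speed_frame: int, threshold: float) -> int:
    """Return the first frame of the high-motion run nearest before speed_frame.

    Returns -1 if no frame pair before speed_frame reaches the threshold.
    """
    i = min(speed_frame, len(frames) - 1)
    # Skip the low-motion frames between the pitch and the speed display.
    while i >= 1 and motion_degree(frames[i - 1], frames[i], roi) < threshold:
        i -= 1
    if i < 1:
        return -1
    # Extend backwards through the contiguous high-motion run.
    while i >= 1 and motion_degree(frames[i - 1], frames[i], roi) >= threshold:
        i -= 1
    return i + 1


def flat(value: int) -> Frame:
    """Build a uniform 2x2 toy frame for the usage example."""
    return [[value, value], [value, value]]


# Toy clip: still, still, motion, motion, still, still (speed shown at frame 5).
clip = [flat(0), flat(0), flat(50), flat(100), flat(100), flat(100)]
start = find_pitch_start(clip, roi=(0, 0, 2, 2), speed_frame=5, threshold=10.0)
```

In the toy clip, the search skips the two still frames before the speed display and then follows the motion back to its first frame. A real system would need the pitcher region and the speed-display frame from the earlier steps, which this sketch takes as given.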