Background The rapid advancements in natural language processing have brought about the widespread use of large language models (LLMs) across various medical domains. However, their effectiveness in specialized fields, such as naturopathy, remains relatively unexplored. Objective The study aimed to assess the capability of freely available LLM chatbots in providing naturopathy consultations for various types of diseases and disorders. Methods Five free LLMs (viz., Gemini, Copilot, ChatGPT, Claude, and Perplexity) were used to converse with 20 clinical cases (simulation of real-world scenarios). Each case had the case details and questions pertinent to naturopathy. The responses were presented to three naturopathy doctors with > 5 years of practice. The answers were rated by them on a five-point Likert-like scale for language fluency, coherence, accuracy, and relevancy. The average of these four attributes is termed perfection in his study. Results The overall score of the LLMs were Gemini 3.81±0.23, Copilot 4.34±0.28, ChatGPT 4.43±0.2, Claude 3.8±0.26, and Perplexity 3.91±0.28 (ANOVA F [3.034, 57.64] = 33.47, P <0.0001. Together, they showed overall ~80% perfection in consultation. The average measure intraclass correlation coefficient among the LLMs for the overall score was 0.463 (95% CI = -0.028 to 0.76), P = 0.03. Conclusion Although the LLM chatbots could help in providing naturopathy and yoga treatment consultation with approximately an overall fair level of perfection, their solution to the user varies across different chatbots and there was very low reliability among them.