In this study, we compare the performance of five chatbots using large language models (LLMs) in handling web development tasks. Three human testers asked each chatbot nine predefined questions related to creating a simple website with a dynamic form and database integration. The questions covered tasks such as generating a web document structure, designing a layout, creating a form, and implementing database queries. The chatbots’ outputs were ranked based on accuracy, completeness, creativity, and security. The experiment reveals that conversational chatbots are adept at managing complex tasks, while programming assistants require more precisely formulated tasks or the ability to generate new responses to address irrelevant outputs. The findings suggest that conversational chatbots are more capable of handling a broader range of web development tasks with minimal supervision, whereas programming assistants need more precise task definitions to achieve comparable results. This study contributes to understanding the strengths and limitations of various LLM-based chatbots in practical coding scenarios, offering insights for their application in web development.
Read full abstract