BackgroundArtificial intelligence (AI) tools, although beneficial for data collection and analysis, can also facilitate scientific fraud. AI detectors can help resolve this problem, but their effectiveness depends on their ability to track AI progress. In addition, many methods of evading AI detection exist and their constantly evolving sophistication can make the task more difficult. Thus, from an AI-generated text, we wanted to: (1) evaluate the AI detection sites on a text generated entirely by the AI, (2) test the methods described for evading AI detection, and (3) evaluate the effectiveness of these methods to evade AI detection on the sites tested previously. HypothesisNot all AI detection tools are equally effective in detecting AI-generated text and some techniques used to evade AI detection can make an AI-produced text almost undetectable. Materials and methodsWe created a text with ChatGPT-4 (Chat Generative Pre-trained Transformer) and submitted it to 11 AI detection web tools (Originality, ZeroGPT, Writer, Copyleaks, Crossplag, GPTZero, Sapling, Content at scale, Corrector, Writefull et Quill), before and after applying strategies to minimise AI detection. The strategies used to minimize AI detection were the improvement of command messages in ChatPGT, the introduction of minor grammatical errors such as comma deletion, paraphrasing, and the substitution of Latin letters with similar Cyrillic letters (a and o) which is also a method used elsewhere to evade the detection of plagiarism. We have also tested the effectiveness of these tools in correctly identifying a scientific text written by a human in 1960. ResultsFrom the initial text generated by the AI, 7 of the 11 detectors concluded that the text was mainly written by humans. Subsequently, the introduction of simple modifications, such as the removal of commas or paraphrasing can effectively reduce AI detection and make the text appear human for all detectors. In addition, replacing certain Latin letters with Cyrillic letters can make an AI text completely undetectable. Finally, we observe that in a paradoxical way, certain sites detect a significant proportion of AI in a text written by a human in 1960. DiscussionAI detectors have low efficiency, and simple modifications can allow even the most robust detectors to be easily bypassed. The rapid development of generative AI raises questions about the future of scientific writing but also about the detection of scientific fraud, such as data fabrication. Level of evidenceIII Control case study.
Read full abstract