Efficient Toxic Content Detection by Bootstrapping and Distilling Large Language Models

Jiang Zhang, Konstantinos Psounis, Cheng Cao, Yiming Xu, Qiong Wu, Zheng Du

doi:10.1609/aaai.v38i19.30178

Abstract

Toxic content detection is crucial for online services that must remove content violating community standards. To automate detection, prior work has proposed a variety of machine learning (ML) approaches that train Language Models (LMs) for the task. However, both the accuracy of these models and their transferability across datasets are limited. Recently, Large Language Models (LLMs) have shown promise in toxic content detection owing to their strong zero-shot and few-shot in-context learning ability and their broad transferability across ML tasks. However, designing prompts for LLMs efficiently remains challenging, and the high run-time cost of LLMs may hinder their deployment in production. To address these challenges, we propose BD-LLM, a novel and efficient approach to bootstrapping and distilling LLMs for toxic content detection. Specifically, we design a novel prompting method named Decision-Tree-of-Thought (DToT) to bootstrap LLMs' detection performance and extract high-quality rationales. DToT automatically selects more fine-grained context to re-prompt LLMs when their responses lack confidence. We then use the rationales extracted via DToT to fine-tune student LMs. Our experimental results on various datasets demonstrate that DToT can improve the accuracy of LLMs by up to 4.6%. Furthermore, student LMs fine-tuned with rationales extracted via DToT outperform baselines on all datasets, with up to 16.9% accuracy improvement, while being more than 60x smaller than conventional LLMs. Finally, we observe that student LMs fine-tuned with rationales exhibit better cross-dataset transferability.
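
The abstract describes DToT only at a high level, so the following is a minimal illustrative sketch of confidence-gated re-prompting, not the authors' implementation. It assumes a hypothetical model interface `llm(text, context)` that returns a label, a confidence score in [0, 1], and a rationale; the names `Answer`, `ContextNode`, `classify_with_reprompting`, `to_student_example`, and `toy_llm`, the threshold of 0.8, and the two-level context tree are all assumptions introduced here for illustration. The last helper shows one plausible way to turn a label and its rationale into a fine-tuning target for a student LM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class Answer:
    label: str         # "toxic" or "benign"
    confidence: float  # assumed to lie in [0, 1]
    rationale: str     # free-text explanation returned with the label


@dataclass
class ContextNode:
    context: str  # context added to the prompt at this node
    # Finer-grained follow-up contexts, keyed by the tentative label.
    children: Dict[str, "ContextNode"] = field(default_factory=dict)


# Hypothetical model interface: (text, context) -> Answer.
LLM = Callable[[str, str], Answer]


def classify_with_reprompting(
    text: str, llm: LLM, root: ContextNode, threshold: float = 0.8
) -> Tuple[Answer, int]:
    """Walk the context tree, re-prompting while the answer lacks confidence.

    Returns the final answer and the number of prompts issued.
    """
    node, prompts = root, 0
    while True:
        answer = llm(text, node.context)
        prompts += 1
        child = node.children.get(answer.label)
        # Stop when the model is confident or no finer context is available.
        if answer.confidence >= threshold or child is None:
            return answer, prompts
        node = child


def to_student_example(text: str, answer: Answer) -> Dict[str, str]:
    """Pair the input with a label-plus-rationale target for a student LM."""
    return {
        "input": text,
        "target": f"{answer.label}. Rationale: {answer.rationale}",
    }


def toy_llm(text: str, context: str) -> Answer:
    """Deterministic stand-in for a real LLM call (illustration only)."""
    label = "toxic" if "hurt" in text else "benign"
    confidence = 0.9 if "threat" in context else 0.5
    return Answer(label, confidence, f"judged under: {context}")


# Hypothetical two-level tree: a coarse question, then a finer one.
root = ContextNode(
    "Is this text toxic?",
    {"toxic": ContextNode("Does it contain a threat of violence?")},
)
```

With the toy model, `classify_with_reprompting("I will hurt you", toy_llm, root)` issues a second prompt with the finer context because the first answer falls below the threshold, whereas an input with no matching child node returns after a single prompt even at low confidence.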

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Mar 24, 2024
Citations: 1

