INTRODUCTION: With the advancement of large language models (LLMs), there have been increasing concerns around the usage of these models, as they can generate human-like text and perform a range of tasks such as code generation, question answering, essay writing, and even drafting text for research papers.

OBJECTIVES: The generated text depends on the original data on which the models are trained, which may be copyright-protected or may be personal/private data. A detailed description of such concerns and various potential mitigations is given in 'Generative language models and automated influence operations: Emerging threats and potential mitigations'.

METHODS: Addressing these concerns is paramount for the usability of LLMs. Researchers have explored several directions, and one interesting line of work is building content-aware models: the model is aware of the type of content it is learning from, and of what type of content should be used to generate a response to a specific query.

RESULTS: In our work, we explored this direction by applying poisoning techniques to contaminate data and then applying genetic algorithms to extract the non-poisoned content from the poisoned content, so that the non-poisoned content can generate a good response when paraphrased.

CONCLUSION: While we demonstrated the idea using poisoning techniques and aimed to make the model aware of copyrighted content, the same approach can be extended to detect other types of content, or to any other use case where content awareness is required.
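
The abstract describes the genetic-algorithm step only at a high level. As an illustration, the following is a minimal sketch of such a selection scheme, assuming each document carries a precomputed poison score in [0, 1] (a hypothetical stand-in for an actual poison detector, which is not specified here) and that the goal is to evolve a binary mask keeping as much low-poison content as possible. The population size, fitness function, and operators below are illustrative assumptions, not the authors' actual implementation.

    import random

    # Hypothetical stand-in for a poison detector: each document carries a
    # precomputed "poison score" in [0, 1]; higher means more likely poisoned.
    DOCS = [("doc%02d" % i, random.random()) for i in range(20)]

    POP_SIZE, GENERATIONS, MUTATION_RATE = 30, 50, 0.05

    def fitness(mask):
        """Reward keeping many documents while penalizing poisoned ones."""
        kept = [score for bit, (_, score) in zip(mask, DOCS) if bit]
        if not kept:
            return 0.0
        # Favor large, clean subsets: size bonus minus average poison score.
        return len(kept) - 10.0 * (sum(kept) / len(kept))

    def crossover(a, b):
        """Single-point crossover of two selection masks."""
        point = random.randrange(1, len(a))
        return a[:point] + b[point:]

    def mutate(mask):
        """Flip each bit with a small probability."""
        return [bit ^ (random.random() < MUTATION_RATE) for bit in mask]

    def tournament(pop, k=3):
        """Pick the fittest of k randomly sampled individuals."""
        return max(random.sample(pop, k), key=fitness)

    # Evolve binary masks over the document set.
    population = [[random.randint(0, 1) for _ in DOCS] for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        population = [
            mutate(crossover(tournament(population), tournament(population)))
            for _ in range(POP_SIZE)
        ]

    best = max(population, key=fitness)
    clean = [name for bit, (name, _) in zip(best, DOCS) if bit]
    print("Selected as non-poisoned:", clean)

In a realistic setting, the poison scores would come from the content-aware model itself, and the surviving subset would then be paraphrased to generate the final response, as described in the RESULTS section.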