From Deep Neural Language Models to LLMs

Andrei Kucharavy

doi:10.1007/978-3-031-54827-7_1

Open Access

Publication Date: Jan 1, 2024
Citations: 1
License type: CC BY 4.0

#Deep Neural Language Models #Deep Neural Models

Abstract

Large Language Models (LLMs) are scaled-up instances of Deep Neural Language Models, a type of Natural Language Processing (NLP) tool trained with Machine Learning (ML). To understand how LLMs work, we must examine the technologies they build on and what sets them apart. To this end, we first give an overview of the history of LLM development, starting from the 1990s. We then cover the counterintuitive, purely probabilistic nature of Deep Neural Language Models, continuous token embedding spaces, models based on recurrent neural networks, what self-attention brought to the table, and finally why scaling Deep Neural Language Models led to a qualitative change that warranted a new name for the technology.
