Abstract

In this lesson you will first learn what topic modeling is and why you might want to employ it in your research. You will then learn how to install and work with the MALLET natural language processing toolkit to do so.

Highlights

  • This lesson requires you to use the command line

  • If you have no previous experience using the command line, you may find it helpful to work through the Programming Historian Bash Command Line lesson first

  • (We would like to thank Robert Nelson and Elijah Meeks for hints and tips in getting MALLET to run for us the first time, and for their examples of what can be done with this tool.)

Summary

Lesson Goals

In this lesson you will first learn what topic modeling is and why you might want to employ it in your research. You will then learn how to install and work with the MALLET natural language processing toolkit to do so. We will run the topic modeller on some example files and look at the kinds of output that MALLET produces. This will give us a good idea of how it can be used on a corpus of texts to identify topics found in the documents without reading them individually. (We would like to thank Robert Nelson and Elijah Meeks for hints and tips in getting MALLET to run for us the first time, and for their examples of what can be done with this tool.)

What is Topic Modeling And For Whom is this Useful?
Installing MALLET
User Profiles Desktop settings related to your logon
Running MALLET using the Command Line
Mac Instructions
Typing in MALLET Commands
Working with data
Importing data
For Mac
Issues with Big Data
Your first topic model
If when you ran the topic model routine you had included
The composition of your documents
Getting your own texts into MALLET
Further Reading about Topic Modeling
Suggested Citation