While a range of freely available tools and excellent tutorials have made it easier to apply computational text analysis techniques, researchers may still find themselves struggling with questions about how to build their corpus and interpret their results. This course will approach text analysis from object to presentation. It covers not just the moment of feeding text to a machine and getting results back, but also the processes of managing materials and grappling with the meaning of results. Our class will be as much about the decisions and practices of text mining as about tools or step-by-step processes.
Students who take this course will be able to:
- Find and prepare texts for analysis.
- Store, access, and document their text objects and data.
- Discuss their corpus-building decisions and textual data in ways that are methodologically and disciplinarily sound.
- Identify appropriate text analysis methods for a given question.
- Engage in text analysis methods that use word frequency, word location, and natural language processing.
- For a few text analysis methods, articulate the underlying statistical, computational, and linguistic principles and how they intersect with humanistic approaches to texts.
- Present the results of their computational work to non-experts.
We will primarily use off-the-shelf tools that you can download or access for free (though one section will make use of R or Python). In some parts of the course, you will be able to develop your own materials; however, we will mostly work together from shared data sets that the instructors will provide. This course is appropriate for people at all levels of technical expertise. Students should have administrative rights to install R and other software on their laptops.
College of Liberal Arts Building 1.302C