
  • Katie Rawson

    Director of Learning Innovation, University of Pennsylvania Libraries


This course will take up two methods of text analysis for deeper study and exploration: machine learning and natural language processing (NLP). We will delve into two case studies, one focusing on topic modeling (under the machine learning rubric) and the other on part-of-speech tagging (under the NLP rubric). These methods treat text in two different ways: machine learning applies statistical models to words, while NLP uses linguistic models. Through hands-on activities and discussion, we will examine the case studies from question development and data curation to visualization and argument. Along the way, we will consider other machine learning and NLP techniques in the broader text analysis and digital humanities landscape.

This class is aimed at people with some familiarity with text analysis methods and some familiarity with coding. We will be working in R. If you are a Python user (or a user of any other language), you will be fine in this class. If you have never used a programming language before and would like to be part of this class, I am happy to suggest tutorials so you can hit the ground running with us. If you have any questions, please contact me.


2115J University Library