Instructors

  • Katie Rawson

    Director of Learning Innovation, University of Pennsylvania Libraries

  • Scott Enderle

    Digital Humanities Specialist, University of Pennsylvania Libraries

Description

This class will examine methods and practices for text analysis. Freely available tools and excellent tutorials have made it easier to apply computational text analysis techniques; however, researchers may still find themselves struggling with how to build their corpus, decide upon a method, and interpret results. We will survey the how and why of a variety of commonly used methods (e.g., word distribution, topic modeling, natural language processing) as well as how to develop and manage a collection of texts.

Students who take this course will be able to:

  • Find and prepare texts for analysis.
  • Store, access, and document their text objects and data.
  • Discuss their corpus-building decisions and textual data in ways that are methodologically and disciplinarily sound.
  • Identify appropriate text analysis methods for a given question.
  • Engage in text analysis methods that use word frequency, word location, and natural language processing.
  • For a few text analysis methods, articulate the statistical, computational, and linguistic principles involved and how they intersect with humanistic approaches to texts.

We will use a mixture of free, off-the-shelf tools and scripts in R and Python (you don’t need to know R or Python to take the class). We will primarily work together from shared data sets that the instructors will provide. This course is appropriate for people at all levels of technical expertise. Students should have administrative rights to install software on their laptops.

Location

Room 627, Kislak Center, Van Pelt Library