
Applied Language Technology

This open educational resource consists of a two-course module that provides humanities majors with a basic understanding of language technology and the practical skills needed to apply language technology using Python. The module is intended to empower the students by showing that language technology is accessible and applicable to research in the humanities.

 

Learning Outcomes

By the end of this course, learners will be able to:
  • Use Jupyter Notebooks
  • Write basic programs in Python
  • Apply basic techniques to process, store, annotate and analyse different types of texts
  • Evaluate the performance of NLP algorithms

 

Author(s)

Tuomo Hiippala

Role: Assistant Professor in English Language and Digital Humanities

Department of Languages

Faculty of Arts, University of Helsinki

Finland
 

Description of the Training Materials

(Sub)discipline, topic, language(s)
Language technology, digital humanities
 
Python

English

Keywords

language technology, digital humanities, tutorial, beginner, spaCy, Stanza, Universal Dependencies, introduction

URL(s) to Resource
https://applied-language-technology.mooc.fi/html/index.html
YouTube channel
Resource URL Type URL
CLARIN Language Resources

The course materials build on resources distributed through CLARIN, such as Universal Dependencies corpora. The materials refer to the CLARIN website for further study, highlighting the digital humanities course registry.

Structure and Duration

The learning materials constitute a 10 ECTS module comprising two 5-credit courses. The materials are divided into two parts, and each section within a part corresponds to one week of study.

Target Audience

Students of languages, linguistics and communication

Expertise (Skill) Level
Beginner/intermediate
No previous experience in language technology or Python is required.
Facilities Required

All learning materials and their source code are available on GitHub.

The learning materials are rendered from Jupyter Notebooks, and feature a Binder integration so anyone can launch an interactive Jupyter Notebook running in their browser.

Format

The materials adopt a hands-on approach to maintain interest, teaching Python basics and applying basic techniques in natural language processing to diverse texts. In addition, the materials are accompanied by short YouTube videos, which explain the techniques step-by-step. 

University Course(s) in which the materials have been used
 

Working with Text in Python, 5 ECTS

Natural Language Processing for Linguists, 5 ECTS

Licence and (re)use
All learning materials, including YouTube videos, are licensed under CC BY-NC 4.0. Course exercises are available on request.
Creation Date

October 2020

Last Modification Date

May 17, 2021
 

Experience with Using CLARIN Language Resources in Teaching 

CLARIN language resources are crucial to the course, as they are used to train the language models that the materials rely on. During the two-course module, however, the intended audience does not initially engage with the resources directly but interacts with them through models provided by Python libraries such as spaCy and Stanza. By the end of the course, the students should nevertheless be able to load corpora hosted by CLARIN into Python and manipulate them.
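
As a rough illustration of what loading such a corpus can involve, the sketch below reads a small sample in the CoNLL-U format, which Universal Dependencies corpora use, into plain Python dictionaries. The sample sentence and the helper name parse_conllu are assumptions made for this example and are not taken from the course materials. The course itself works with spaCy and Stanza rather than hand-written parsing.

```python
# Minimal sketch, standard library only. The sample data and the helper
# name `parse_conllu` are hypothetical and serve illustration purposes.

# The ten tab-separated columns defined by the CoNLL-U format.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]

# A hand-written one-sentence sample in CoNLL-U format.
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)


def parse_conllu(text):
    """Return a list of sentences, each a list of token dictionaries."""
    sentences, tokens = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line marks the end of a sentence.
            if tokens:
                sentences.append(tokens)
                tokens = []
        elif line.startswith("#"):
            # Comment lines carry sentence-level metadata; skipped here.
            continue
        else:
            values = line.split("\t")
            if len(values) != len(CONLLU_FIELDS):
                raise ValueError(f"Expected 10 columns, got {len(values)}")
            tokens.append(dict(zip(CONLLU_FIELDS, values)))
    # Keep the last sentence if the text does not end with a blank line.
    if tokens:
        sentences.append(tokens)
    return sentences


sentences = parse_conllu(SAMPLE)
# Pair each word form with its universal part-of-speech tag.
pairs = [(token["form"], token["upos"]) for token in sentences[0]]
print(pairs)
```

To read a real treebank, the same function could be applied to the contents of a downloaded .conllu file opened with UTF-8 encoding.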
 

Additional Information 

The plan is to develop the learning materials into a MOOC hosted by the University of Helsinki.
 

Cite this Work

Hiippala, Tuomo (2021) Applied Language Technology: NLP for the Humanities. In David Jurgens, Varada Kolhatkar, Lucy Li, Margot Mieskes and Ted Pedersen (eds) Proceedings of the Fifth Workshop on Teaching NLP. Association for Computational Linguistics, 46–48. DOI: 10.18653/v1/2021.teachingnlp-1.5.
 

Contact Information

Teachers who reuse and adapt this training material can share their feedback via training[at]clarin[dot]eu.