CLARIN Café on Computer-Assisted Pragmatic Annotation of Native and Learner Corpora


General Information

This CLARIN Café is organised by Nicola Brocca, Joseph Wang-Kathrein, Eva Maria Hirzinger-Unterrainer (Universität Innsbruck), Elena Nuzzo and Diego Cortés Velásquez (Università Roma Tre) in collaboration with the Austrian Academy of Sciences. The café will be hosted by Francesca Frontini.
  • Date: 12/03/2024
  • Time: 14:00 - 16:00 (CET)
  • Venue: CLARIN virtual Zoom meeting
  • Twitter hashtag: #CLARINcafe
A full overview of planned CLARIN Café sessions can be found on the CLARIN Café page.


The DisDir and Ladder corpora consist of elicited speech acts of cancellation and request in Italian L1 and L2, German, and Colombian Spanish. They have been collected and partially manually annotated for research in transcultural pragmatics and second language education. Based on these data, the LadderWeb project aims to:

  1. Train machine-learning-based software for the automatic annotation of pragmatic categories in requests and cancellations in Italian L1 and L2 and the other aforementioned languages.
  2. Annotate part of the corpus with AI support and archive all the elicited data in the ARCHE CLARIN repository.
  3. Make the corpus accessible and queryable for learners and practitioners through a web interface.

Unlike previous attempts at pragmatic annotation, the LadderWeb project is based on elicited data that constrain speech acts, control extratextual contexts and allow the detection of implicit information. This makes it possible to circumvent the traditional problems associated with pragmatic annotation and distinguishes the project as a pioneering achievement in the annotation of native and learner corpora.

This presentation will highlight the steps accomplished to collect and (automatically) annotate the corpus as well as develop the code for the automatic annotation. It will also include a demo and best practice recommendations for educators using the web interface.

How to join

You can register for free using this link in order to receive the meeting room details. 


14:00-14:05  Opening and CLARIN 1-0-1 by Francesca Frontini, Member of the CLARIN Board of Directors

14:05-14:15  DisDir and Ladder: Aims and architecture of the corpora

Diego Cortés Velásquez, Elena Nuzzo (Roma Tre), Nicola Brocca (UIBK)

14:15-14:30  Pragmatic categories and annotation challenges

Nicola Brocca, Maria Rudigier, Valentin Spielthenner (UIBK)

14:30-14:40  Q&A

14:40-14:50  Building machine-learning-based software for automatic annotation

Joseph Wang-Kathrein (UIBK)

14:50-15:00  Archiving according to the FAIR principles

Joseph Wang-Kathrein, Nicola Brocca (UIBK)

15:00-15:10  Publication and long-term archiving in CLARIN via the ARCHE repository

Seta Štuhec (ÖAW)

15:10-15:30  LadderWeb: opportunities for practitioners, learners and researchers

Nicola Brocca, Eva Maria Hirzinger-Unterrainer (UIBK)

15:30-16:00  Q&A