CLARIN Resource Families: Manually Annotated Corpora

Submitted by Linda Stokman on 11 November 2020

The CLARIN Resource Families initiative provides a user-friendly overview of the available language resources in the CLARIN infrastructure for researchers from digital humanities, social sciences and human language technologies.

This month CLARIN highlights manually annotated corpora. These corpora are collections of texts in which linguistic information, such as morphosyntactic tags, lemmas, syntactic parses or named entities, has been manually assigned or manually validated. They can be used to train new language annotation tools as well as to test the accuracy of existing ones.
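
As a rough illustration of how such a corpus can be used to test the accuracy of an annotation tool, the sketch below compares a tool's part-of-speech tags with manually assigned ones, token by token. The function name `tagging_accuracy`, the (word, tag) data layout and the toy sentence are assumptions made for this example; they are not taken from any CLARIN corpus or tool.

```python
from typing import List, Tuple

# A token is a (word form, tag) pair. This layout is an assumption for the sketch.
Token = Tuple[str, str]


def tagging_accuracy(gold: List[Token], predicted: List[Token]) -> float:
    """Return the share of tokens whose predicted tag equals the gold tag."""
    if not gold:
        raise ValueError("the gold-standard sample is empty")
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted samples differ in length")

    correct = 0
    for (gold_word, gold_tag), (pred_word, pred_tag) in zip(gold, predicted):
        # Both sequences must describe the same tokens in the same order.
        if gold_word != pred_word:
            raise ValueError(f"token mismatch: {gold_word!r} vs {pred_word!r}")
        if gold_tag == pred_tag:
            correct += 1
    return correct / len(gold)


# Toy data: manually assigned tags versus the output of a hypothetical tagger.
gold = [("The", "DET"), ("dog", "NOUN"), ("barks", "VERB"), (".", "PUNCT")]
predicted = [("The", "DET"), ("dog", "NOUN"), ("barks", "NOUN"), (".", "PUNCT")]

accuracy = tagging_accuracy(gold, predicted)
print(f"Tagging accuracy: {accuracy:.2f}")  # 3 of 4 tags match, so 0.75
```

In practice the gold annotations would be read from a corpus file, and evaluation usually reports more than a single accuracy figure, for example per-tag precision and recall.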