News
The national CLARIN consortium for Hungary, HunCLARIN, joined CLARIN in 2016. The Research Institute for Linguistics was one of the founding partners of CLARIN and took an active role in CLARIN's preparatory phase.
In this Tour de CLARIN blog post, we present an in-depth interview with Ondřej Tichý, a corpus linguist who is deputy chair of the Department of English Linguistics at the Faculty of Arts at Charles University. Dr Tichý collaborates with and is a regular user of the Czech National Corpus.
The Czech National Corpus (CNC) is a long-term academic project whose main aim is to continuously map the Czech language by building, annotating and providing access to a variety of large general-purpose corpora. It was recognized by CLARIN as a Knowledge Centre in 2018.
In this Tour de CLARIN blog post, we present an in-depth interview with Kaja Dobrovoljc, a Slovenian corpus linguist who works at the Centre for Language Resources and Technologies and regularly collaborates with CLARIN.SI and uses its infrastructure.
In this Tour de CLARIN blog post, we present an in-depth interview with Nan Bernstein Ratner, who is, along with Brian MacWhinney, one of the PIs of FluencyBank, a shared database for the study of the development of fluency in typical and disordered populations.
CLARIN Slovenia (CLARIN.SI) has contributed to several user involvement events that presented the results of the project to different user groups.
Read about the CSMTiser, a supervised machine learning tool that performs word normalization by using Character-level Statistical Machine Translation.
TalkBank, which was recognized as a CLARIN Knowledge Centre in 2016, is the world’s largest open access integrated repository for spoken language data. It provides language corpora and other audio resources to support researchers in Psychology, Linguistics, Education, Computer Science, and Speech Pathology.
In 2015, researchers from the Jožef Stefan Institute in Ljubljana, Slovenia released the first emoji sentiment lexicon, called Emoji Sentiment Ranking 1.0, and published it as a resource in the public language resource repository CLARIN.SI. With 78,500 downloads to date, the lexicon is the most downloaded resource in the CLARIN.SI repository.
CLARIN.SI joined CLARIN in 2015 and is a B-certified centre that offers a LINDAT/D-Space repository, which currently contains around 110 language resources for Slovenian as well as for other languages, especially Croatian and Serbian.