


Many people involved in building or using CLARIN resources will be at the 10th Language Resources and Evaluation Conference (LREC), to be held 23-28 May 2016 in Portorož, Slovenia. Below is a snapshot of some of the workshops and papers with which they are involved. You can also follow the latest news from CLARIN at LREC on the Twitter feed @CLARINERIC.


ID Workshop title CLARIN contact
W5 Cross-Platform Text Mining and Natural Language Processing Interoperability Richard Eckart de Castilho (Technische Universität Darmstadt, Germany)
W18 Translation evaluation – From fragmented tools and data sets to an integrated ecosystem Jan Hajic (Charles University, Czech Republic)
W31 Improving Social Inclusion using NLP: Tools and resources Ineke Schuurman (University of Leuven, Belgium)
W9 Resources and ProcessIng of linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric impairments Jens Edlund (KTH - Royal Institute of Technology, Sweden)
W42 Legal Issues Erik Ketzan & Andreas Witt (Institut für Deutsche Sprache, Mannheim, Germany), Stelios Piperidis (Athena Research Center/ILSP, Athens, Greece)
W23 4Real - Research Results Reproducibility and Resources Citation in Science and Technology of Language António Branco (University of Lisbon)
W36 Normalisation and Analysis of Social Media Texts (NormSoMe) Andrius Utka (Vytautas Magnus University, Kaunas)


19 A corpus of images and text in online news Laura Hollink, Adriatik Bedjeti, Martin van Harmelen and Desmond Elliott
73 VPS-GradeUp: Graded Decisions on Usage Patterns Baisa Vít, Cinková Silvie, Krejčová Ema, Vernerová Anna
104 Falling silent, lost for words ... Tracing personal involvement in interviews with Dutch war veterans Henk van den Heuvel and Nelleke Oostdijk
223 Curation of Dutch Regional Dictionaries Nicoline van der Sijs, Eric Sanders, Henk van den Heuvel and Aukje Borkent
306 The SemDaX corpus – sense annotations with scalable sense inventories Bolette Pedersen, Anna Braasch, Anders Johannsen, Héctor Martínez Alonso, Sanni Nimb, Sussi Olsen, Anders Søgaard and Nicolai Hartvig Sørensen
337 South African National Centre for Digital Language Resources Justus Roux
348 Universal Dependencies v1: A Multilingual Treebank Collection Nivre Joakim, de Marneffe Marie-Catherine, Ginter Filip, Goldberg Yoav, Hajič Jan, Manning Christopher, McDonald Ryan, Petrov Slav, Pyysalo Sampo, Silveira Natalia, Tsarfaty Reut, Zeman Daniel
361 Corpus-based diacritic restoration for South Slavic languages Nikola Ljubešić, Tomaž Erjavec and Darja Fišer
362 AfriBooms: An online treebank for Afrikaans Liesbeth Augustinus, Peter Dirix, Daniel Van Niekerk, Ineke Schuurman, Vincent Vandeghinste, Frank Van Eynde and Gerhard Van Huyssteen
401 CINTIL DependencyBank PREMIUM - A corpus of grammatical dependencies for Portuguese Rita de Carvalho, Andreia Querido, Marisa Campos, Rita Valadas Pereira, João Silva and António Branco
419 Using a Language Technology Infrastructure for German in order to Anonymize German Sign Language Corpus Data Thomas Hanke
476 FLAT: constructing a CLARIN compatible home for language resources Menzo Windhouwer, Marc Kemps-Snijders, Paul Trilsbeek, André Moreira, Bas Van der Veen and Guilherme Silva
486 Poly-GrETEL: Cross-Lingual Example-based Querying of Syntactic Constructions Liesbeth Augustinus, Vincent Vandeghinste and Tom Vanallemeersch
502 The BAS speech data repository Uwe Reichel, Florian Schiel, Thomas Kisler and Christoph Draxler
506 Graded and Word-Sense-Disambiguation decisions in Corpus Pattern Analysis: a pilot study Baisa Vít, Cinková Silvie, Krejčová Ema, Vernerová Anna
526 CLARIAH in the Netherlands Jan Odijk
572 European Union Language Resources in Sketch Engine Vít Baisa, Jan Michelfeit and Marek Medveď
596 OCR post-correction evaluation of Early Dutch Books Online -- revisited Martin Reynaert
613 Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries Marta Villegas, Maite Melero, Núria Bel and Jorge Gracia
668 BAS Speech Science Web Services - an Update of Current Developments Thomas Kisler, Uwe Reichel, Florian Schiel, Christoph Draxler, Bernhard Jackl and Nina Pörner
709 If You Even Don't Have a Bit of Bible: Learning Delexicalized POS Taggers Yu Zhiwei, Mareček David, Zeman Daniel, Žabokrtský Zdeněk
766 Fostering the Next Generation of European Language Technology: Recent Developments – Emerging Initiatives – Challenges and Opportunities Georg Rehm, Jan Hajic, Josef van Genabith and Andrejs Vasiļjevs
783 Facilitating metadata interoperability in CLARIN-DK Lene Offersgaard and Dorte Haltrup Hansen
811 Corpus vs. lexicon supervision in morphosyntactic tagging: the case of Slovene Nikola Ljubešić and Tomaž Erjavec
880 The Public License Selector: Making Open Licensing Easier Pawel Kamocki, Pavel Straňák and Michal Sedlák
887 Towards Comparability of Linguistic Graph Banks for Semantic Parsing Oepen Stephan, Kuhlmann Marco, Miyao Yusuke, Zeman Daniel, Cinková Silvie, Flickinger Dan, Hajič Jan, Ivanova Angelina, Urešová Zdeňka
936 Czech Legal Text Treebank 1.0 Vincent Kríž, Barbora Hladká, Zdeňka Urešová
990 CLARIN-EL Web-based Annotation Tool Ioannis Manousos Katakis, Georgios Petasis and Vangelis Karkaletsis
1012 QTLeap WSD/NED corpora: Semantic annotation of parallel corpora in six languages Arantxa Otegi, Nora Aranberri, António Branco, Jan Hajic, Martin Popel, Kiril Simov and Eneko Agirre
1070 NLP Infrastructure for the Lithuanian Language Daiva Vitkutė-Adžgauskienė, Andrius Utka, Darius Amilevičius and Tomas Krilavičius
1131 MWEs in Treebanks: From Survey to Guidelines Victoria Rosén, Koenraad De Smedt, Gyri Smørdal Losnegaard, Eduard Bejček, Agata Savary, Adam Przepiórkowski and Verginica Mititelu
1141 Improving corpus search via parsing Natalia Klyueva and Pavel Straňák
1137 Corpus Query Lingua Franca (CQLF) Piotr Banski, Elena Frick and Andreas Witt
1150 Providing a Catalogue of Language Resources for Commercial Users Bente Maegaard, Lina Henriksen, Andrew Joscelyne, Vesna Lusicky, Margaretha Mazura, Sussi Olsen, Claus Povlsen, and Philippe Wacker
1154 Corpus Analysis based on Structural Phenomena in Texts: Exploiting TEI Encoding for Linguistic Research Susanne Haaf
1236 Hidden resources – strategies to acquire and exploit potential spoken language resources in national archives Jens Edlund and Joakim Gustafson

The full list of papers and workshops for the conference can be seen on the LREC2016 proceedings website.