ParlaCLARIN@LREC2018 | CLARIN ERIC

Monday, 7 May 2018, 00:00

The 2018 ParlaCLARIN workshop is held in Miyazaki (Japan), as part of the 11th edition of the Language Resources and Evaluation Conference (LREC2018).

Workshop Description

Parliamentary data is a major source of socially relevant content. It is available in ever larger quantities, is multilingual, has rich metadata, and is distinctive in that it is essentially a transcription of spoken language produced in controlled circumstances, now increasingly released in audio and video formats as well. In combination, these factors call for solutions for archiving, structuring, synchronization, visualization, querying and analysis. Furthermore, adequate approaches to its exploitation also have to take into account the needs of researchers from vastly different Humanities and Social Sciences fields, such as political science, sociology, history, and psychology.

An inspiring CLARIN-PLUS cross-disciplinary workshop “Working with parliamentary records” [1] that was held in Sofia, Bulgaria, in Spring 2017, and a comprehensive overview of a multitude of the existing parliamentary resources within the CLARIN infrastructure [2] clearly indicated a need for better harmonization, interoperability and comparability of the resources and tools relevant for the study of parliamentary discussions and decisions, not only in Europe but worldwide.

This workshop aims to bring together researchers interested in compiling, annotating, structuring, linking and visualising parliamentary records that are suitable for research in a wide range of disciplines in the Humanities and Social Sciences. We invite unpublished original work focusing on the collection, analysis and processing of parliamentary records.

Objective

Thanks to Freedom of Information Acts, which are supported by the United Nations and in place in over 100 countries worldwide, parliamentary debates are becoming increasingly easy to obtain. They have always been of interest to researchers from a wide range of fields in the Humanities and Social Sciences, both for the potential influence of their content and for the specificities of the formalized, often persuasive and emotional language used in this context. As a consequence, there are many initiatives, at the national and international levels, that aim at compiling and analysing parliamentary data. A recent CLARIN-PLUS survey on parliamentary data identified over 20 corpora of parliamentary records, with over half of them available within the CLARIN infrastructure [3].

Given the maturity, variety, and potential of this type of language data, as well as the rich metadata that complements it, it is urgent to bring together researchers who produce parliamentary corpora and make them available with those who use them for linguistic, historical, political, sociological and other research. The aim is to share methods and approaches for compiling, annotating and exploring these corpora in order to harmonize the compiled resources, ensure current and future comparability of research on national datasets, and promote transnational analyses.

Topics of interest

Topics include but are not limited to:

Creation and annotation of parliamentary data in textual and/or spoken format
Annotation standards and best practices for parliamentary corpora
Accessibility, querying and visualisation of parliamentary data
Text analytics, semantic processing and linking of parliamentary data
Parliamentary corpora and multilinguality
Studies based on parliamentary corpora

Proceedings, photo and video recordings

The workshop proceedings can be found on the ELRA website.
To be cited as: Fišer, D., Eskevich, M. and de Jong, F. (eds). Proceedings of LREC2018 Workshop ParlaCLARIN: Creating and Using Parliamentary Corpora. ELRA, 2018. (ISBN: 978-0-306-40615-7 EAN: 4 003994 155486).
Recordings of all the presentations can be found on the CLARIN VideoLectures channel.
Photos from the event can be found on the official CLARIN Flickr stream.

Programme

9:00 – 9:15	Welcome and introduction
	CLARIN resources for parliamentary discourse research. Darja Fišer, Jakob Lenardič
9:15 – 10:30	Session 1: Creating parliamentary corpora
	1.1. SlovParl 2.0: The Collection of Slovene Parliamentary Debates from the Period of Secession. Andrej Pančur, Mojca Šorn, Tomaž Erjavec
	1.2. Polish Parliamentary Corpus. Maciej Ogrodniczuk
	1.3. ParlAT beta Corpus of Austrian Parliamentary Record. Tanja Wissik, Hannes Pirker
	1.4. A Corpus of Grand National Assembly of Turkish Parliament’s Transcripts. Onur Güngör, Mert Tiftikci, Çağıl Sönmez
10:30 – 11:00	Coffee break
11:00 – 12:00	Keynote talk
	Applying Multi-Perspective Approaches to the Analysis of Parliamentary Data. Cornelia Ilie
12:00 – 13:00	Session 2: Enriching parliamentary corpora
	2.1. UKParl: A Semantified and Topically Organized Corpus of Political Speeches. Federico Nanni, Mahmoud Osman, Yi-Ru Cheng, Simone Paolo Ponzetto, Laura Dietz
	2.2. EuroParl-UdS: Preserving and Extending Metadata in Parliamentary Debates. Mihaela Vela, Elke Teich and Alina Karakanta
	2.3. Annotation of the Corpus of the Saeima with Multilingual Standards. Roberts Darģis, Ilze Auziņa, Uldis Bojārs, Pēteris Paikens, Artūrs Znotiņš
	2.4. A Sentiment-labelled Corpus of Hansard Parliamentary Debate Speeches. Gavin Abercrombie and Riza Batista-Navarro
13:00 – 14:00	Lunch break
14:00 – 15:00	Session 3: Parliamentary data in computational social sciences 1
	3.1. Automatically Labeled Data Generation for Classification of Reputation Defence Strategies. Nona Naderi and Graeme Hirst
	3.2. Exploring the Political Agenda of the Greek Parliament Plenary Sessions. Dimitris Gkoumas, Maria Pontiki, Konstantina Papanikolaou, Haris Papageorgiou
	3.3. Findings from the Hackathon on Understanding Euroscepticism Through the Lens of Textual Data. Federico Nanni, Goran Glavaš, Simone Paolo Ponzetto, Sara Tonelli, Nicolò Conti, Ahmet Aker, Alessio Palmero Aprosio, Arnim Bleier, Benedetta Carlotti, Theresa Gessler, Tim Henrichsen, Dirk Hovy, Christian Kahmann, Mladen Karan, Akitaka Matsuo, Stefano Menini, Dong Nguyen, Andreas Niekler, Lisa Posch, Federico Vegetti, Zeerak Waseem, Tanya Whyte, Nikoleta Yordanova
15:00 – 16:00	Panel: Infrastructural Support for Research on Parliamentary Data
	Panelists: Jan Odijk, Andreas Blaette, Federico Nanni, Cornelia Ilie
16:00 – 16:30	Coffee break
16:30 – 17:30	Session 4: Parliamentary data in computational social sciences 2
	4.1. A Pilot Gender Study of the Danish Parliament Corpus. Dorte Haltrup Hansen, Costanza Navarretta, Lene Offersgaard
	4.2. The Parliamentary Debates as a Resource for the Textometric Study of the French Political Discourse. Sascha Diwersy, Francesca Frontini, Giancarlo Luxardo
	4.3. Using Data Packages to Ship Annotated Corpora of Parliamentary Protocols: The GermaParl R Package. Andreas Blaette
17:30 – 18:00	Closing remarks
20:00 – 22:00	Workshop dinner

Organizing Committee

Darja Fišer, The Faculty of Arts, University of Ljubljana, Slovenia
Franciska de Jong, CLARIN ERIC, The Netherlands
Maria Eskevich, CLARIN ERIC, The Netherlands

The workshop is supported by the CLARIN research infrastructure. To contact the organizers, please mail clarin [at] clarin.eu (Subject: [ParlaCLARIN@LREC2018]).

Programme Committee

in alphabetical order:

Darius Amilevičius, Vytautas Magnus University, Lithuania
Ilze Auziņa, University of Latvia, Latvia
Kaspar Beelen, University of Amsterdam, The Netherlands
Andreas Blätte, University of Duisburg-Essen, Germany
Anastasia Deligiaouri, Western Macedonia University of Applied Sciences, Greece
Griet Depoorter, Dutch Language Institute, Belgium
Francesca Frontini, Université Paul Valéry - Montpellier, France
Katerina T. Frantzi, University of the Aegean, Greece
Maria Gavriilidou, ILSP/Athena RC, Greece
Goran Glavaš, University of Mannheim, Germany
Barbora Hladka, Charles University, Czech Republic
Laura Hollink, Centrum Wiskunde & Informatica, The Netherlands
Caspar Jordan, Swedish National Data Service, Sweden
Martijn Kleppe, National Library of the Netherlands, The Netherlands
Krister Lindén, University of Helsinki, Finland
Bente Maegaard, University of Copenhagen, Denmark
Maarten Marx, University of Amsterdam, The Netherlands
Karlheinz Moerth, Austrian Academy of Sciences, Austria
Monica Monachini, National Research Council of Italy, Italy
Federico Nanni, University of Mannheim, Germany
Jan Odijk, Utrecht University, The Netherlands
Petya Osenova, IICT-BAS and Sofia University "St. Kl. Ohridski", Bulgaria
Simone Paolo Ponzetto, University of Mannheim, Germany
Wim Peters, University of Strathclyde, UK
Stelios Piperidis, Athena RC/ILSP, Greece
Valeria Quochi, National Research Council of Italy, Italy
Ineke Schuurman, KU Leuven, Belgium
Inguna Skadiņa, University of Latvia, Latvia
Sara Tonelli, Fondazione Bruno Kessler, Italy
Jurgita Vaičenonienė, Vytautas Magnus University, Lithuania
Tamás Váradi, Hungarian Academy of Sciences, Hungary
Tanja Wissik, Austrian Academy of Sciences, Austria
Martin Wynne, Bodleian Libraries, University of Oxford, UK

Address

Phoenix Seagaia Resort
Room "Tenju"
Miyazaki, Miyazaki
Japan