Written by Silvia Calamai, Letizia Cirillo, Debora Forconi, Roberta Bianca Luzietti, Rosalba Nodari and Duccio Piccardi about the CLARIN workshop ‘Privacy by Design in Research’ held by Henk van den Heuvel on May 24, 2023.
From 24 to 26 May 2023, the Arezzo campus of the University of Siena hosted the 23rd International Congress of the Italian Association of Applied Linguistics (AItLA 2023). The topic of the conference was ‘Contexts, Practices and Resources of Multimodal Communication’. Around 70 academics from all over Europe attended the conference, including young researchers.
The two keynote lectures were by Simona Pekarek Doehler (‘Why Study SLA from a Multimodal Perspective? The Case of Interactional Competence’) and Henk van den Heuvel (‘GDPR-Compliant Storage of Sensitive Data and the CLARIN K-Centre for Atypical Communication Expertise’).
During the conference, special attention was paid to how different semiotic resources contribute to the construction of meaning in different contexts of use. Specifically, the role of multimodality was explored within ordinary conversation, institutional interaction, and with reference to (second) language acquisition.
The Workshop
The opening event was the CLARIN workshop ‘Privacy by Design in Research’, held by Henk van den Heuvel (CLARIN K-Centre ACE and DELAD CLST, Radboud University, the Netherlands).
The aim of the workshop was to show young researchers how to perform a Data Protection Impact Assessment (DPIA) for an innovative research scenario to enable responsible re-use of archived speech corpora. Participants learned, on both a theoretical and a practical level, which elements and stakeholders should be considered in a DPIA, and how to comply with the guidelines developed within the DELAD initiative.
The workshop began with an introduction on how to account for privacy from the very design stage of research. Van den Heuvel clarified what personal data is, explained the use of direct and indirect identifiers, discussed the purposes of the GDPR, listed special categories of personal data, and provided a set of guidelines to be employed as privacy-enhancing measures.
At the end of the introductory session, Silvia Calamai briefly illustrated an Italian initiative for the preservation of oral archives, a ‘Vademecum’ jointly written by contributors from academia, the main agencies of the Ministry of Cultural Heritage, private and public institutions, and CLARIN-IT.
The second part of the workshop consisted of the discussion of a use case: a language acquisition and attitude project planning to collect linguistic autobiographies from a cohort of underage students. The workshop participants were divided into groups of four and asked to discuss the processing, sharing and reuse of personal and special data.
The third and last part of the workshop consisted of working on a second use case: a European project employing audiovisual recordings of multiparty interactions in a clinical setting. For this activity, participants were asked to play the roles of the different stakeholders of a DPIA session, i.e. the researcher, the data subject, the ethics committee member, the legal expert, the IT/security expert, and the data manager of the archive, thus putting into practice what they had just learnt.
The Participants
The workshop was attended by 12 people, including MA and PhD students, young researchers and oral sources specialists with different linguistics backgrounds (e.g., phonetics, second language acquisition, speech analysis, computational linguistics, conversation analysis, and oral history), from different universities and institutions in Italy, such as the Universities of Pisa, Bolzano, Naples L’Orientale, Modena and Reggio Emilia, and the Italian National Research Council.
Testimonials
Federico Corradini: ‘[...] The topic of privacy and European regulations [are] fundamental in developing the skills of an early career researcher interested in the use of oral materials, although still little systematically addressed in methodological publications [...]. I think that being aware of the data protection regulations, tools and guidelines is useful not only for participating in research projects already underway [...], but also in developing the know-how necessary to plan data collections or write future projects that take into account the privacy aspects of participants to facilitate the recording, storage, sharing and reuse of research data.’
Federico Corradini’s research focuses on the analysis of audio and video recordings of naturally occurring interactions from different settings, including sports commentary and online video gaming. He is currently working on the creation of a digitised archive of interpreter-mediated interactions collected in healthcare and school settings, further developing long-term research carried out by members of the AIM interuniversity Center (University of Modena and Reggio Emilia unit).
Daler Tashkhuzhaev: ‘This workshop helped me to critically analyse the measures for personal data protection implemented in my research. It made me understand the potential weaknesses of the protocol and the possible ways to overcome them. Particularly, I found that pseudonymisation could be a very useful tool to protect the privacy of the most sensitive categories of participants [...]. I also really appreciated the opportunity to discuss with my colleagues the issues [...] in the form of a role-play.’
Daler Tashkhuzhaev graduated in linguistics at the University of Rome ‘La Sapienza’. He is now a PhD student in linguistics at the University of Pisa, working on the possible implications of the Unaccusative Hypothesis on the acquisition of verbal syntax of Italian L2 by native Russian speakers.
Maria Paola Noschese, Marika Lamberti and Sara Marsiglia: ‘This workshop was very useful for us and for our project ‘Ritratti e racconti’, which is based on written and oral linguistic autobiography. In fact, due to the newly acquired skills, we need to change something in our database. The workshop is also, and especially, suitable for people who do not have much knowledge in the area of privacy in research. The role-playing was also very interesting and dynamic and gives a concrete understanding of the issues of using personal data in research.’
Maria Paola Noschese, Marika Lamberti and Sara Marsiglia are MA students from the University of Naples L’Orientale. They work on a project entitled ‘Ritratti e racconti. Rappresentazioni multimodali di repertori linguistici plurali di immigrati adulti’, which aims at investigating multilingual education and translanguaging practices.