Written by Barbara McGillivray
Project Overview
This project was supported by CLARIN via the CLARIN Resource Families funding scheme and ran from November 2022 to June 2023. The project team consisted of Barbara McGillivray (King’s College London) and Fahad Khan (CNR-ILC) as project leads, and Paola Marongiu (University of Neuchâtel) as researcher. The project aimed to define workflows for semantic change research with CLARIN Resource Families (CRF), bringing together existing resources and tools needed to support research in lexical semantic change (LSC), which is the linguistic phenomenon in which words change their meanings over time.
The concept of the workflows was aligned with the CLARIN 2021-2023 strategy in several ways. First, the project aimed to enable multilingual lexical semantic change research investigating language as a carrier of cultural content and information. Second, it aimed to strengthen CLARIN’s role as a pillar supporting social sciences and humanities (SSH) research, given the central role played by LSC in SSH research. Third, it aimed to enable the application of lexical semantic change detection algorithms to multilingual language resources, facilitating advancement in language technology research. Finally, it aimed to improve the discoverability of existing tools and CRFs, specifically those involved in lexical semantic change research, i.e. annotated corpora, dictionaries, language models and algorithms. The figure below shows a high-level diagram of the relation between the new CRF and existing language resources and tools.
Our proposal followed the model of the workflows already available in the Social Sciences and Humanities Open Marketplace, the discovery portal of the Social Sciences and Humanities Open Cloud (SSHOC) project, which brings together resources for Social Sciences and Humanities research communities.
Outcomes
The key outcomes of the project were:
- A report summarising the project’s outputs, including the six workflows we propose for the new CRF.
- McGillivray, Barbara, Khan, Fahad, & Marongiu, Paola. (2023). A new CLARIN Resource Family for lexical semantic change - Final report. Zenodo. https://doi.org/10.5281/zenodo.8156200
- A survey of the state of the art in lexical semantic annotation of diachronic corpora, including relevant tools and resources for semantic change detection research.
- Marongiu, Paola, Khan, Fahad, & McGillivray, Barbara. (2023). Tools and resources for diachronic lexical semantic analyses: state of the art. Zenodo. https://doi.org/10.5281/zenodo.8103978
- An outline of corpus annotation guidelines for diachronic lexical semantics.
- Marongiu, Paola, & McGillivray, Barbara. (2023). Preliminary guidelines for manual annotation of word senses in Latin and ancient Greek corpora. Zenodo. https://doi.org/10.5281/zenodo.7576676
- A blog post summarising the project:
- McGillivray, Barbara. (2023). “Designing a new CLARIN Resource Family for semantic change research” (CLARIN-UK blog).
- The paper ‘A New CLARIN Resource Family for Lexical Semantic Change Research’, which was accepted and will be presented at the 2023 CLARIN annual conference on 16-18 October 2023.
- The CLARIN Café online event ‘A New CLARIN Resource Family for Lexical Semantic Change Research’, which was organised on 5 July 2023 to solicit contributions from the CLARIN community and beyond. Both the slides and the tutorial document for the CLARIN Café have been published on Zenodo:
- Marongiu, Paola, Khan, Fahad, & McGillivray, Barbara. (2023, July 1). CLARIN Café ‘A New CLARIN Resource Family for Lexical Semantic Change research’. Zenodo. https://doi.org/10.5281/zenodo.8104009