
Component Metadata and Concept Definitions

Metadata for Resources and Services in CLARIN

Metadata plays a key role in making language resources and tools findable and accessible, which is one of CLARIN’s primary objectives. High-quality metadata makes resources and tools discoverable, and helps potential users understand the nature and usability of a resource or tool. 

Data repositories at CLARIN centres provide digital resources as well as the associated metadata. The first interaction users have with a resource is typically with its metadata, either through the user interface of a repository system or elsewhere, thanks to the ease of publishing, distributing, aggregating and reusing metadata based on common standards.

 

 


CMDI: A Flexible Metadata Framework

There is no one-size-fits-all solution for metadata. Different communities and domains have different requirements for metadata, and different types of resources call for different properties, terminology, and so on. As a result, many different metadata standards exist, which introduces challenges in terms of interoperability and integration. CLARIN addresses these challenges by offering a framework for the modelling, authoring and exploitation of metadata in a standard yet flexible way. The metadata infrastructure that uses the framework and implements a solution for CLARIN's community is called the Component Metadata Infrastructure (CMDI).

 

Who Uses CMDI?

The ultimate goal of finding, understanding and accessing a resource by means of its metadata is achieved through the actions of contributors acting in a number of distinct roles:

  •  A relatively small number of metadata modellers define so-called metadata profiles and publish these in a central registry. A system of modular building blocks (metadata components) encourages reuse and uniformity. 

  • The resulting profiles can be considered blueprints for metadata records, which are produced by metadata creators, often researchers involved in the creation of the resource itself. Each record describes an individual resource, tool or service. 

  • Repositories are responsible for storing and publishing both resources and metadata. By means of standardised methods, CLARIN and other parties can collect the metadata separately from the resources, for instance to make them searchable via the Virtual Language Observatory.
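
The relationship between components, profiles and records described above can be illustrated with a minimal Python sketch. It is a simplified model for illustration only, not the actual CMDI format or any CLARIN API: the class names, the component and profile names, and the dotted field naming are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """A reusable building block: a named group of metadata fields."""
    name: str
    fields: tuple[str, ...]


@dataclass
class Profile:
    """A blueprint for records, assembled from reusable components."""
    name: str
    components: list[Component] = field(default_factory=list)

    def expected_fields(self) -> set[str]:
        # Qualify each field with its component name to keep fields distinct.
        return {f"{c.name}.{f}" for c in self.components for f in c.fields}

    def missing_fields(self, record: dict[str, str]) -> set[str]:
        # Fields the profile defines that the record leaves out.
        return self.expected_fields() - set(record)


# Hypothetical components, reused by two hypothetical profiles.
contact = Component("Contact", ("name", "email"))
language = Component("Language", ("code",))
corpus_profile = Profile("TextCorpus", [contact, language])
tool_profile = Profile("Tool", [contact])

# A record follows one profile and describes a single resource.
record = {"Contact.name": "A. Researcher", "Language.code": "nld"}
print(sorted(corpus_profile.missing_fields(record)))
```

Because both profiles are built from the same Contact component, records that follow either profile describe contact details in the same way, which is the kind of reuse and uniformity that modular building blocks are meant to encourage.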

Users who wish to find, access or (re)use language resources or technology can do so using CLARIN's infrastructure as well as services offered directly by individual CLARIN centres. In doing so, they benefit from high-quality, interoperable metadata without needing to understand the technical specifics of the underlying standards or infrastructure.

 

Why CMDI?

Some of the main advantages of CLARIN's approach to metadata are:

  • Built on common, broadly supported standards

  • Adaptable to metadata requirements for specific domains, communities and/or resource types

  • A large degree of freedom for the metadata providers and repository managers in terms of chosen technology and workflow

  • Semantic alignment across metadata definitions by means of concept links (references to common, identifiable semantic entities), even where structure, terminology and vocabulary differ

  • Support for various other common metadata standards within CLARIN's central infrastructure
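
To make the idea of concept links more concrete, the following Python sketch shows how two profiles that use different field names can still be searched together when their fields refer to the same concept. Everything in it is hypothetical: the concept identifiers under example.org, the profile names, the field names and the sample records are invented for illustration and do not reflect actual CMDI profiles or registry entries.

```python
# Hypothetical concept identifiers standing in for shared semantic entities.
CONCEPT_LANGUAGE = "https://example.org/concepts/language"
CONCEPT_TITLE = "https://example.org/concepts/title"

# Each (hypothetical) profile links its own field names to shared concepts.
PROFILE_CONCEPTS = {
    "TextCorpus": {"lang": CONCEPT_LANGUAGE, "title": CONCEPT_TITLE},
    "SpeechRecording": {"spokenLanguage": CONCEPT_LANGUAGE, "name": CONCEPT_TITLE},
}

records = [
    {"profile": "TextCorpus", "values": {"lang": "nld", "title": "News corpus"}},
    {"profile": "SpeechRecording", "values": {"spokenLanguage": "nld", "name": "Interviews"}},
    {"profile": "SpeechRecording", "values": {"spokenLanguage": "deu", "name": "Dialogues"}},
]


def values_for_concept(record, concept):
    """Return the record's values for fields linked to the given concept."""
    links = PROFILE_CONCEPTS[record["profile"]]
    return [v for k, v in record["values"].items() if links.get(k) == concept]


def find_by_concept(records, concept, value):
    """Select records whose concept-linked field holds the given value."""
    return [r for r in records if value in values_for_concept(r, concept)]


# Search by concept rather than by field name, across both profiles.
matches = find_by_concept(records, CONCEPT_LANGUAGE, "nld")
print([values_for_concept(r, CONCEPT_TITLE)[0] for r in matches])
```

The search never mentions "lang" or "spokenLanguage" directly. It relies only on the shared concept, so a new profile with yet another field name would be covered as soon as that field is linked to the same concept.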


Technical Details

CLARIN offers services, tools and documentation that help not only metadata modellers and creators, but also infrastructure developers, programmers and repository managers, to use CMDI to make resources findable, accessible and interoperable.

For more information on the technical details, see the For Infrastructure Developers section.

The CMDI landing page offers in-depth information aimed in particular at CMDI modellers, metadata creators and repository managers.