The software applications included in this resource family allow searching, exploring, analysing and visualizing linguistic corpora and texts. Text and corpus analysis lie at the heart of digital scholarship in the humanities and social sciences, and a wide range of software tools are available in this domain. These software tools represent prime examples of the ways in which language technologies can support research across a range of disciplines, and they are therefore central to CLARIN’s mission.
The resource family includes both applications for installation on the user's own computer (desktop) and those accessible via a web browser (online), with key information about each to help users find tools and choose between them for a particular research goal. A 'corpus analysis tool' is defined here in the sense indicated by the late John Sinclair (and others), namely that the basic operations of corpus linguistics involve ‘corpus, concordance, collocation’. We therefore include tools that can at least handle a corpus and show concordances, and that preferably also calculate frequent collocates.
Most of the tools listed so far can do a lot more than this, including generating word frequency lists and keywords, calculating n-grams and clusters, working with linguistic annotation and descriptive metadata, and producing visualizations of distributions of words and features.
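To make the three basic operations concrete, the following is a minimal Python sketch of a keyword-in-context (KWIC) concordance and a raw collocate count over a toy corpus. It is an illustration only and does not reflect how any tool listed here is implemented: the function names, the naive tokenizer and the sample text are all assumptions, and real tools typically rank collocates with association measures rather than raw counts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def concordance(tokens, node, width=3):
    """Return KWIC lines as (left context, node word, right context) tuples."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, window=2):
    """Count the words that occur within `window` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            span = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(span)
    return counts


# Hypothetical toy corpus, used purely for illustration.
corpus = "The cat sat on the mat. The dog sat on the log. The cat saw the dog."
tokens = tokenize(corpus)

# Print each hit with its left and right context aligned around the node word.
for left, node, right in concordance(tokens, "sat"):
    print(f"{left:>20}  [{node}]  {right}")

# Show the three most frequent words found near the node word.
print(collocates(tokens, "sat").most_common(3))
```

Raw co-occurrence counts tend to favour function words such as "the", which is one reason corpus tools usually offer statistical association measures for collocation.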
For comments, changes to the existing content or inclusion of new corpora, send us an email at resource-families [at] clarin.eu.
Corpus Query Tools in the CLARIN Infrastructure
Online Query Tools
Tool | Language | Description | |
---|---|---|---|
Functionality: Querying/concordancing, Stylometry |
Arabic, Bosnian, Croatian, Czech, English, French, German, Hebrew, Italian, Japanese, Portuguese, Serbian, Spanish |
This tool is a deployment of Voyant Tools at SADiLaR. CLARIN Centre: SADiLaR |
|
Functionality: Querying/concordancing, corpus upload |
Arabic, Czech, Chinese, English, French, German, Italian, Japanese, Kannada, Lithuanian, Portuguese, Russian, Spanish, Ukrainian |
The Intelligent Tools for Creating and Analysing Electronic Text Corpora for Humanities Research (IntelliText) project aims to facilitate corpus use for academics working in various areas of the humanities. The project produced a user-friendly corpus interface with an array of easy-to-use functions that will benefit teaching and research in several academic disciplines. It is possible to upload one's own corpus with this tool. An online guide is available. CLARIN Centre: CLARIN-UK |
|
Functionality: Querying/concordancing |
Bulgarian |
This is a dedicated concordancer for the Bulgarian National Reference Corpus. CLARIN Centre: CLARIN-BG |
|
Concordancer of the Croatian National Corpus Functionality: Querying/concordancing |
Croatian |
This is an implementation of NoSketchEngine for the Croatian National Corpus. CLARIN Centre: CLARIN-HR |
|
Functionality: Querying/concordancing |
Czech |
KonText is a basic web application for querying corpora available within the LINDAT/CLARIAH-CZ project. It evaluates simple and complex queries, displays their results as concordance lines, computes frequency distributions, calculates association measures for collocations and supports further work with language data. This LINDAT/CLARIAH-CZ instance is a fork of the KonText application developed by the Institute of the Czech National Corpus, further extended by the Institute of Formal and Applied Linguistics to suit the needs of the LINDAT/CLARIAH-CZ project. It is possible to upload one's own corpus with this tool. KonText is openly developed. Registration is required and Shibboleth log-in is supported. CLARIN Centre: CLARIAH-CZ
|
|
Functionality: Querying/concordancing |
Danish |
This is a web-based concordancer that can be used for corpus queries based on morphosyntactic analysis and various other features. Registration is required. CLARIN Centre: CLARIN-DK |
|
Concordancer of Corpus Gysseling Functionality: Querying/concordancing |
Dutch |
This is a dedicated query tool for the Corpus Gysseling, developed by the Instituut voor de Nederlandse Taal. The backend of the application is the BlackLab Lucene-based search engine developed for corpora with token-based annotation. The web-based frontend is a further development of the corpus-frontend application developed by INT in CLARIN and CLARIAH projects. CLARIN Centre: CLARIAH-NL |
|
Concordancer of Corpus Middelnederlands Functionality: Querying/concordancing |
Dutch |
This is a dedicated query tool for the Corpus Middelnederlands. CLARIN Centre: CLARIAH-NL |
|
Functionality: Querying/concordancing (treebanks) |
Dutch |
GrETEL stands for Greedy Extraction of Trees for Empirical Linguistics. It is a user-friendly search engine for the exploitation of syntactically annotated corpora or treebanks. It is possible to upload one's own corpus with this tool. CLARIN Centre: CLARIAH-NL
|
|
Functionality: Querying/concordancing |
Dutch |
This is an online research portal for historical texts in the Dutch language. Registration is required and Shibboleth log-in is supported. CLARIN Centre: CLARIAH-NL |
|
Functionality: Querying/concordancing |
Dutch |
This is an online corpus retrieval system that allows for analyzing and searching the SoNaR and CGN corpora. Registration is required and Shibboleth log-in is supported. CLARIN Centre: CLARIAH-NL
|
|
Functionality: Querying/concordancing |
Dutch (17th Century) |
This is a dedicated querying tool for the Couranten Corpus, which comprises the seventeenth-century Dutch newspapers, available on Delpher. CLARIN Centre: CLARIAH-NL |
|
Functionality: Querying/concordancing |
English |
This tool is a modified version of CQPweb for the British National Corpus. It allows a number of search options: publication date, text medium, author gender, target audience, genre, author domicile. Registration is required to use the tool. CLARIN Centre: CLARIN-UK |
|
Functionality: Querying/concordancing |
English |
This tool has been developed as part of the CLiC Dickens project, which demonstrates through corpus stylistics how computer-assisted methods can be used to study literary texts and lead to new insights into how readers perceive fictional characters. Further literary texts have been added to the online service. Technical support is offered through clic [at] contacts.birmingham.ac.uk (email). CLARIN Centre: CLARIN-UK
|
|
Functionality: Querying/concordancing, corpus upload and processing |
English, Spanish |
This tool provides a web interface to the English USAS and CLAWS corpus annotation tools, and to standard corpus linguistic methodologies such as frequency lists and concordances. It also extends the keywords method to key grammatical categories and key semantic domains. It is possible to upload one's own corpus with this tool. The tool is free for UK government and for academic researchers in countries on the OECD DAC list; it costs £50 per username per year for non-commercial research and teaching. Technical support is offered here. CLARIN Centre: CLARIN-UK
|
|
Functionality: Querying/concordancing |
English, Arabic, French, Italian, Norwegian, Polish, Latvian |
This is an online implementation of the CQPweb system with a large number of corpora installed. It is possible to upload one's own corpus with this tool. Note that CQPweb will be superseded by Ziggurat, which is under development. Registration is required to use this tool. CLARIN Centre: CLARIN-UK |
|
Concordancer of the Text Corpus of the Institute of the Estonian Language Functionality: Querying/concordancing |
Estonian |
This tool provides a simple interface for a text corpus. The material for the corpus, 10.4 million word forms, was collected haphazardly. Approximately 80% of the texts come from newspapers, which is why the corpus is not representative. The corpus is also untagged, so it is suited mainly to lexical search. CLARIN Centre: CELR |
|
Functionality: Querying/concordancing |
Finnish, Swedish, Russian, English, and more |
This is a web-based concordance tool that can be used for corpus queries based on morphosyntactic analysis and various other features. A large proportion of the corpora in Kielipankki are offered via Korp. User support is available through email. CLARIN Centre: FIN-CLARIN
|
|
Functionality: Querying/concordancing |
German |
This tool is used for querying the German reference corpus DeReKo, as well as several other historical and non-historical corpora. Technical support is offered through cosmas2 [at] ids-mannheim.de (email). CLARIN Centre: CLARIN-D
|
|
Functionality: Querying/concordancing |
German |
This is a tool for browsing DWDS corpora. The DWDS is part of the Center for Digital Lexicography of the German Language (ZDL), funded by the Federal Ministry of Education and Research. It is based at the Berlin-Brandenburg Academy of Sciences. CLARIN Centre: CLARIN-D |
|
Functionality: Querying/concordancing |
German |
This is a corpus analysis platform that is suited for large, multiply annotated corpora and complex search queries independent of particular research questions. Registration is required only for license restricted corpora. CLARIN Centre: CLARIN-D
|
|
Functionality: Querying/concordancing |
Hebrew |
This is a dedicated online environment for querying the Hebrew Bible. CLARIN Centre: CLARIAH-NL |
|
Functionality: Querying/concordancing, corpus upload and analysis |
Language independent |
This tool allows users to upload corpora annotated at the token level for (extended) part of speech, lemma and word form in FoLiA or format, after which the corpus can be searched for these properties through an interface similar to that of the Corpus of Contemporary Dutch. CLARIN Centre: CLARIAH-NL |
|
Functionality: Querying/concordancing (non-parallel and parallel) |
Language independent |
This tool allows querying of texts and corpora, supporting both basic information retrieval and advanced search. It allows the query system's functionality to be customized and also provides indexing for morpho-syntactically annotated texts. The system can handle several types of text annotation and can also produce concordances for parallel bilingual corpora. CLARIN Centre: CLARIN-IT |
|
Functionality: Querying/concordancing |
Language independent |
This is a corpus management and analysis system for annotated corpora, with a sophisticated query language. It is a reimplementation of Corpuscle featuring an improved user experience and many new features that is now available as a Meurer (2012) |
|
Functionality: Querying/concordancing and text analysis |
Language independent |
Glossa offers a modern, simple and functional search interface with advanced post-processing possibilities for written, multilingual and speech corpora. Glossa is developed at the Text Laboratory, Department of Linguistics and Scandinavian Studies, University of Oslo, with support from the Norwegian contribution to the CLARIN infrastructure, CLARINO. Glossa is also freely available for download from GitHub and is easy to install on one's own server. Glossa is search-engine agnostic and comes with support for the IMS Corpus Workbench and CLARIN Federated Content Search out of the box. CLARIN Centre: CLARINO Text Laboratory Centre |
|
Functionality: Querying/concordancing (treebanks) |
Language independent |
INESS is the Norwegian Infrastructure for the Exploration of Syntax and Semantics. INESS offers an open, interactive, language independent platform for building, accessing, searching and visualizing treebanks. INESS offers a user guide for querying their treebanks. CLARIN Centre: CLARINO
|
|
Functionality: Querying/concordancing, Stylometry |
Language independent |
This tool constitutes a deployment of Voyant Tools at CLARIN-DK. CLARIN Centre: CLARIN-DK |
|
Kontext at the Centre of Latvian language resources and tools Functionality: Querying/concordancing |
Latvian |
This tool corresponds to an implementation of LINDAT's KonText for Latvian resources. Eight Latvian corpora can be searched with this tool. CLARIN Centre: CLARIN-LV |
|
Latvian National Corpora Collection (LNCC) Functionality: Querying/concordancing |
Latvian, Latgalian, Lithuanian |
The Latvian National Corpora Collection (LNCC) is a diverse collection of corpora representing both written and spoken language. LNCC covers various use cases and all the important text types and genres. It is a continuous multi-institutional and multi-project effort, supported by the digital humanities and language technology communities in Latvia. Currently, 34 corpora developed by 13 institutions are available in the LNCC. Most of the corpora are annotated with a uniform morpho-syntactic annotation scheme and included in the federated search. The federated search combines multiple corpora from two corpus indexer instances (endpoints) maintained by IMCS UL and NLL, and includes 28 corpora (2.4 billion tokens). CLARIN Centre: CLARIN-LV
|
|
Functionality: Querying/concordancing, corpus upload and processing |
Multiple |
This is a web-based system for viewing, creating, and editing corpora with both rich textual mark-up and linguistic annotation. For visitors, the system provides a graphical user interface in which the annotated document can be visualized in a number of different ways. For administrators of the corpus, TEITOK uses the same interface to allow easy editing of the underlying XML document, meaning administrators can correct their corpus while they are consulting it. Registration is required and Shibboleth log-in is supported. User documentation is available. CLARIN Centre: CLARIAH-CZ
|
|
Functionality: Querying/concordancing/analysis |
Norwegian Bokmål, Norwegian Nynorsk, Northern Sami, Lule Sami, Southern Sami |
This collection of tools corresponds to a , Python package and web applications allowing a user to build corpora from the vast digital collections of the National Library of Norway (currently ca. 160 billion words). Users get concordances, frequency lists and co-occurrence data. User support is available through email. CLARIN Centre: CLARINO |
|
Functionality: Querying/concordancing |
Portuguese |
This is a freely available online concordancing service to support research use of the CINTIL Corpus. The CINTIL concordancer allows patterns to be used to specify the occurrences to be retrieved. This makes it possible to uncover linguistic structures of high complexity and to use the service as a powerful research tool. CLARIN Centre: PORTULAN CLARIN
|
|
Functionality: Querying/concordancing |
Slovenian, Croatian, Bosnian, Serbian, Montenegrin, Macedonian, Serbo-Croatian, Bulgarian, Czech, Slovak, Polish, English, Danish, Dutch, Estonian, Finnish, French, Gaelic, German, Greek, Hungarian, Icelandic, Italian, Japanese, Latvian, Lithuanian, Portu |
This is the CLARIN.SI installation of LINDAT's KonText, comprised of the KonText front-end developed by the Czech National Corpus team and the Manatee back-end, developed by Lexical Computing. This installation offers over 50 richly annotated corpora in Slovenian and other languages. Shibboleth log-in is supported. CLARIN Centre: CLARIN.SI |
|
Functionality: Querying/concordancing |
Slovenian, Croatian, Bosnian, Serbian, Montenegrin, Macedonian, Serbo-Croatian, Bulgarian, Czech, Slovak, Polish, English, Danish, Dutch, Estonian, Finnish, French, Gaelic, German, Greek, Hungarian, Icelandic, Italian, Japanese, Latvian, Lithuanian, Portu |
This is an open-source version of the commercial Sketch Engine, produced by Lexical Computing. This installation of noSketch Engine at CLARIN.SI offers over 50 richly annotated corpora in Slovenian and other languages. CLARIN Centre: CLARIN.SI |
|
Functionality: Querying/concordancing |
Swedish |
This is Språkbanken's corpus tool for searching in large amounts of texts, including newspapers, novels and social media. CLARIN Centre: SWE-CLARIN
|
Desktop Tools
Tool | Language | Description | |
---|---|---|---|
Functionality: Concordancing/querying |
Language independent |
#LancsBox is a new-generation software package for the analysis of language data and corpora developed at Lancaster University. The latest version, #LancsBox X, has increased functionality for XML texts. A user guide is available in English, French and Japanese, along with instructional videos. See here. CLARIN Centre: CLARIN-UK
|
|
Functionality: Concordancing/querying |
Language independent |
The CLAN Programs are downloaded, installed, and used as a single application. Functionally, however, CLAN has two parts. The first part is the CLAN editor which can be used to edit files in either CHAT or CA (Conversation Analysis) format. The editor also provides a wide range of additional functions, such as audio and video playback, linkage to audio and video, fonts for Roman and non-Roman orthographies, data validation, adding codes to files, and shipping data to other programs. The second part of CLAN is the set of data analysis programs. These programs are run from a separate window called the Commands window. The results of the analytic programs are sent to the CLAN Output window. The tool is only compatible with TalkBank corpora that have CHAT annotation. An online manual is available. CLARIN Centre: TalkBank |
|
Functionality: Concordancing/querying, corpus building |
Language independent |
This tool is an XML-based system for corpus linguistics, primarily for corpus construction, but also with functionality for analysing and exploring corpora. The support team is reachable through clark-support [at] bultreebank.org (email). A user manual is also available. CLARIN Centre: CLARIN-BG
|
|
Functionality: Concordancing/querying |
Language independent |
This tool allows for text and corpus analysis. CLARIN Centre: CLARIN-UK |
|
Q-CAT Corpus Annotation Tool 1.5 Functionality: Annotating/concordancing/querying/listening to audio recordings |
Language independent |
The tool allows for manual linguistic annotation of corpora and advanced queries on top of these annotations. It has been used in various annotation campaigns related to the ssj500k reference training corpus of Slovenian, covering named entities, dependency syntax, semantic roles and multi-word expressions, but it can also be used for adding new annotation layers of various types to this or other language corpora. Q-CAT is a .NET application that runs on the Windows operating system. This resource is available for download from the CLARIN.SI repository. CLARIN Centre: CLARIN.SI
|
Corpus Query Tools Outside CLARIN
Online Query Tools
Tool | Language | Description | |
---|---|---|---|
Functionality: Querying/concordancing, Stylometry |
Arabic, Bosnian, Croatian, Czech, English, French, German, Hebrew, Italian, Japanese, Portuguese, Russian, Serbian, Spanish |
This is a web-based text reading and analysis environment. It is a scholarly project that is designed to facilitate reading and interpretive practices for digital humanities students and scholars as well as for the general public. It is possible to upload one's own corpus with this tool. The interface is available in a number of languages. An online user guide is available. CLARIN Centre: External |
|
Concordancer of Corpus Hedendaags Nederlands (Corpus of Contemporary Dutch) Functionality: Querying/concordancing |
Dutch |
This is a dedicated query tool, built on BlackLab software, for Corpus Hedendaags Nederlands (Corpus of Contemporary Dutch), a corpus of more than 800,000 texts taken from newspapers, magazines, news broadcasts and legal writings (1814–2013). The corpus is a combination of the 5, 27 and 38 million word corpora and the PAROLE Corpus, supplemented with newspaper texts from NRC and De Standaard (until 2013). Registration is required for using this tool. Shibboleth log-in is supported. CLARIN Centre: External |
|
Functionality: Querying/concordancing (treebanks) |
Dutch |
This is an application for searching in treebanks (i.e. text corpora in which each sentence has been assigned a syntactic structure) and for analysing the search results. It is possible to upload one's own corpus with this tool, for which registration is required. CLARIN Centre: External
|
|
Functionality: Querying/concordancing |
English |
This is a dedicated concordancer for the Corpus of Australian and New Zealand Spoken English. The corpus contains 195 million words of geolocated automatic speech recognition transcripts of video content from local governments in Australia and New Zealand, created for the study of lexical, grammatical, phonetic, and discourse-pragmatic phenomena of spoken language. Additionally, the complete textual content of the corpus, audio files and forced alignments in Praat's TextGrid format are provided for most transcripts. The corpus can be accessed through the CLARIN Service Provider Federation. CLARIN Centre: External
|
|
Functionality: Querying/concordancing |
English |
This is a tool for browsing the corpora available on english-corpora.org, which were formerly known as the BYU (Brigham Young University) corpora. CLARIN Centre: External
|
|
Functionality: Querying/concordancing, corpus upload and analysis |
English, French |
This tool includes a concordancer, vocabulary profiler, exercise maker, interactive exercises, and much more. It is possible to upload one's own corpus with this tool (10 MB limit). CLARIN Centre: External |
|
SKell (SKetch Engine for language learners) Functionality: Querying/concordancing |
English, Russian, German, Italian, Czech, Estonian |
This is a simple tool for students and teachers of English to easily check whether or how a particular phrase or a word is used by real speakers of English. CLARIN Centre: External
|
|
Functionality: Querying/concordancing |
French, English |
This tool corresponds to a number of different TXM portals running at various sites and with a number of different corpora. TXM offers online analysis tools for querying language corpora. The interface is in French. CLARIN Centre: External |
|
Functionality: Querying/concordancing, corpus upload and analysis |
German |
The acronym CATMA stands for Computer Assisted Text Markup and Analysis. It is possible to upload one's own corpus with this tool. CLARIN Centre: External |
|
Functionality: Querying/concordancing |
Language independent |
This is a dedicated concordancing tool. CLARIN Centre: External |
|
Functionality: Querying/concordancing |
Language independent |
This tool gives researchers access to a large collection (corpus) of newspaper articles spanning three decades. The tool has been created by linguists to encourage curiosity in language learners. WebCorp Learn promotes playful and context-based inductive learning and enables you to discover language through exploratory experimentation. Registration is required. CLARIN Centre: External |
|
Webcorp LSE (Linguist's Search Engine) Functionality: Querying/concordancing |
Language independent |
This is a dedicated tool for the study of language on the web. The corpora were built by crawling the web and extracting textual content from web pages. Searches can be performed to find words, lemmas or phrases, including pattern matching, wildcards and part-of-speech. Results are given as concordance lines in KWIC format. Post-search analyses are possible including time series, collocation tables, sorting and summaries of meta-data from the matched web pages. It is possible to upload one's own corpus with this tool. Registration is required. CLARIN Centre: External |
|
Functionality: Querying/concordancing, analysis, visualizations |
Multiple |
I-Analyzer allows searching and exploring text corpora, visualizing trends, and downloading tables of text and metadata for further analysis. I-Analyzer is open-source software and freely available. CLARIN Centre: External |
|
Functionality: Querying/concordancing, corpus upload and processing |
Multiple |
Sketch Engine is a commercial online corpus analysis application, used by linguists, lexicographers, translators, students and teachers. Sketch Engine contains 600 ready-to-use corpora in 90+ languages. It is possible to upload one's own corpus with this tool. Registration is required and Shibboleth log-in is supported. Support is offered via email. CLARIN Centre: External
|
|
National Corpus of Polish (IPI PAN search engine) Functionality: Querying/concordancing |
Polish |
This is a dedicated concordancer for NKJP corpora. CLARIN Centre: External |
|
National Corpus of Polish (Pelcra search engine) Functionality: Querying/concordancing |
Polish |
This is a dedicated concordancer for NKJP corpora. CLARIN Centre: External |
|
Concordancer of O corpus do português Functionality: Querying/concordancing |
Portuguese |
This is a dedicated concordancer for the Corpus of Portuguese developed by Mark Davies. CLARIN Centre: External
|
|
Functionality: Querying/concordancing |
Romanian |
This tool is used to query the Reference Corpus for Contemporary Romanian Language CoRoLa. CLARIN Centre: External
|
|
Concordancer of the Corpus del Español Functionality: Querying/concordancing |
Spanish |
This is a querying tool for the corpora from Corpus del Español, which provide billions of words of recent data from 21 Spanish-speaking countries. There are four different corpora in the Corpus del Español. CLARIN Centre: External
|
Desktop Tools
Tool | Language | Description | |
---|---|---|---|
Functionality: Concordancing/querying |
Language independent |
This is a multilingual concordance tool. Originally developed for native Arabic concordancing, it possesses basic concordance functionality, as well as English and Arabic interfaces. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a freeware corpus analysis toolkit for concordancing and text analysis. Online videos and manuals are available from the creator and the community (Google Group). CLARIN Centre: External |
|
Functionality: Parallel Concordancing/querying |
Language independent |
This is a freeware parallel corpus analysis toolkit for concordancing and text analysis using UTF-8 encoded text files. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a concordance program that runs natively on macOS 11.3 or later and can generate KWIC concordance lines, word clusters, collocation analyses, and word counts. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This tool is a Windows software program that can be used to find collocations or terms in a corpus. It is a commercial tool. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This tool is a corpus linguistics software package which is specifically designed to find all the co-occurrences of words in a text or corpus irrespective of variation. This is a commercial tool, available for purchase on optical disc. CLARIN Centre: External
|
|
Functionality: Concordancing/querying |
Language independent |
This is a free corpus query tool for linguists, lexicographers, translators, and anybody who wishes to search and analyse a text corpus. The tool works with any corpus, with installers for a number of widely used ones. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a tool for doing corpus linguistics. It enables parsing, concordancing and keywording, including concordance by searching for combinations of lexical and grammatical features, and keywording of lemmas, of subcorpora compared to corpora, or of words in certain positions within clauses. corpkit leverages a number of sophisticated programming libraries, including pandas, matplotlib, scipy, Tkinter, tkintertable and Stanford CoreNLP. CLARIN Centre: External |
|
Functionality: Concordancing/querying, text and data mining |
Language independent |
This tool is intended for corpus linguistics and for text and data mining. CLARIN Centre: External |
|
Functionality: Concordancing/querying, corpus compilation |
Language independent |
This tool can be used to compile text corpora and to carry out retrieval tasks on any corpus or selection of text files, no matter what their source or how they are organised. The tool is designed to have a maximally open architecture and can be used straight away to examine any texts users may have access to. CLARIN Centre: External
|
|
Functionality: Concordancing/querying |
Language independent |
This is a collection of open-source tools for managing and querying large text corpora (up to 2 billion words) with linguistic annotations. Its central component is the flexible and efficient query processor CQP. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
EXAKT (‘EXMARaLDA Analysis- and Concordance Tool’) is the query and analysis tool for EXMARaLDA corpora. It can also be used for corpora created with other tools (FOLKER, Transcriber, ELAN). Support is offered via the CLARIN-D Helpdesk. Manuals and how-to guides are available; there have also been training courses for EXAKT. The source code of the program is open source and accessible via GitHub. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a state-of-the-art corpus exploration program designed for parsed corpora such as ICE-GB and The Diachronic Corpus of Present-Day Spoken English. This is a commercial tool that works with ICE corpora that use a proprietary annotation scheme. A handbook is available. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a commercial product for analyzing word use. It can be used to study a single individual, groups of people over time, or all of social media. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a concordance programme. It is made available on a commercial basis. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This tool is part of a linguistic development environment, which includes functionality for text and corpus analysis. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is an open source version of Sketch Engine with certain functionality limitations (for instance, WordSketch is not available). CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a commercial software application for qualitative text and data analysis. CLARIN Centre: External |
|
Functionality: Parallel Concordancing/querying |
Language independent |
This is a parallel concordance programme for aligned source and target translation texts. It is a commercial tool. CLARIN Centre: External |
|
Functionality: Concordancing/querying, corpus building |
Language independent |
This is a system for managing, annotating, visualising and analysing spoken language corpora. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a simple concordancer. It is supposed to be used in exploratory analysis of XML-annotated corpora. Its primary feature lies in the automatic detection of XML tags and attributes. The search/concordancing function supports regular expressions. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding to terms are selectively labelled so that they don't overlap with other labels or points. CLARIN Centre: External |
|
Functionality: Concordancing/querying, corpus building |
Language independent |
This is a framework for generating custom web-based concordancers. It requires R and Rstudio/Shiny. A detailed setup tutorial is available. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This tool allows users to create word lists and search natural language text files for words, phrases, and patterns. The tool is a concordance and word listing program that is able to read texts written in many languages. There are built-in alphabets for English, French, German, Polish, Greek and Russian. The tool contains an alphabet editor which you can use to create alphabets for any other language. A help document is available. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a combination of an annotation and analysis tool for use with either simple XML files or basic plain-text files. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a free open source software application to analyze and process texts visually. Support is available. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a free smartphone app that allows users to analyze websites, tweet streams, and documents, and to explore the relationships between words in the text via an intuitive word cloud interface. It can generate graphs and statistics, and share the data and visualizations. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a simple programme for the analysis of texts. It reads plain text files (in different encodings) and HTML files (directly from the internet) and it produces word frequency lists and concordances from these files. This version includes a web-spider which reads as many pages as the researcher wants from a particular website and puts them in a TextSTAT-corpus. The new news-reader, too, puts news messages in a TextSTAT-readable corpus file. A quickstart guide, a user guide and video tutorial are available online. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a user-friendly corpus tool for English language teaching, linguistic analysis and self-tutoring based on the Lexical Priming theory of language. Online support is available. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is the SPAADIA concordancer (32-bit Windows version), a concordancer intended mainly for use with the SPAADIA corpus. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This tool employs lexicometry (see Scholz 2019) and text statistical analysis. It offers tools and methods tested in multiple branches of the humanities and is statistically well founded. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This tool offers a wide variety of tools for searching, studying, and analyzing texts. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is an integrated corpus tool with multilingual support for the study of language, literature, and translation. The latest version (3.2.0) of Wordless supports Windows 7/8/8.1/10/11, macOS 10.11 or later, and Ubuntu 16.04 or later, all 64-bit only. Both Intel-based and M1-based Macs are supported. The tool is available for download from GitHub. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This tool is capable of finding word patterns, and has functionalities for concordance, collocation, word lists and keywords. It is a commercial tool. There is a dedicated Google Group for this tool. CLARIN Centre: External |
|
Functionality: Concordancing/querying |
Language independent |
This is a simple concordancer. CLARIN Centre: External |