Alignment of lexicographic resources, contributions to NLP tasks, development of multilingual teaching materials, and TEI Lex-0 encoding
ELEXIS – European Lexicographic Infrastructure
Reference: GA 731015; DOI 10.3030/731015
Funding body: European Union’s Horizon 2020
Coordination: Simon Krek (Jožef Stefan Institute in Ljubljana, Slovenia)
Position: Research fellow
Period: 1 February 2018 – 31 July 2022
Funding: €4,999,967.50
Website: https://cordis.europa.eu/project/id/731015/reporting
This European project aimed to develop a shared infrastructure for digital lexicography, fostering interoperability and the integration of lexicographic resources at the European level.
Main responsibilities and tasks carried out:
- Development of lexicographic resource alignment tagging:
  Creation of strategies for aligning different dictionaries, focusing on the Academy’s Dictionary and the Dicionário Aberto;
  Implementation of methodologies to ensure standardisation and interconnection of resources.
- Contribution to WP3 (Lexical Data for NLP):
  Development of tasks on sense disambiguation and linking of lexical entities, essential for lexico-semantic analyses in NLP applications;
  Use of computational techniques to improve accuracy in the identification and categorisation of lexical senses.
- Contribution to WP5 (Training and Education):
  Creation of open-source multilingual teaching materials;
  Development of educational resources ensuring the project’s sustainability.
- Encoding of lexicographic entries according to TEI guidelines:
  Adaptation of lexical entries to the TEI Lex-0 format, ensuring compliance with international encoding standards and facilitating integration into digital platforms;
  Guarantee of quality and accuracy of lexicographic data, while respecting the specificities of each resource.
- Participation in regular team meetings:
  Active collaboration in defining strategies and monitoring project activities;
  Contribution to the planning and implementation of development and validation stages for lexicographic resources.