Axis 5 - Dictionaries, corpora, networks, linguistic change

Head: Svetlana Krylosova

Axis 5 "Dictionaries, corpora, networks. Linguistic changes" is CREE's linguistics axis. Its teacher-researchers work on a range of languages, including Bulgarian, Estonian, Finnish, Russian, Czech and Ukrainian.
In the period 2019-2023, the researchers in axis 5 will continue the work in sociolinguistics, corpus linguistics, terminology, lexicology and lexicography begun during the previous five-year period, while opening up new perspectives.

For project 5.1, an interdisciplinary team was created to launch a new line of research on the formal and computerized study of the Russian lexicon. The project 5.2 team continues its work on linguistic change and extends it to the sociolinguistic aspect of language change resulting from the concomitant use of standardized and non-standardized varieties by speakers in a linguistic community. Project 5.3 brings researchers together around the production of lexicographic tools and monolingual or bilingual corpora...

Thus, in addition to the theoretical objectives (reflections on semantics, language standardization, the process of linguistic modification), the researchers of axis 5 also set themselves the following descriptive objective: to create new lexicographic resources (dictionaries, lexical networks) and parallel corpora that could be used not only for linguistic study itself, but also as support for the teaching of our languages as well as for automatic language processing.
These long-term projects are made possible by partnerships with other Inalco research teams (CERMOM, ERTIM, Plidam), with national partners (ATILF CNRS, Université Lyon-2, Université Lille 3, Université Jean Moulin Lyon 3, Université de Toulouse 2, Université de Corse, Université de Rennes 1, Université du Littoral-Côte d'Opale) and with international partners (Sofia University Saint Clément d'Ohrid, Université de Montréal, International Sociolinguistic Society in Sofia (INSOLISO), Association franco-estonienne de lexicographie (AFEL, Tartu, Estonia), Estonian Language Institute (EKI, Tallinn, Estonia), Estonian Language Resource Centre (EKK, Tallinn), Université de Laval, Université de Milan, Université de Sherbrooke, Jožef Stefan Institute, DFKI - German Research Center for Artificial Intelligence).

As the Axis 5 projects are complementary, researchers from the three projects work together on the preparation of study days, colloquia and the training of young researchers.

Projects:

5.1. Explanatory and combinatorial lexicology of Russian according to the lexical network approach

Heads: Vincent Bénet (CREE, CEFR), Svetlana Krylosova (CREE)

The Lexicologie Explicative et Combinatoire du Russe selon l'approche des Réseaux lexicaux (LEC-ru) project aims to set up a team at Inalco specializing in the formal and computerized study of the Russian lexicon. The activities of the LEC-ru team have two main features. Firstly, they are situated within a theoretical and descriptive framework known as Explanatory and Combinatorial Lexicology, which enables lexical descriptions to be formalized rigorously, and which is part of a global approach to the study of language in all its functional modules: semantics, syntax, morphology and phonetics. This approach also places particular emphasis on the study and modeling of lexical phraseology (locutions and collocations). Secondly, the description of the Russian lexicon is based on the development of a large-scale lexical network (Russian Lexical Network, RL-ru). Such a network is a representation of the lexicon in which words are interconnected by a standardized system of explicitly described paradigmatic and syntagmatic relations. This representation structure is particularly well suited to pedagogical applications (teaching Russian as a foreign language, in particular) and to automatic language processing.

The project is part of the agreement signed between Inalco and the CNRS (Analyse et Traitement Informatique de la Langue Française laboratory), with the participation of the Sens-Texte linguistics observatory at the University of Montreal.

Researchers associated with the project: Elena Akborisova Da Silva (Plidam, Inalco), Vincent Bénet (CREE & CEFR), Angelina Biktchourina (CREE), Nikolaï Chepurnykh (ATILF CNRS, Université de Lorraine), Tomara Gotkova (ATILF CNRS, Université de Lorraine), Lidia Kolzoun (ATILF CNRS & CREE), Svetlana Krylosova (CREE), Polina Mikhel (ATILF CNRS, Université de Lorraine). With the participation of Lidia Iordanskaja (OLST, Université de Montréal), François Lareau (OLST, Université de Montréal), Igor Mel'čuk (OLST, Université de Montréal), Alain Polguère (ATILF CNRS, Université de Lorraine).

Main collaborations: ATILF CNRS, Université de Lorraine & OLST, Université de Montréal.

Research operations:

  • construction of a Russian Lexical Network RL-ru (lead: Svetlana Krylosova);
  • training of students in explanatory and combinatorial lexicology, co-supervision of internships and master's theses, co-direction of PhD theses (Vincent Bénet, Svetlana Krylosova);
  • establishment of national and international partnerships in lexicology and its applications (regular seminars and work sessions at Inalco, ATILF CNRS and OLST - University of Montreal) (lead: Svetlana Krylosova);
  • launch of a collaboration with Italian universities (Università Cattolica del Sacro Cuore in Milan, Università degli Studi di Bologna, Università degli Studi di Napoli "Parthenope", Università degli Studi di Perugia and Università degli Studi di Verona) in the field of comparative lexicology, phraseology and terminology (leaders: Svetlana Krylosova, Polina Mikhel & project 5.2 researcher Snejana Gadjeva);
  • implementation of a satellite project on the lexicographic description of chemistry terms from a multilingual perspective (Polina Mikhel);
  • implementation of a satellite project on the lexicographic description of Russian clausatives (Elena Akborisova, Nikolay Chepournykh, Tomara Gotkova, Lidia Kolzoun, Svetlana Krylosova, Polina Mikhel);
  • implementation of a satellite project on the lexicographic description of verbs of displacement (Svetlana Krylosova, Polina Mikhel, with the participation of Polina Voronova);
  • didactization of RL-ru lexicographic definitions to enable their use in the classroom (creation of course materials including didactized definitions, exercises, instructions for teachers) (Lidia Kolzoun, Svetlana Krylosova);
  • organization, as part of Inalco's doctoral training, of lectures by Igor Mel'čuk, University of Montreal (lead: Svetlana Krylosova, with the participation of the LEC-ru team);
  • international colloquium "Lexicon and the human body", June 2021 (scientific lead: Svetlana Krylosova, with the participation of the LEC-ru team as well as project 5.2 researchers Snejana Gadjeva & Gueorgui Armianov);
  • publication of a special issue of an international journal on lexicology (Svetlana Krylosova);
  • international colloquium on comparative lexicology and terminology in October 2023 (scientific leaders: Snejana Gadjeva (CREE), Claudio Grimaldi (University of Naples), Svetlana Krylosova (CREE), Maria Teresa Zanola (University of Milan); with the participation of the LEC-ru team as well as researchers from projects 5.2 and 5.3, Gueorgui Armianov & Kaja Dolar);
  • participation in doctoral seminars, study days and international colloquia, publication of articles;
  • support from the LEC-ru team in setting up the Ukrainian Lexical Network (ATILF CNRS) (Tomara Gotkova, Svetlana Krylosova & Polina Mikhel).

Key words: "Meaning-Text" theory; Explanatory and Combinatorial Lexicology; Russian language; Lexical Network; lexicology; phraseology; terminology; teaching / acquiring meaning.

5.2. Internal and external linguistic changes: similarities and divergence

Heads: Snejana Gadjeva (CREE), Gueorgui Armianov (CREE)

The project aims to bring together researchers who study changes in all areas of languages, associating them with evolutionary processes observed at given periods in the history of the societies that speak them. These linguistic changes are the result of various social and state mutations marked by a renunciation of past realities and an adoption of new forms of linguistic expression.

The originality of the project consists in the analysis of contemporary linguistic data, collected during field surveys or drawn from unpublished corpora. A sociolinguistic approach, both synchronic and diachronic, is used to explore the link between linguistic productions and the social contexts in which they occur. Particular attention is paid both to the interaction between standard language and non-standard varieties (sociolects, dialects), and to the interference between languages spoken in today's bilingual and plurilingual communities.

The variety of social contexts in which the linguistic practices studied manifest themselves makes it possible, on the one hand, to define specific external factors motivating linguistic change within a community. On the other hand, this diversity of situations makes it possible to cross-reference different linguistic data (Slavic, Balkan, Turkish, Arabic, Hebrew and Jewish languages) in order to demonstrate the existence of a dynamic of transformation common to several societies, otherwise attached to distinct geopolitical (European, Balkan, Mediterranean) and historical (post-Ottoman, post-Austro-Hungarian, post-Soviet) spaces.

Researchers associated with the project: Gueorgui Armianov (CREE), Oleg Chinkarouk (CREE), Snejana Gadjeva (CREE), Svetlana Krylosova (CREE), Ilona Sinzelle-Poňavičová (doctoral student CREE), Janna Vassilioutchek (CREE), Krassimira Aleksova (Sofia University Saint Clement of Ohrid), Dominique Samson (CREE).

Main collaborations: Université Jean Moulin Lyon 3, Sorbonne Université, CNRS, Université de Rennes 1, Université du Littoral-Côte d'Opale, Université de Sofia Saint Clément d'Ohrid, International Sociolinguistic Society in Sofia (INSOLISO).

Research operations:

  • participation in the organization of the international colloquium Lexique et corps humain (with colleagues from project 5.1; Snejana Gadjeva, Gueorgui Armianov);
  • publication of a thematic volume of the journal Slovo, 2022 (dir. Snejana Gadjeva & Svetlana Krylosova; with the participation of Gueorgui Armianov, Ilona Sinzelle-Poňavičová, Jeanna Vassilioutchek & Dominique Samson);
  • publication of a chapter on non-standard varieties: Sub-standard varieties, slang, argot and jargon, in: Neil Bramer (ed.), Oxford guide to the Slavonic languages, Chapter 7.5, Oxford University Press (Gueorgui Armianov);
  • doctoral seminar Standard and non-standard in the languages of medieval Europe, for doctoral and master's students (leader Gueorgui Armianov; with the participation of researchers from projects 5.1 and 5.3: Antoine Chalvin, Oleg Chinkarouk, Snejana Gadjeva, Svetlana Krylosova & Ilona Sinzelle-Poňavičová, as well as researchers from other Inalco teams (Plidam, Sedyl));
  • L3 Linguistics course seminar: Dynamics of language contacts (taught by Snejana Gadjeva);
  • Language Science Master's seminar: Processes and effects of language contacts (taught by Snejana Gadjeva);
  • annual participation in several collective areal or thematic seminars at Inalco: Linguistics of Balkan languages (Gueorgui Armianov, Snejana Gadjeva); Grammar issues in the Finno-Balto-Slavic area (Snejana Gadjeva, Oleg Chinkarouk, Jeanna Vassilioutchek); Methods in dialectology: theory and practice (Snejana Gadjeva); Texts. Linguistics. Translation (Svetlana Krylosova); History and social sciences: theories and methods (Snejana Gadjeva); Séminaire du Quai Branly (Snejana Gadjeva);
  • participation in international study days and colloquia;
  • participation in the CREE Debates (Gueorgui Armianov, Snejana Gadjeva);
  • participation in the ANR Atlas of the Balkan linguistic area project (Snejana Gadjeva);
  • field surveys among Bulgarian Turkish speakers, audio data collection (Snejana Gadjeva, in connection with the work of CREE Axis 4 researchers);
  • Bulgarian language translation expertise (Gueorgui Armianov, Snejana Gadjeva).

Key words: linguistic change; social change; sociolinguistic dynamics; standard / non-standard varieties; corpus linguistics.

5.3. Corpora, dictionaries and pedagogical tools

Head: Antoine Chalvin (CREE)

The aim of this project is to provide a space for cooperation for teacher-researchers who, within CREE, are working on the production of lexicographic, pedagogical or bilingual corpus tools, as well as on the analysis of these tools and the ways in which they are developed or used. The project fosters the exchange of experience, the dissemination of best practices and familiarization with the relevant IT tools for producing dictionaries and corpora. The project also focuses on new collaborative lexicographic practices (collaborative dictionaries such as Wiktionnaire), analyzing their specificity in relation to traditional lexicography (comparison of collaborative dictionaries with academic dictionaries using the resources offered by automatic language processing). Exchanges of experience and collective reflection will take the form of study days involving external researchers.

Researchers associated with the project: Gueorgui Armianov (CREE), Vincent Bénet (CREE & CEFR), Antoine Chalvin (CREE), Oleg Chinkarouk (CREE), Iryna Dmytrychyn (CREE), Kaja Dolar (CREE), Snejana Gadjeva (CREE), Madis Jürviste (Association franco-estonienne de lexicographie, Tartu), Svetlana Krylosova (CREE), Heete Sahkai (Estonian Language Institute, Tallinn), Natalya Shevchenko (CREE & CRTT, Lyon 2), Olena Saint-Joanis (ELLIAD, Univ. Franche-Comté), Anatoly Tokmakov (Inalco), Marie Stachowitsch (CREE), Janna Vassilioutchek (CREE), Marie Vrinat-Nikolov (CREE).

Main collaborations: Inalco's ERTIM team, Association franco-estonienne de lexicographie (AFEL, Tartu, Estonia), Estonian Language Institute (EKI, Tallinn, Estonia), Estonian Language Resource Center (EKK, Tallinn), Université Lille 3, Université Jean Moulin Lyon 3, Université de Toulouse 2, Université de Corse, Université de Laval, Université de Milan, Université de Sherbrooke, Jožef Stefan Institute, DFKI - German Research Center for Artificial Intelligence.

Research operations:

  • launch, within CREE, of the new international project Réseau de recherches sur les ressources lexicales collaboratives, focusing on metalexicographic issues that concern dictionaries, encyclopedias, corpora and other types of collaboratively constructed resources (Kaja Dolar, Antoine Chalvin & Marie Steffens);
  • organization of a study day on new lexicographic practices, 2021 (scientific leader Kaja Dolar);
  • creation and development of the open-access Ukrainian Module for NooJ (167,000-word electronic dictionary & morphological and syntactic grammars), 2018-2022 (Olena Saint-Joanis);
  • publication of a collection of articles on register marks in monolingual and bilingual dictionaries, 2020 (dir. Gueorgui Armianov);
  • publication of a Bulgarian language textbook, 2020 (Snejana Gadjeva & Marie Vrinat-Nikolov);
  • creation of audio resources in Bulgarian, 2017-2018 (Snejana Gadjeva & Gueorgui Armianov);
  • publication of the Grand dictionnaire estonien-français = Suur eesti-prantsuse sõnaraamat (GDEF), 2021 (dir. Antoine Chalvin);
  • publication of a textbook on the language of the Russian press, 2018 (Svetlana Krylosova, Marie Stachowitsch & Anatoly Tokmakov);
  • semantic labeling of the Russian electronic dictionary in the NooJ software: implementation of hierarchical semantic tags based on Tuzov's dictionary, 2017-2022 (Vincent Bénet);
  • continued work on the Dictionary of Bulgarian Verbs (Marie Vrinat-Nikolov, Gueorgui Armianov & Snejana Gadjeva);
  • continued work on the Dictionary of Bulgarian Nouns and Adverbs and Grammatical Tables, 2021 (dir. Gueorgui Armianov & Snejana Gadjeva).

Key words: lexicography, dictionaries, parallel corpora, collaborative lexical resources, semantic labeling, NooJ software, language learning, slang, Bulgarian, Estonian, Finnish, Russian, Ukrainian.