Generation or interpretation? - AI in debate

Study day organized jointly by Inalco's ERTIM laboratory, the Institut Ferdinand de Saussure and the La Reconstruction collective.

With alarmist and optimistic statements contradicting each other in the media, this study day aims to help identify the status of generative AI productions. To this end, it brings together the perspectives of corpus linguistics, theoretical computer science and semiotics.

The sciences of culture are primarily concerned by the productions of AIs, whether texts, images or music in particular. Yet their semiotic status depends on their mode of elaboration: automatic generation, even when based on corpora, differs radically from human creation, which is constrained by discourses, genres, contexts of genesis and anticipated interpretation. Could prompts take their place?

The question of meaning then arises as follows: is it reiterated projection or renewed invention? It concerns the sciences as much as the arts, as well as all sectors of culture.

In a despatialized and desymbolized society, what becomes of education, law and medicine?

Beyond the economic and ecological issues, pressing though they are, there is finally the question of adherence to algorithmic governmentality and to the model of society it induces, through a disturbing form of seduction peculiar to transhumanist currents.

Program

Morning

Session chair: Jean Lassègue

9:30am-9:45am: welcome and introduction by Damien Nouvel, director of ERTIM

9:45am-10:15am: Jean Rohmer - Is the computer scientist condemned to invent machines to imitate?

10:15am-10:45am: Mathieu Valette - Cultural variation and fragmentation of the symbolic environment

10:45am-11am: Discussion

11am-11:15am: Break

11:15am-11:45am: Damon Mayaffre - Idiolectal descriptions and Artificial Intelligence. What does deep learning tell us about texts?

11:45am-12:15pm: Laurent Vanni - Word embeddings, convolution and self-attention. A.I.: prediction, creation or description of texts?

12:15pm-12:30pm: Discussion

Afternoon

Session chair: Jean Rohmer

2pm-2:30pm: Jean Lassègue - The status of writing in AI, old and new formulas

2:30pm-3pm: Giuseppe Longo - The machine and the animal

3pm-3:15pm: Discussion

3:15pm-3:30pm: Break

3:30pm-4pm: François Pachet - Music generation: what exactly is the problem?

4pm-4:30pm: François Rastier - AI: neither servant, nor mistress?

4:30pm-4:45pm: Discussion

4:45pm-5pm: Break

5pm-6pm: Round table and general discussion, moderated by Maryvonne Holzem, Hélène Tessier and Santiago Guillén.

6pm: Cocktail

Summaries of papers

Jean ROHMER: Is the computer scientist condemned to invent machines to imitate?

Summary. - For a computer scientist, to program is to create. But can his creatures create in their turn? Computer science - a term little used these days - is the explosive encounter between an incredible stability and an equally incredible dynamics: the stability of the design of Von Neumann's machine, and the dynamics of the miniaturization of transistors, which draws this machine with ever finer strokes - strokes ten atoms wide within the next decade. In this way we are creating an unprecedented reticular material, with dimensions ranging from the atom of a silicon diamond to the diameter of a planet girdled with transatlantic cables: a digital sponge, eager to absorb everything it sees and hears, everything that is said and written. And what juice comes out if we squeeze the sponge? This is the question posed by deep learning. In the usual sense, a work is unique. But in the digital world identity does not exist: any configuration can be moved, duplicated and modified without delay and ad infinitum. What is a work of art seen by at most one person, or seen differently by each person, through his or her headset - or mask? The computer scientist, caught up in this vertigo, invents languages in an attempt to reduce the great gap between meaning and silicon. These languages are themselves works, sometimes masterpieces, but are they capable of representing the works of culture?

Bio-bibliography. - Jean Rohmer, an ENSIMAG engineer and Docteur ès Sciences, started out as a researcher at the Grenoble computer science laboratory, then at Inria, in the field of machine architecture, defending a doctor-engineer thesis on parallel machines and then a doctorat d'État thesis on "database machines". He went on to create and direct the Bull Group's research and commercial activities in Artificial Intelligence; his teams built some of the world's largest expert systems. He worked in particular on logic applied to databases and was one of the initiators of the "Datalog" discipline. He then developed the IDELIANCE software, first for the eponymous company and then for Thales, used in particular in the field of military intelligence. He has held a number of positions at the Pôle Léonard de Vinci and is now President of the Fredrik Bull Institute. He continues to develop IDELIANCE as a "literary calculation" tool.

Mathieu VALETTE: Cultural variation and fragmentation of the symbolic environment

Summary. - Although for some time to come they may be considered incommensurable with human cultural creations, the new non-human pseudo-cultural artefacts resulting from deep learning methods are unprecedented in that they can be produced on an industrial scale and may soon outnumber human creations.

These pseudo-cultural objects, to which we can add their social counterparts (artificial profiles, bots, human communities whose cohesion is ensured by algorithms, communities of hybrid AI and human profiles, etc.), are part of a chimerization of human culture. If it is possible to generate every possible pseudo-cultural object, and if it is possible to confine populations to communities, will we be moving towards a problem of shared cultural goods (commons and products) rather than a shared common culture?

Cultural fragmentation, much more than the generation of malicious content (fake news, industrialized trolls), would indeed appear to be one of the main visible dangers of generative AI - not because it carries such principles in its DNA, but because of the convergence of the scientific and economic agendas presiding over its rise. Both are based on the desire to satisfy all individuals perceived as users (or customers). A review of the scientific literature on automatic language processing shows that the current challenge is not so much generative performance as the diversification of the offer, to bring content into cultural conformity with target users (CultureLLM, perspectivist annotation, the impact of the DEI - Diversity, Equity and Inclusion - professional framework, etc.). For, contrary to popular belief, generative AI does not free itself from the decisive step of pre-processing input data, which conditions outputs despite the opacity of processing.

According to this reading, the danger of AI generation would not be to compete with human cultural expression but to attack social organization. It leads us to consider the problem in terms of cognitive encirclement, and places it within the field of "cognitive warfare", i.e. the study of offensives designed to irrevocably transform the way populations think in order to weaken them or make them susceptible to new ideas.

Bibliography.

  • Garapon A., Lassègue J. (2021) Le numérique contre le politique. Paris: Presses Universitaires de France.
  • Harbulot, Chr. (2024) La légitimité civile de la guerre cognitive, Ingénierie cognitique, 7-1, special issue Guerre cognitive, London: ISTE OpenScience, 13-16.
  • Rastier F. (2025) L'AI m'a tué. Témoignage d'un monde alternatif, Paris: Intervalles, in press.
  • Valette, M. (2024) Guerre cognitive, culture et récit national, Ingénierie cognitique, 7-1, special issue Guerre cognitive, London: ISTE OpenScience, 6-12.

Mathieu Valette is a linguist and professor at Inalco. His work focuses in particular on the relationship between linguistics and automatic language processing and corpus semantics.

Damon MAYAFFRE: Idiolectal descriptions and Artificial Intelligence. What does deep learning tell us about texts?

Summary. - The skeptics have been proved wrong: textual artificial intelligence, whose best-known achievement is ChatGPT, has successfully invaded our lives and our laboratories.

However, the machine has neither intelligence nor ethics. The avatexts it produces are not based on any truth predicate and can lay claim neither to the good, nor the beautiful, nor the evil. Moreover, in the absence of any intention on the machine's part other than stochasticity, the reader cannot engage in a classical interpretive journey through these textual forgeries, which are generated, not created.

Our questions focus on understanding how AIs work, a prerequisite for assessing the heuristic added value that deep learning treatments can bring to the analysis of textual corpora: the interpretability/explainability of models is the essential question and the prerequisite for any scientific (vs. commercial) use of AI. In other words, AI, more than any other automatic processing, "presupposes a hermeneutics of software outputs" (F. Rastier, La mesure et le grain, Champion, 2011: 43).

We will argue that convolutional models (CNNs) have the power to account for the syntagmatic axis, i.e. they bring out the salient combinations along the text string, whereas transformer models have the power to account for the paradigmatic axis, i.e. they identify the selections or "associative relationships" (Le Cours, Chapter V, pp. 170-175 of the 1972 ed.) of corpus texts. In both cases, and in a firmly complementary way, we are calling for an effort of co(n)textualization - the word in syntagmatic relation with its immediate co-text, the word in association with its paradigmatic kin in memory or in the corpus - for a semantics that is not formal but corpus-based.
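To make the contrast concrete, here is a minimal sketch (PyTorch assumed; a toy illustration, not the models used by the authors): a 1D convolution combines each word with its immediate neighbours along the string, while self-attention weights relate every word to every other position, whatever the distance.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size, embed_dim, seq_len = 1000, 64, 12
tokens = torch.randint(0, vocab_size, (1, seq_len))    # one toy "sentence"
embed = nn.Embedding(vocab_size, embed_dim)
x = embed(tokens)                                      # (1, seq_len, embed_dim)

# Syntagmatic axis: the convolution combines each word with its
# immediate co-text (here a window of 3 adjacent positions).
conv = nn.Conv1d(embed_dim, embed_dim, kernel_size=3, padding=1)
syntagmatic = conv(x.transpose(1, 2)).transpose(1, 2)  # (1, seq_len, embed_dim)

# Paradigmatic axis: self-attention weights relate every word to
# every other, whatever the distance on the string.
attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
paradigmatic, weights = attn(x, x, x)                  # weights: (1, seq_len, seq_len)

print(syntagmatic.shape, paradigmatic.shape, weights.shape)
```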

Damon Mayaffre holds a doctorate in history and is Chargé de Recherche at the CNRS in the language sciences laboratory in Nice (CNRS - Université Côte d'Azur - UMR 7320 Bases, Corpus, Langages).

Laurent VANNI: Word embeddings, convolution and self-attention. A.I.: prediction, creation or description of texts?

Summary. - The nature of A.I. machine outputs must be questioned with regard to the algorithms mobilized. Deep neural networks rest above all on (hyper)parameters that stabilize results (by learning) according to a predetermined (supervised) task. Generalizing this task is a well-known problem, unsolved by deep learning. The prediction of one word after a prompt, and then, step by step, of a whole text, should therefore not be taken for the production of an answer, but rather for an illusion credible enough to be studied by the humanities.

We therefore need to get down to the level of the intermediate layers of deep neural networks. While some A.I. (hyper)parameters represent combinatorics incomprehensible to humans, others shape interpretable representations of texts (Vanni et al. 2018). A.I. can thus slide from a predictive task to an exploratory, descriptive task on texts. Today, with a Transformer-like architecture (Vanni et al. 2024), we propose a way to analyze training corpora and glimpse the statistical phenomena at play in texts that produce the illusion of GPT-style generative creativity.
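As a minimal sketch of this slide from prediction to description (PyTorch assumed; a toy classifier, not the MCT architecture of Vanni et al. 2024), a forward hook can read off an intermediate layer's representation of a text while the network carries out its supervised task:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(
    nn.Embedding(1000, 32),   # word embeddings
    nn.Linear(32, 16),        # intermediate layer whose outputs we inspect
    nn.ReLU(),
    nn.Linear(16, 2),         # supervised prediction head (e.g. 2 classes)
)

captured = {}
def keep_hidden(module, inputs, output):
    captured["hidden"] = output.detach()   # store the intermediate representation

model[1].register_forward_hook(keep_hidden)

tokens = torch.randint(0, 1000, (1, 10))   # a toy tokenized text
prediction = model(tokens)                 # the predictive use of the network
print(prediction.shape)                    # (1, 10, 2)
print(captured["hidden"].shape)            # (1, 10, 16): material for description
```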

References:

  • Mayaffre D. and L. Vanni (eds) (2021). L'intelligence artificielle des textes: des algorithmes à l'interprétation, Paris: Champion.
  • Vanni L., Mahmoudi H., Longrée D. and Mayaffre D. (2024, in press). Intertextuality detection using Multi-channel Convolutional Transformer (MCT), in G. Giordano, M. Misuraca (eds), New Frontiers in Textual Data Analysis, Springer.
  • Vanni L., Ducoffe M., Mayaffre D., Precioso F., Longrée D. et al. (2018). Textual Deconvolution Saliency (TDS): a deep tool box for linguistic analysis. 56th Annual Meeting of the Association for Computational Linguistics, July 2018, Melbourne. ⟨hal-01804310⟩

Laurent Vanni holds a doctorate in computer science and is a Research Engineer at the CNRS in the language sciences laboratory in Nice (CNRS - Université Côte d'Azur - UMR 7320 Bases, Corpus, Langages).

François PACHET: Music generation: what exactly is the problem?

Summary. - Since its inception, AI has been interested in music generation. We will review recent advances in the field, attempting to define the problem and arguing that, in a way, the problem of generation, seen as the automatic exploration of a style defined by a learning set, can be considered solved. Nevertheless, other problems arise, more difficult and much more interesting because they require new conceptualizations of our relationship to taste. On the one hand, the definition of an "intrinsic quality" of generated music is vague and generally confused with the quality of the generation algorithms, which makes evaluating these algorithms difficult. On the other hand, we will show that popularity data cannot be used as "ground truth" to realize the dream of hit generation. Finally, we will show how the phenomenon of appropriation in the use of AI technologies is at least as important as that of intrinsic quality. We will illustrate these points with various experiments and results obtained over the last ten years.
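As a toy illustration of generation seen as the exploration of a style defined by a learning set (in the spirit of the Markovian techniques mentioned in the biography below; the note sequence is invented for the example):

```python
import random

random.seed(1)
melody = "C D E C C D E C E F G E F G".split()   # toy "learning set"

# Learn the style: a first-order transition table (which notes follow which).
table = {}
for a, b in zip(melody, melody[1:]):
    table.setdefault(a, []).append(b)

# Generate: a random walk through the learned transitions explores the style.
note, generated = melody[0], [melody[0]]
for _ in range(12):
    note = random.choice(table.get(note, melody))
    generated.append(note)
print(" ".join(generated))
```

Every sequence sampled this way stays inside the learned style; what such a sketch cannot decide is precisely the question of the talk: whether any given output is intrinsically good.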

Biography - François Pachet was director of the Spotify Creator Technology Research Lab team, whose aim was to develop new tools to aid music creation. Previously, he created the Flow Records label, which produced and released the double album "Hello World", the first pop music album composed with AI. This album is the result of a collaboration between Benoit Carré, alias SKYGGE, and numerous musicians, based on technologies developed during the ERC project "Flow Machines". The project also produced a Beatles pastiche, "Daddy's Car", generated using constrained Markovian generation techniques.

Before Spotify, François Pachet was director of the SONY Computer Science Laboratory in Paris, where he created the music team, still active today. He is also a guitarist (classical, jazz) and has composed and published several albums. His augmented book "Histoire d'une oreille" attempts to describe the ontogeny of a musical ear through a series of singular listening experiences.

Jean LASSÈGUE: The status of writing in AI, old and new formulas

Summary. - I would like to attempt to describe AI in its old and new formulas on the basis of an enigmatic phrase contained in Turing's 1950 article: "For me, mechanism and writing are almost synonymous." The essentially graphical character of AI will be placed in its historical context, and I will attempt to see what becomes of writing in the new paradigm of neural networks.

Bio-bibliography. - Articulating epistemology, anthropology and history, Jean Lassègue's research focuses on symbolic mediations, in particular the writing systems of languages and numbers, and their consequences on the elaboration of knowledge, from the exact sciences to the legal sciences. As a philosopher, he examines the anthropological conditions of culture. His publications include Turing (Les Belles Lettres, 2003), Cassirer, du transcendantal au sémiotique (Vrin, 2016), with Antoine Garapon Justice digitale (PUF, 2018) and Le numérique contre le politique (PUF, 2021) as well as a book written with Giuseppe Longo to be published by PUF in January 2025 on the controversial status of the digital.

Jean Lassègue is a philosopher, director of research at the CNRS, director of the Centre Georg Simmel - Recherches franco-allemandes en sciences sociales (CNRS-EHESS).

Giuseppe LONGO: The machine and the animal

Summary. - As Yann LeCun and the leading mathematicians working on the new AI have explained very well (see the debate at the Collège de France: Work in the 21st century: Law, techniques, ecumene), the work they are doing is based almost entirely on "searches for optima". These methods are largely borrowed from mathematical physics (wavelets, renormalization, etc.) and have been enormously enriched by the talents of a vast number of researchers around the world. Generative AI, such as Large Language Models (LLMs, ChatGPT and the like), also uses various theories of random graphs, based on specific optimality methods (optimized trade-offs, maximal coupling...) combined with statistics. Digital applications range from the analysis and administration of Internet networks to the most recent AI applications, including LLMs. The results can be quite unpredictable, like most emergent phenomena in physics. We will outline the difference between the unpredictability of the many emergent phenomena described in inert matter and the production of novelty in the living state of matter. The former, produced on the basis of maximality or probability criteria (the greatest number of connections, the shortest paths in a graph... sometimes with a little randomness added from outside), force us to follow averages or emerge in accordance with mean fields. We will leave it to the audience to discuss the possible comparison with animal (or human) cognition.
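As a minimal illustration of the graph-theoretic criteria evoked here (the networkx library and its Erdős-Rényi generator are assumed; a toy example, not the models discussed in the talk):

```python
import networkx as nx

# A small Erdős-Rényi random graph stands in for the network models evoked.
g = nx.erdos_renyi_graph(n=100, p=0.08, seed=42)

# Mean-field quantity: the average number of connections per node.
degrees = [d for _, d in g.degree()]
print("mean degree:", sum(degrees) / len(degrees))

# Optimality criterion: average length of shortest paths between node pairs.
if nx.is_connected(g):
    print("mean shortest path:", nx.average_shortest_path_length(g))
```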

  • Longo G. (2023), Le cauchemar de Prométhée. Les sciences et leurs limites, Preface by Jean Lassègue, afterword by Alain Supiot. PUF, Paris.
  • Lassègue J., Longo G. (2024). L'alphabet de l'esprit. Critique de la raison numérique. PUF, Paris, forthcoming.

Biography - Giuseppe LONGO is Director of Research Emeritus (DRE) at the CNRS, attached to the Cavaillès interdisciplinary centre at the École Normale Supérieure (ENS) in Paris. A member of the ENS Mathematics and Computer Science departments, he was previously Professor of Mathematical Logic and Computer Science at the University of Pisa. He spent three years as a visiting researcher and professor in the USA (successively at Berkeley, MIT and Carnegie Mellon) and several months in Oxford (UK) and Utrecht (NL). From 1990 to 2015 he founded and edited a major Cambridge University Press scientific journal, Mathematical Structures in Computer Science. He also founded two epistemology collections oriented towards mathematics and the natural sciences. He has co-authored around a hundred articles and five books, many in collaboration with physicists, biologists and philosophers. His current project is to develop an epistemology of new interfaces, exploring alternatives to the new alliance between computational formalisms and the governance of man and nature by algorithms and "optimality" methods that claim to be objective.

Centre Cavaillès, République des Savoirs, CNRS and École Normale Supérieure, Paris.


François RASTIER: AI: neither servant, nor mistress?

Summary. - In the mystifying and undoubtedly mythical discourse that surrounds it, Artificial Intelligence brings together three major themes of our present: post-truth, as it contributes to generating and propagating fakes - texts, voices and images - to create a fake world; the obscure data that constitute the learning corpora of connectionist AI systems; and finally, through a simulation, the transhumanist project of a fusion between brain and machine, "natural" intelligence and artificial "intelligence".

Yet data are not, or should not be, datasets covered by industrial secrecy, but what we give ourselves - such, at least, should be the case in corpus linguistics and in the cultural sciences as a whole. Indeed, the work of analysis always begins with the qualification of the data as such, if only through the qualification and delimitation of the corpus of study. No self-evident fact presides over this qualification: thus, the list of words absent from a text, obtained by comparison with other texts in the same corpus, is data of prime importance, and yet it appears to no one on first reading and corresponds to no character string that can be isolated by a word-cruncher or become a token.
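A minimal sketch of the comparison described here (the toy corpus and whitespace tokenization are illustrative assumptions): the "absent words" of each text are those attested elsewhere in the corpus but missing from the text itself.

```python
# Toy corpus; whitespace tokenization stands in for real pre-processing.
corpus = {
    "text_a": "the sea was calm and the sky was clear",
    "text_b": "the storm broke over the sea at night",
    "text_c": "calm night sky over a clear sea",
}

def vocabulary(text):
    return set(text.lower().split())

for name, text in corpus.items():
    rest = set().union(*(vocabulary(t) for n, t in corpus.items() if n != name))
    absent = rest - vocabulary(text)    # attested elsewhere, missing here
    print(name, sorted(absent))
```

No tokenizer run over the text alone could produce this list: it exists only through the comparison with the rest of the corpus, which is the point of the argument.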

In other words, the datasets of large generative AI systems can in no way replace corpora, nor even enable them to be analyzed. On the other hand, corpus-based learning systems are perfectly capable of describing local regularities. However, we still need to tackle the problem of the explainability of results, which obviously conditions their improvement.

Using linguistics as an example, the talk will address questions such as:

  • Data versus corpora? Artificial texts: what is their hermeneutic regime?
  • Equipped science or artificial knowledge? Does AI serve to objectify this world or substitute another?

Although the brilliant technologies of generative AI are being exploited and biased to their advantage by the big Internet companies and various even more dangerous players, they can be put to good use by the sciences of culture, provided generation is distinguished from interpretation on the one hand, and from creation on the other.

Main references: special issue of Langages, "Sémantique et intelligence artificielle" (1987); Sémantique et recherches cognitives (PUF, 1991); L'analyse sémantique des données textuelles (ed., Didier, 1995); La Mesure et le Grain. Sémantique de corpus (Champion, 2011); "Data vs Corpora", in Mayaffre and Vanni, eds, L'intelligence artificielle des textes (Champion, 2021); L'AI m'a tué. Témoignage d'un monde alternatif (Intervalles, Paris, forthcoming).

Biography. - François Rastier, Honorary Research Director at the CNRS, is associated with the Texte, Informatique, Multilinguisme research team at Inalco (Paris). He is president of the Institut Ferdinand de Saussure and co-director of its La Reconstruction program. He began research at the CNRS in a Semantics and AI position, at the Laboratoire d'Informatique et de mécanique pour les sciences de l'ingénieur (Orsay, 1983-1993), then specialized in corpus semantics.

Round table and general discussion, with the participation of Maryvonne Holzem, linguist (University of Rouen); Hélène Tessier, jurist and psychoanalyst (Saint Paul University, Ottawa); and Santiago Guillén, semiotician (ICAR laboratory, University of Lyon 2).