TY - JOUR
T1 - A multimodal grammar of artificial intelligence: Measuring the gains and losses in generative AI
AU - Cope, Bill
AU - Kalantzis, Mary
PY - 2024/06//
Y1 - 2024/06//
N2 - This paper analyzes the scope of Artificial Intelligence (AI) from the perspective of a multimodal grammar. Its focal point is Generative AI, a technology that puts so-called Large Language Models to work. The first part of the paper analyzes Generative AI, based as it is on the statistical probability of one token (a word or part of a word) following another. If the relation of tokens is meaningful, this is circumstantial and no more, because its mechanisms of statistical analysis eschew any theory of meaning. This is the case not only for the written text that Generative AI leverages, but by extension image and multimodal forms of meaning that it can generate. The AI can only work with non-textual forms of meaning after applying language labels, and to that extent is captive not only to the limits of probabilistic statistics but the limits of written language as well. While acknowledging gains arising from the brute statistical power of Generative AI, in its second part the paper goes on to map what is lost in its statistical and text-bound approaches to multimodal meaning-making. Our measure of these gains and losses is guided by the concept of grammar, defined here as a theory of the elemental patterns of meaning in the world—not just written text and speech, but also image, space, object, body, and sound. Ironically, a good deal of what is lost by Generative AI is computable. The third and final part of the paper briefly discusses educational applications of Generative AI. Given both its power and intrinsic limitations, we have been experimenting with the application of Generative AI in educational settings and the ways it might be put to pedagogical use. How does a grammatical analysis help us to identify the scope of worthwhile application? Finally, if more of human experience is computable than can be captured in text-bound AI, how might it be possible at the level of code to create a synthesis in which grammatical and multimodal approaches complement Generative AI?
U2 - 10.1177/26349795231221699
DO - 10.1177/26349795231221699
M3 - Article
SN - 2634-9795
VL - 4
SP - 123
EP - 152
JO - Multimodality & Society
JF - Multimodality & Society
IS - 2
ER -