Despite the rise in commercial applications of Large Language Models (LLMs), scholarship has neglected an in-depth appreciation of the free contribution of subjects' communicative social action as the engine of training data production and as a necessary moment in digital processes of valorisation. This issue was prominent in the post-operaist tradition's analyses of free labour, but has since been missing from examinations of more recent technological developments, specifically those concerning AI. Although the work of Baudrillard is semi-frequently invoked in descriptive critical assessments of new technologies, there is little integration of his work in contemporary studies of AI. This paper aims to contribute in this direction by showcasing the utility of Baudrillard's concepts of simulation, subject function, masses, and the social for an understanding of immaterial free labour in the context of LLMs. Drawing on the recent phenomenon of the sale of Reddit communications content to OpenAI as training data, I propose the notion of the digital common as the pre-trained, collected and recorded data of actual human communication through digital systems. I propose the framework of the subject function as expounded by Baudrillard, in both its individual and collective aspects, as a necessary conjuncture for understanding how commercial applications of conversational LLMs fit into the broader landscape of digital political economy. I suggest that the role played in this specific application derives from the appropriation of freely generated user data as constituting the digital common and as carrying a specific conception of subjectivity as functional.