Careful language, machine utterance
Tricked by AIs, I begin to consider a human strength that doesn't feel as unique as it once was.
In the midst of the confusion of machine and human voices, the common element is language—maybe its debasement and, more certainly, its definition.
Looking back at this post, which I first created in my Substack “Dashboard” back in April, I see traces of an attempt. I wanted to define and clarify language in light of utterances from AIs and Large Language Models, but with each draft I felt my own language struggle—debased by complexity and oh-too-monstrous scale. What I started in April got larded and layered—so much so, in fact, that one night I decided it was better to dismember my inflating and lumbering draft, separate its innards, and see if they might go their own clearer and better ways.
So, this post might end up being first in a series of indeterminate length. Three or four posts, maybe? Appearing occasionally (maybe) over a few months?
We’re back in the seminar room now, the fall semester having kicked off in late August. It is the season of “our complex relationships with technology,” the seminar topic. It’s also the season of more developments in language models or, perhaps more accurately, the way that language models manifest in the world.
Now they talk. Quite fluently, too.
ICYMI, when ChatGPT got a flirty human voice
Two examples have come up that I think explore this recent wrinkle on the technology: The first is Evan Ratliff’s exceptional podcast Shell Game that appeared in July and concluded its first season (hopefully first of many) in August. “Shell Game is a new show, about things that are not what they appear,” Ratliff said in his initial post. “For the first season, I’ve spent the last six months finding out what would happen if I made a digital replica of myself and turned it loose in the world.” He created a “voice clone” of himself and placed it into situations where he would have found himself. The results are revealing, sometimes humorous and sometimes not so humorous.1
It’s a great podcast exploring a pressing topic.
The second example is the audio feature that was very recently added to Google’s experimental NotebookLM, which focuses Google’s LLM (Gemini 1.5) on documents you put into the system. NotebookLM allows you to interact with the documents in a chatbot-like manner. The new audio feature lets you enlist the LLM to talk about “sources” you’ve uploaded: in just a few minutes, the tool creates a podcast-like discussion of them. The audio products I’ve created in NotebookLM were easy listening, even one based on an essay about Heidegger (click to hear it below), though their sophistication and depth probably top out at roughly Malcolm Gladwell levels. You’ll hear typical features of human-voiced podcasts: banter, conversational exchanges, human-like intonations, ums and uh-huhs. The examples I made even seemed to trace a narrative arc. The tool is impressive … and maybe a little scary.2
A NotebookLM-created “podcast” on Joshua Rothman’s “Is Heidegger Contaminated by Nazism?” The New Yorker, April 28, 2014. https://www.newyorker.com/books/page-turner/is-heidegger-contaminated-by-nazism.
It’s safe to say that everyone, except for the delusional or the very nerdy, has been amazed by the way that current LLMs have simulated humanlike language. Sure, they have limitations and (often gross) imprecision, but their statistical mechanisms and neural networks impersonate so well that it’s hard to label LLM results as something other than human.
You might be tempted to say that the utterances from human mouths and lips are similarly statistical, probabilistic. You might even call that a reality that LLMs have revealed. That is: humans are machine-like. (You might be tempted, but you’d be wrong.)
It’s probably safe to assume that AI voices will entirely take over the space where phone menu systems once reigned—a technological replacement of an older annoying technology. If you’ve called a drugstore, you’ve probably already experienced the pleasant voices that also seem to have ears to hear, if a bit laggy from “latency.” At least for a while, call centers will house machine voices, too, alongside real human voices from real people occupying rows of cubicles with phones and computer displays. Soon, nearly every bureaucratic system will adopt the robots, so that we’ll interact with their forms, processes, and policies as if we’re interacting with humans. The robots will seem as empathetic as they will certainly be unyielding, and they will be the unnatural embodiment of a rigid capitalist structure even as they prove cheaper and more compliant than their human predecessors.
When the auditory space separating human voices from machine mimicry narrows so as to be indiscernible, perhaps we’ll discover other means to guess when we’re talking with a clone and when we’re talking with a human. (Evan Ratliff implicitly tests that in his podcast, by the way.)
A nag of eloquent machines
Okay, they’re not really eloquent machines, at this point at least. But they have gotten better very quickly, and they seem to be improving in their ability to mimic human speech in lots of languages, too. Do they write good fiction? Nope. Can they draft a serviceable office memo or an email? Yup. Lately, sniffing out LLM prose in a text is similar to smelling someone’s B.O. As The Atlantic’s Caroline Mimbs Nyce pointed out, “Did a chatbot write this? is not a compliment.”
It turns out that LLM detection, unlike B.O. detection, involves some hallucination on the human side. People (teachers, for instance) can’t reliably tell when an LLM is doing the homework, and AI-powered detection apps are hardly more reliable judges of prose. So, rather than serving as a straightforward negative judgment of writing quality, Did a chatbot write this? more clearly signals deep mistrust. Chatbot prose poisons a channel of human-to-human communication, so that even human language ends up suspect.3
The improvement of LLM utterances has had a different reception from, say, improvements in home appliances like vacuum cleaners. While a new and improved vacuum cleaner might be praised for its efficiency, LLMs speaking human language elicit at best cautious affirmation or, just as often, hostility (not counting mindless hype and awe, of course). For the LLMs, recently endowed with humanlike speech, nudge the edges of humanity, besides posing a threat to livelihoods. They gradually force a renegotiation of identity in ways that mere tools do not, and in doing so they may reframe privileges that humans have assumed because of human language and intelligent awareness.
A typical response: denial of the challenge of AI. After all, the machines are “stochastic parrots” (Emily Bender) that mimic but do not express themselves because they have no selves to express. They rely on probabilistic models to order words into seeming speech. That is true. Or: machines can’t make art and are not creative, which may be true.
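To make “probabilistic” concrete, here is a minimal sketch of what ordering words by probability means: at each step, a model samples the next word from a distribution conditioned on what came before. The toy vocabulary and the numbers below are invented for illustration; a real LLM computes such a distribution with a neural network over an enormous vocabulary, conditioned on the whole preceding context.

```python
import random

# Toy bigram "language model": for each word, a made-up probability
# distribution over possible next words. (Illustrative only; a real LLM
# scores tens of thousands of tokens at every step.)
NEXT_WORD = {
    "<start>":  {"the": 0.6, "machines": 0.4},
    "the":      {"machines": 0.5, "parrot": 0.5},
    "machines": {"speak": 0.7, "mimic": 0.3},
    "parrot":   {"mimics": 1.0},
    "speak":    {"fluently": 0.8, "<end>": 0.2},
    "mimic":    {"us": 1.0},
    "mimics":   {"us": 1.0},
    "fluently": {"<end>": 1.0},
    "us":       {"<end>": 1.0},
}

def sample_sentence() -> str:
    """Order words into seeming speech by repeatedly sampling the next word."""
    word, words = "<start>", []
    while True:
        options = NEXT_WORD[word]
        # Draw the next word in proportion to its assigned probability.
        word = random.choices(list(options), weights=list(options.values()))[0]
        if word == "<end>":
            return " ".join(words)
        words.append(word)

print(sample_sentence())  # e.g., "the machines speak fluently"
```

Fluency, on this view, is just well-tuned probabilities at a vast scale; no self is required.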
And yet, machines now impersonate human language and fabricate “artistic” imagery at least well enough to surprise us. In doing so they challenge conventional foundations of what it means to be uniquely human. That is the kind of dislocation that makes us insecure and mistrustful.
Ratliff’s Shell Game podcast explores this insecurity particularly well in one section where a law student named Isaiah converses with “AI Evan,” perhaps without remembering that he’s talking with a machine (c. 1:48 in Episode 6, “The Future Isn’t Real”).
AI Evan asks Isaiah, “How do you feel about AI's potential in the legal field?”
“I would hope that I didn't just go into hundreds of thousands of dollars of debt and spend all this time and energy—and I'm currently studying for the bar—uh, to get into a job that could be done by a robot,” Isaiah answers. “I would hope that we will be continually able to distinguish between work done by robots and work done by humans. Uh, I, I don't know. There’s something, um, simultaneously very cynical and sad and also very just kind of eerie and scary about a world where the overlap is complete, and it’s impossible to distinguish” (my emphasis).
The Real Evan Ratliff distilled Isaiah’s concern: “It was eerie. It was scary. Not just the possible consequences, but the idea that you could travel through the world not knowing if you were talking to real humans or not in any given moment.”
Arguments about art-producing machines have mainly flitted around what art is and whether machines have the wherewithal to do whatever a specific definition of art requires. There’s some argument that human artists can wield AI tools and “make art,” but also much well-grounded panic about whether AI can replace artists. Or, at least, make their hard-up living even harder. (Most companies hiring artists or deploying AI in their place are likely to avoid questions of what art is, since their interest in an artistic product is practical and even testable: “good enough” is just fine so long as it works in the marketplace.)
ICYMI: Mistaking a tech product for a replacement of artistic tools
Can we get beyond AI hype and second-guessing?
I had coffee with a colleague this summer. The place was quiet, in part because of its location in a library and in part because summer on campuses is less frenetic. She is a professor of engineering, specializing in AI and in questions of how to evaluate models, but she’s also a broader thinker in her work than many academics I’ve encountered.
“One of the things that might happen with AI performing language,” she said as we finished our coffee, “is that we’ll begin to think differently about what intelligence and ‘consciousness’ really mean.” Assumptions about language have tethered intelligence and “consciousness” to expression, artistic or not. Now something not conscious and, despite its “AI” label, not intelligent returns some measure of intelligible utterance.
The tether between intelligence and expression has slackened.
I suspect it’s simplistic to see AI as only subsuming or overwhelming humanity, as if intelligence were a summit to be fought over and occupied. As my colleague pointed out, the “challenge” of the emergence of a different kind of “consciousness,” a surreal receptiveness, a “mirroring”—something manifest through mimicry and automated fakery—may open a way of discerning human promise and talent anew. Somewhere along the way, could AI prod humans to reckon with the inner lives of other creatures?
It could be a stretch, but might reckoning with AI inadvertently help us humans see ourselves as part of a “conscious” nature instead of seeing nature as something separate, distinguished by uniquely human consciousness and human intelligence?
What is it about language and expression that makes them part of being human, especially if machines’ semblance of language confuses matters? In future posts in this series, I want to explore facets of a response to that question. Right now, I’m thinking about these topics:
The particularity of human language—how words and human expression form a fabric with concrete experience, with things—and how machines do not do that. Yet, at least. And how that particularity of language might be endangered or diminished.
How tribes bound experience (a restriction, surely) while also enabling a clearer definition of unifying attributes. The notion of a tribe underlies many appeals to origins or authorship of expression: “Only human beings can make art.” It also underlies enforcing the labeling of AI products and defining AI counterfeits. In some ways, tribe is the flip side of particularity, since tribe calls forth commonalities, while particularity emphasizes the individual.
How time figures in. I confess that time is still a slippery topic for me, but it feels like an important marker and distinguishing feature of human language and expression. Aside from timestamps and the marking of time spent, computers don’t follow time.
These are general topics for a series of essays on language and, more broadly, forms of expression after AI. I’ll stab at them, so they certainly won’t appear one after another. I may find that they’re dry holes (or just topics too dry for readers’ minds).
I do think we need to back off trying to assess machine utterance in the terms used to describe human language or expression. That approach presupposes too much and sets exploration on a misguided trajectory. Even when one sticks a negative label on a machine product, say, by noting that a chatbot text is “clichéd,” merely affixing the label takes standards developed for human language and applies them to machines. But do they apply?
Hot-air balloon on East Campus
Every Wednesday night this semester, I have dinner with students in the Duke Focus cluster on the theme “Science and the Public.” It’s a good gathering of faculty and students. Joyful, if a little loud, in a room with acoustics unfriendly to hearing-aided ears. The food, too, ain’t bad.
A couple weeks ago, this sight greeted us as we exited the dining hall.
Tags: language, poetry, uncertainty, sensing, experience, balloon
Links, cited and not, some just interesting
Warner, John. “GPT and the Writing Uncanny Valley.” The Biblioracle Recommends, November 12, 2023.
A good overview, with lots of examples (some hilarious), of Google’s “Audio Overview” feature that is woven into NotebookLM: Heikkilä, Melissa. “People Are Using Google Study Software to Make AI Podcasts—and They’re Weird and Amazing.” MIT Technology Review, October 3, 2024. https://www.technologyreview.com/2024/10/03/1104978/people-are-using-google-study-software-to-make-ai-podcasts-and-theyre-weird-and-amazing/.
Gewirtz, David. “What Gartner’s 2024 Hype Cycle Forecast Tells Us about the Future of AI (and Other Tech).” ZDNET, August 21, 2024. https://www.zdnet.com/article/what-gartners-2024-hype-cycle-forecast-tells-us-about-the-future-of-ai-and-other-tech/.
“I found myself spending many hours grading writing that I knew was generated by AI. I noted where arguments were unsound. I pointed to weaknesses such as stylistic quirks that I knew to be common to ChatGPT (I noticed a sudden surge of phrases such as ‘delves into’). That is, I found myself spending more time giving feedback to AI than to my students.” Livingstone, Victoria. “I Quit Teaching Because of ChatGPT.” TIME, September 30, 2024. https://time.com/7026050/chatgpt-quit-teaching-ai-essay/.
1. In late September I talked with students in a Masters seminar on “[Bio]ethics, [Tech]ethics & Science Policy” about Ratliff’s podcast. It was fun and enlightening for me … and, I hope, for the students in the room!
2. I’ve assigned students in my seminar this fall to use NotebookLM for an assignment. The purpose was not simply to “help them learn”—which might not actually be the result of use—but also (rather?) to ask them to focus on the effects that use had on their study and their learning. I’m going through their thoughts today, as I read their essays about the experience.
3. In my seminar this fall, I told my students that I trusted they would not turn in chatbot prose, for the simple reason that I want to read the products of their thinking, not the products of machines jiving on prompts. “I think it would be rather tragic,” I said, “if I were to spend my Saturday treating some chatbot’s words as if they were yours and responding to the machine with my thoughts.” Likewise, they wanted to hear from me rather than from an algorithmic voice. Writing—and I’d say all of education—is built on that trust. (See Victoria Livingstone’s TIME essay in the links above.)
Love the questions you’re asking here, Mark, especially: “could AI prod humans to reckon with inner lives of other creatures?” I’ve added Shell Game to my listening queue.
As usual, Mark, you manage to put these complex and multifaceted concepts into articles that read like poetry, where each word dances to the next, not based on a probability distribution, but based on deep thought and contemplation over much time.
Commenting from both a scientific perspective and a human perspective:
Scientific -
Use caution when distributing AI-generated content without attribution/citation/watermarking as AI-generated. When this content is used and attributed to humans, it will be used in training future models and can lead to model collapse (see the recent article explaining the concept by new Duke prof Emily Wenger: nature.com/articles/d41586-024-02355-z).
Human -
If you think of art and writing as merely the destination (the final product, and nothing more), why not use AI to generate content? I, however, think of art and writing as the journey: the journey of processing information over time, discovering and exploring, and, importantly, connecting with others through it. Take this piece you have written, Mark. Had it been generated through ChatGPT, it would not allow us to connect on a deeper level about these topics; the connection would be one-sided. The true beauty in writing and art is connection.