The policy battle heats up
But the battle of the metaphors may be headed for a convergence... maybe?
Greetings from Boston. After a quick visit to MIT Press I popped into the MIT Museum and encountered this superb, disturbing work by Arthur Ganson. With some relevance to our encounter with Large Language Models, I think!
Creators are getting more organized about standing up for their rights vis-à-vis the purveyors of generative AI. The Association of American Publishers is just one of the many signatories to the Human Artistry Campaign, launched ten days ago at the South-by-Southwest conference in Austin, Texas.
The principles put forward by the campaign are:
1. Technology has long empowered human expression, and AI will be no different.
2. Human-created works will continue to play an essential role in our lives.
3. Use of copyrighted works, and the use of voices and likenesses of professional performers, requires authorization, licensing, and compliance with all relevant state and federal laws.
4. Governments should not create new copyright or other IP exemptions that allow AI developers to exploit creators without permission or compensation.
5. Copyright should only protect the unique value of human intellectual creativity.
6. Trustworthiness and transparency are essential to the success of AI and the protection of creators.
7. Creators’ interests must be represented in policymaking.
You can find out more, and sign the petition, here. I encourage you to do so.
Very much in keeping with this spirit, here’s a must-read from two young researchers, something I missed when it first came out: a fierce and cogently written editorial in Wired from Nick Vincent and Hanlin Li, titled “ChatGPT Stole Your Work. So What Are You Going to Do?”.
This para is particularly relevant to the main audience of this newsletter:
Media companies, whose work is quite important to large language models (LLMs), may also want to consider some of these ideas to restrict generative AI systems from accessing their own content, as these systems are currently getting their crown jewels for free (including, likely, this very op-ed). For instance, Ezra Klein mentioned in a recent podcast that ChatGPT is great at imitating him, probably because it downloaded a whole lot of his articles without asking him or his employer.
And interestingly, we have a new player loudly objecting to their content being used to train AIs: “Microsoft reportedly orders AI chatbot rivals to stop using Bing’s search data”. It turns out Microsoft’s Bing shares search data with a variety of partners, like DuckDuckGo, Yahoo, and You.com, and these companies have been feeding that data in to fine-tune their LLM-powered search chat applications. Double standards much?
The dismay around the lack of transparency in OpenAI’s GPT-4 release (the subject of the last newsletter) continues to grow. Gary Marcus’ thinking has reached its logical conclusion: “We must demand transparency, and if we don’t get it, we must contemplate shutting these projects down.”
Regulators, including the US Copyright Office, are indeed mobilizing further. Mark April 19th, 1pm ET, in your calendar for a USCO-led public “listening session” on AI and copyright in literary works. And thankfully the Office is now taking greater cognizance of the input question.
Even as the political and legal battle lines are being more clearly drawn, and the stakes continue to ratchet upwards with the further commercialization of the models, the battle of the metaphors is moving in interesting directions. Is there a possible convergence ahead?
Even amid decreased transparency, increased hype, and the fast pace at which new models are being introduced, we are also seeing a growing appreciation that Large Language Models bullshit (in the technical, philosophical sense of the word). They manipulate the form of language, without any reference to meaning. The corollary, then, is that LLMs are merely “remixing” their training data, albeit in a remarkably sophisticated manner. They are, indeed, “21st-century collage machines”: a kind of elaborate language game (one anyone can now play), based on the works copied into the model. This view has implications for how courts might view their unauthorized copying of training data.
At the launch of GPT-4, OpenAI president Greg Brockman acknowledged that the model still hallucinates a lot and makes frequent reasoning errors. He didn’t try to hide this, but stressed other aspects of the model: its even greater fluency, greater sensitivity to instruction, stronger guardrails, and its ability to manage much larger context in its operation (as much as 50 pages of content).
The problems can’t be hidden, and even the OpenAI team are starting to admit that simply enlarging the models won’t solve them (scaling having previously been seen as the answer to all their limitations).
Ilya Sutskever, OpenAI cofounder and chief scientist, made some interesting comments to Craig Smith, former NYT journalist and host of the Eye on AI podcast, when Smith put to him the emerging consensus on LLMs: that they have no understanding of the world. Sutskever responded with two points:
1. The familiar argument that “you ain’t seen nothing yet”, that progress is just so fast. This is getting a bit tired by now and feels weaker after the launch of GPT-4.
2. And, quite provocatively, that the models do have an understanding of the world, an understanding that derives from language. In this view, language encodes, or incorporates, or reflects, models of how the world works.1
Let me quote at length:
“I think that learning the statistical regularities is a far bigger deal than meets the eye. Prediction is also a statistical phenomenon. Yet to predict you eventually need to understand the true underlying process that produced the data … you need to understand more and more about the world that produced the data. As our generative models become extraordinarily good, they will have, I claim, a shocking degree of understanding. A shocking degree of understanding of the world. And many of its subtleties. It's not just the world. It is the world as seen through the lens of text. It [the model] tries to learn more and more about the world through a projection of the world on the space of text as expressed by human beings on the internet.”
To which I would say: it’s not language or text in the abstract that encodes knowledge of the world, but works by authors who use language to say meaningful things about the world. It’s hard work to align language with reality. It’s one of the things writers are for!
Now here we come to the potential convergence. Gary Marcus, for the first time in a long time, admitted to a bit of optimism about the direction of LLM development, in an interview with the Economist’s Babbage podcast. He mused that the size of the potential economic prize of reinventing internet search will force the AI companies to bring together new kinds of models with their LLMs in order to deliver reliability. This will bring about the AI architecture Marcus has always argued for, one that combines LLMs with models with different functions, and maybe even different architectures, including those that use symbolic processing.
Sutskever also sees that LLMs will have to become more complex, combining different sorts of functional models, in particular growing the power of the Reinforcement Learning from Human Feedback models which (in his view) discipline the LLMs to restrict themselves to reasoning and facts. (For Sutskever, pretrained LLMs understand the world very well, but need to be taught to express that understanding more consistently…) I think that’s probably a little nuts…?
The fact is that the action for some time has been in combining different sorts of deep learning models in different ways, to make the outputs of the raw pretrained LLMs more useful. Marcus and Sutskever have very different goals in mind, and favor different architectures, but they both seek more powerful combinations. Marcus may favor symbolic models while Sutskever may still prefer to stick to deep learning models. But I sense that these differences in their starting points will matter less as we test the models against the need for real-world results.
Well, so much for me trying to characterize a trend in AI development! I couldn’t help myself.
From a copyright perspective it now seems to me much harder to sustain the naive analogy that “computers learn by copying” and that training a model is the same as teaching a roomful of smart humans, just faster. I’ve actually heard that analogy from IP bureaucrats. Such naive analogies also sustain the pro-fair-use argument. They are history now, condemned to the dustbin as human discourse creates a greater shared understanding.
This point, that text corpora encode some model of the world, and that LLMs may have some kind of access to this, is made very elegantly by MIT researcher Anna Ivanova in a recent paper, covered in her interview with Sam Charrington on the excellent TWIML podcast, or, more briefly, in this Atlantic article.