Love Street Counselling's blog

Ramblings and side notes with the occasional foray into a mental health topic

We’re at Peak AI

Apparently we’re reaching peak AI soon! The large language models have exhausted all the available written material on the internet, and AI companies are struggling to find alternative data sources with genuine human writing.

Now anyone who’s spent any time at all on the internet in the last year will be aware of how prevalent AI-generated content has become. A lot of the time it’s not easy to distinguish from human-written material, so it’s going to get swallowed up by the datasets that AI companies use, resulting in a bit of an Ouroboros situation. No, not the virus from Resident Hill or whatever, I’m talking about the worm that eats its own tail. Large Language Models (LLMs) are typically trained on the “Delta” of internet-scraped data, meaning the difference between the last training dataset and the new one: whatever material has newly appeared on the internet. So quite a large chunk of the data that AI companies are using to “improve” their models will have been created by the selfsame models!
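For the curious, here’s a rough sketch of what that “Delta” idea might look like in Python. To be clear, this is only my illustration and not how any AI company actually builds its datasets: the function names and the little example “crawls” are made up, and the "human"/"ai" labels are an assumption purely for the demo. In real life nobody gets handed those labels, which is the whole problem.

```python
def dataset_delta(previous_snapshot, new_snapshot):
    """Return the documents in new_snapshot that were not in previous_snapshot."""
    seen = set(previous_snapshot)
    return [doc for doc in new_snapshot if doc not in seen]


def ai_share(documents):
    """Fraction of documents labelled "ai" (labels are hypothetical, see above)."""
    if not documents:
        return 0.0
    ai_count = sum(1 for _text, source in documents if source == "ai")
    return ai_count / len(documents)


# Made-up example data: (text, source) pairs standing in for scraped pages.
last_crawl = [
    ("old blog post", "human"),
    ("old forum thread", "human"),
]
new_crawl = last_crawl + [
    ("new essay", "human"),
    ("chatbot listicle", "ai"),
    ("chatbot product review", "ai"),
]

delta = dataset_delta(last_crawl, new_crawl)
share = ai_share(delta)
print(f"{len(delta)} new documents, {share:.0%} of them machine-written")
```

In this toy example the whole crawl is still mostly human-written, but the delta, the bit the next model actually learns from, is mostly the model talking to itself.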

Large Language Models are exceedingly good at adapting to whatever new data you feed them. You can take a huge model like the ones behind ChatGPT, with billions upon billions of parameters, and fine-tune it with a dataset of a few dozen questions and answers to turn it into, for example, a customer service bot that will faithfully follow your company policy. It’ll be interesting to see what comes of feeding a model millions of examples of its own output. Digital BSE? Or just Digital BS?

The optimist in me would like to think this will result in some sort of renaissance in demand for writing and art. Maybe the AI companies will pay good money to have some writers produce an outstanding fine-tuning dataset? Get some modern Socrates to crank out some JSON files with deep, insightful dialogue. On the flip side, maybe there’s gonna be a chilling effect, and writers and artists will need to find some less easily scraped medium to post their art on instead of our lovely World Wide Web. Who’s to say?