You get the internet you deserve • The Register

comment To see where this dump truck is going, let’s first follow the debris trail.

It’s hard to track, but the impacts of the Internet’s content factories, which thrived until about 2010, are still readily visible.

The net effect of content generated at a rate of ten to thirty pieces a day on specialized topics – all written by non-experts under the guiding hand of Google Trends – was an internet that, by 2010, was saturated with cute (if nonsensical), keyword-stuffed articles that offered little usable information and, in many cases, a lot of advice and information that was simply incorrect.

And as content factories naturally spawned more content factories (and why not, when Associated Content, the worst offender, was sold to Yahoo! for $100 million), what happened next was inevitable. These new content companies simply mimicked what they found in larger content factories, using the internet at the time as a training set, so to speak. The cycle of bad articles with little detail or, worse, inaccurate details was repeated over and over again until it became difficult to distinguish one article from another unless it was found on one of the few reputable, edited sites.

The name of the game for these early content companies was simply volume. Ad network revenues (Google Adsense, etc.) were already falling in 2005, but with thousands if not millions of articles each generating perhaps three cents a day, the money wasn’t bad. For a content factory with 200,000 articles, that was a good $2M-a-year business with ultra-low overhead. Hosting wasn’t too expensive, web design was easy with open source CMS tools like WordPress, Drupal and others, and most importantly (and ultimately most disastrously) bulk content could be purchased for mere cents per article from offshore shops.

This model meant that the Internet was quickly flooded with poorly written nonsense, much of which can still be found via search in its original form or in even more poorly rephrased versions. Google had to step up its game to filter this out and learn how to deliver quality content rather than the magic keyword mix that content mills could exploit.

The problems with content factories are clear, especially all these years later, but they all played out on a human scale, limited by “slow” writers and keywords. The future presents us with a new problem – one that could destroy the way we use the internet forever.

Let’s do some math

Pretend it’s 2006 and you’re in the content production business. You are at the top of your game. You have a team of 100 writers in India, each earning the equivalent of $10 a day to write and post twenty 400-word articles (topics dictated by Google keyword trending data rather than expertise, etc.).

Your daily salary costs are around $1,000. Every day, your content factory publishes 2,000 pieces of “unique” content, 365 days a year, and each of those articles, assuming good search engine ranking (which could easily be manipulated with keyword tricks at the time), will earn you three cents a day.

And although we’re using rounded numbers for ease, consider these annual numbers (annual because you only have to run this business for a year, the Adsense money comes anyway, at least for a while):

Salary for writers who create, publish and tag 20 articles a day is $365,000 per year. They generate 730,000 pieces of content valued at $10.95 per piece over the course of the year (assuming three cents a day for 365 days). And all of that, which is pretty handy for you, Western Content Lord, means you have a business that generates about $8 million a year.
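For readers who want to check the arithmetic, here is a minimal Python sketch that reproduces the figures above. All inputs come from the thought experiment itself; the function and constant names are purely illustrative, and amounts are kept in whole cents to avoid floating-point rounding noise.

```python
# Inputs taken from the 2006 thought experiment in the text.
WRITERS = 100
DAILY_WAGE_CENTS = 1000          # $10 per writer per day
ARTICLES_PER_WRITER_PER_DAY = 20
CENTS_PER_ARTICLE_PER_DAY = 3    # three cents of ad revenue per article per day
DAYS = 365


def content_mill_year():
    """Return the annual totals (in dollars) for the content mill described above."""
    salary_cents = WRITERS * DAILY_WAGE_CENTS * DAYS
    articles = WRITERS * ARTICLES_PER_WRITER_PER_DAY * DAYS
    # Like the text, this values every article at a full 365 days of earnings.
    value_per_article_cents = CENTS_PER_ARTICLE_PER_DAY * DAYS
    revenue_cents = articles * value_per_article_cents
    return {
        "salary_usd": salary_cents / 100,
        "articles": articles,
        "value_per_article_usd": value_per_article_cents / 100,
        "revenue_usd": revenue_cents / 100,
    }


totals = content_mill_year()
```

Under these assumptions the totals come to $365,000 in salaries, 730,000 articles, $10.95 per article, and $7,993,500 in revenue, which is the "about $8 million" quoted above.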

Oh. But you have to subtract hosting and such. Let’s call that five thousand. The big ugly cost? All those “expensive” writers. And you think to yourself, who needs them?

Well, you don’t.

Because, man, there is a new business model for content factories. And while its early 2000s predecessors made the Internet annoying and full of pointless articles that hit keyword and word count targets without saying anything, this one is disturbing enough to turn the Internet into complete garbage. And not just garbage from a content perspective, but also from how the internet business works.

Putting the S in IoS

This new business model is already unfolding. You’ve probably read many articles generated by GPT or similar AI models. The reason you probably didn’t notice is that they aren’t bad. It’s entirely possible you think they’re not bad, but that’s because you’ve been weaned on the Internet of Shit (IoS) that the content mills created, which trained us to lower our expectations when it came to information consumption.

The problem is that these AI-generated articles need to get their information from somewhere in sufficient volume to properly produce new clones of information cloaked in slightly more eloquent language. And where do AI training algorithms get all this from? From the IoS, of course.

If we do more math, let’s assume that 10% of the IoS-derived training data contains factual errors. As the AI trains, retrains, and retrains again, these errors compound. And compound. And multiply, and within a decade of retraining on bad, weird, oddly worded, increasingly incomprehensible data, we’re left with a true IoS.
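To make the compounding concrete, here is a toy Python sketch. Only the 10% starting figure comes from the text. The per-cycle rate at which correct material gets corrupted (`new_error_rate`) is a made-up illustrative parameter, and the model itself is a deliberate oversimplification rather than a claim about how any real training pipeline behaves.

```python
def error_share_after(generations, initial_error=0.10, new_error_rate=0.05):
    """Toy model of the share of erroneous content after repeated retraining.

    initial_error  -- the 10% starting assumption from the text
    new_error_rate -- HYPOTHETICAL: fraction of still-correct content that is
                      corrupted in each retraining cycle (not from the article)
    """
    share = initial_error
    for _ in range(generations):
        # Existing errors persist; some correct content goes bad each cycle.
        share = share + new_error_rate * (1.0 - share)
    return share


# Error share for each of the first ten retraining cycles.
trajectory = [error_share_after(n) for n in range(11)]
```

In this sketch the share of correct content shrinks geometrically with every cycle, so the error share only ever moves in one direction. How fast it moves depends entirely on the assumed rate.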

And math is super important again – as is volume.

A single content factory operator on the scale of Western Content Lord, for example, can use free tools to generate content as fast as human operators can coax it out with a simple prompt. That same team of 100 workers can each bring in 300 pieces a day.

They don’t write, they just ask ChatGPT. They may feed in keywords as a template, and have it generate the keywords as well, for that matter. Eventually this ChatGPT process (as one of many examples) will have API hooks to publish the output directly to WordPress or whatever other CMS Content Lord chooses.

When the integration of AI platform and CMS is complete, so is the circle: the Internet is just talking to itself.

The race to the bottom

What Western Content Lord and competitors don’t realize is how quickly this race to the bottom will begin – and soon.

Google Adsense and every other ad network on the planet will recognize the flood and reduce what they pay for a click or a view to next to nothing. And then it will be nothing, but not before Google and others scramble to blacklist well-known AI content mills. But there will be a lot of them popping up very quickly. It will be easier for Google, for example, to create a safe list of known publishers backed by hardworking humans.

Great, you think, balance has been restored. Not quite.

Keeping up with all the search innovations that push these IoS results to the bottom will cost Google money: AI training on billion-dollar scales, and considerable and frequent retraining on the Internet corpus. That corpus will get infected fast and furiously, and how do the search giants pay for all this search innovation? Through advertising revenue.

Search advertising giants like Google may well hold their noses and accept content mill results because it is in their economic interest to do so. But what if the “acceptable” content set decreases by 95%?

The exponential rate of shit on the internet

We return again to the subject of mathematics and volume and the like to address the most important point: the danger of bad information is an exponential problem. A series of errors generated and repeated by content factories for a decade means that these errors are trained into the core AI language models via the Internet corpus and reinforced.

It’s one thing to live in an age of fake news, in part because, to most thinking people, it’s obviously fake. When the internet repeats an error often enough, it becomes the truth, and that is the most insidious accidental result of it all.

It would make me, personally, feel better to end this piece with some sort of “fight the power” message, but honestly, at this point, the cat is out of the bag. Content factories can be satisfied with per-article revenues measured over a five-year horizon, which can amount to as little as 0.05 cents over the term. But who cares, right? It’s free money. Hosting is cheap, a CMS is free, and as long as there’s ad money, it’s worth the passive income effort.

This is the internet you deserve, apparently.