Is The Downfall of Cloudflare Nigh?

By Bob Leggitt  |  19 May 2022

"The nemesis for Cloudflare could prove to be the very thing it claims it's there to protect against. What would happen if bots reached critical mass?..."

Cloudflare. It presents itself as a protection mechanism, defending websites against attackers and other unwanted visitors. But a growing group of digital freedom-fighters consider Cloudflare a dire threat to both privacy and liberty.

You see, Cloudflare doesn't just block bots from accessing websites. It also blocks humans who won't submit to its conditions of entry - which, for those with high privacy standards, necessarily involve disabling personal protections. So Cloudflare is already a demonstrable privacy threat. But if it infiltrates enough of the Web, says a hum of speculation, it could also come to serve as a dystopian system of authoritarian, digital passport control, deciding who is, and who is not, allowed to use the Internet. And the conditions we're required to satisfy for entry could inflate beyond all previous frames of reference.

At present, the balance is still in our favour. Cloudflare doesn't yet control enough of the Web to assert an endgame. And there are other power-crazed, corporate monoliths who would not willingly let Cloudflare have sole custody of the online gatekeeping system. For example, Cloudflare would not, in the current scheme of things, get away with anything that negatively affected Google's bottom line or hindered Google's ability to farm content and suction-pump data.

The solution, for Cloudflare, would not be as simple as just handing Google a set of keys. Google's business depends not only on Google accessing third-party properties, but also on the public accessing those third-party properties. If 10% of people can't access the third-party Web properties from which Google makes money, that's still punching a huge hole in Google's pocket. Google stands at the helm of browser technology, with the power to shut down any existential threat. Cloudflare could definitely not afford to bash that kind of dent into Google's revenue stream.

So Cloudflare may not be able to assume gatekeepership of the Internet quite as soon as it would like. But that doesn't mean it's not a problem. It's already discriminating against vulnerable minorities. Its "pRoVe YoU'rE nOt A bOt" access tests have been shown to be impossible for people with some disabilities to solve. Cloudflare is a company that can find a way to let Google's bots through its imposing doors, yet not extend the same courtesy to people with disabilities. You don't need to be psychic to understand that affording unchecked power to a company like that would not be particularly good for humankind.

We may, however, be seeing an impending exit for Cloudflare, and for "pRoVe YoU'rE nOt A bOt" gatekeeping in general, as convenience culture spills over into the world of computer programming...

RISE OF THE PERSONAL WEB BOT

If you've been into programming computers for a long time, you may have noticed that over the years it's been getting easier. In the early to mid 1990s, the wonderful QBASIC stood as the ultimate in amateur dev environments. Then visual WYSIWYG programming came along and opened the door to a new breadth of programmers. Since then, Python has normalised the use of purpose-dedicated modules, which, when imported into a program, allow a highly complex procedure to be executed in just two or three lines of simple code.

I speak from experience - as someone who programmed in the workplace using dBASE III Plus in the late '80s, then, on my own PC, progressed through QBASIC and Visual Basic to the front-end Web languages. I've more recently turned attention to the back-end, server-side languages PHP and SQL, but it was the ease with which I was able to build a Web content-retrieval bot using Python that most took me by surprise.

The bot visits web page addresses from a list, and extracts various relatables like the page title, a description or snippet, a site logo/icon, etc. It then drops the harvested content into a database which forms the back end of a small, personalised, offline search engine. The project was really born out of frustration. I had masses and masses of browser bookmarks and could no longer organise them. So I exported them all to text, and set about putting them into a searchable database. Once outside the confines of a browser, the database has been able to grow more quickly, and with a custom front end it delivers the links and snippets on demand.
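For readers who want a feel for the shape of such a bot, here is a minimal sketch. It is not the script described above, just one way the same steps could be laid out using only Python's standard library. The class and function names, the `pages` table layout and the `personal-bookmark-bot/0.1` user-agent string are all assumptions made for illustration, and the sketch only records the icon's address rather than saving the icon file to a folder.

```python
import sqlite3
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageMetaParser(HTMLParser):
    """Collects the <title>, the meta description and the icon link of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.icon = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "link" and "icon" in (attrs.get("rel") or "").lower().split():
            self.icon = attrs.get("href") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html, base_url):
    """Return title, description and absolute icon URL found in the HTML text."""
    parser = PageMetaParser()
    parser.feed(html)
    icon = urljoin(base_url, parser.icon) if parser.icon else ""
    return {
        "title": parser.title.strip(),
        "description": parser.description.strip(),
        "icon": icon,
    }


def fetch(url, timeout=10):
    """Download one page as text. The user-agent string is a made-up example."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "personal-bookmark-bot/0.1"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def open_db(path=":memory:"):
    """Open (or create) the database; the table layout is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, title TEXT, description TEXT, icon TEXT)"
    )
    return conn


def store(conn, url, meta):
    """Insert or refresh the row for one page."""
    conn.execute(
        "INSERT OR REPLACE INTO pages VALUES (?, ?, ?, ?)",
        (url, meta["title"], meta["description"], meta["icon"]),
    )
    conn.commit()


def crawl(urls, conn, fetcher=fetch):
    """Visit each address; pages that refuse access are skipped, not retried."""
    skipped = []
    for url in urls:
        try:
            html = fetcher(url)
        except (OSError, ValueError):  # HTTP errors, timeouts, malformed URLs
            skipped.append(url)
            continue
        store(conn, url, extract_metadata(html, url))
    return skipped
```

Calling `crawl(list_of_addresses, open_db("bookmarks.db"))` would fill the database and hand back the addresses that could not be read. The file name `bookmarks.db` is, again, only an example.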

No more ferreting through seven different browsers crammed solid with bookmarks. One search, one click. That's it. And now the index can cover a lot more ground. In itself this is a partial liberation from Big Tech. A heavy proportion of our Web searches are only attempts to re-locate content we accessed previously. If everyone had a system like this, Google's throughput would inevitably drop by a hair-raising percentage.
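The "one search" part need not be complicated either. As a rough sketch, assuming the harvested pages sit in an SQLite table called `pages` with `url`, `title`, `description` and `icon` columns (an assumed layout, as are the function name and the sample rows), a lookup could be as small as this:

```python
import sqlite3


def search(conn, term, limit=20):
    """Return (url, title, description) rows whose title or snippet contains term."""
    pattern = f"%{term}%"  # note: % and _ inside term act as SQL wildcards
    cursor = conn.execute(
        "SELECT url, title, description FROM pages "
        "WHERE title LIKE ? OR description LIKE ? "
        "ORDER BY title LIMIT ?",
        (pattern, pattern, limit),
    )
    return cursor.fetchall()


# Tiny in-memory demonstration with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT, description TEXT, icon TEXT)"
)
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?, ?)",
    [
        ("https://example.org/qbasic", "QBASIC memories",
         "Notes on amateur dev environments", ""),
        ("https://example.org/python", "Python modules",
         "Doing a lot in two or three lines", ""),
    ],
)

for url, title, description in search(conn, "qbasic"):
    print(title, "-", url)
    print("   ", description)
```

SQLite's `LIKE` ignores case for plain ASCII letters by default, which is why a lower-case search term still finds a capitalised title. A real front end would presumably add ranking and a clickable results page, but the core lookup is a single query.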

So why doesn't everyone have a system like this? At present, the majority are deterred by the complexity of setting it up. But the thing is, that scraper bot has just 39 lines of code, including the population of all database fields and saving the icons to a folder. True, that's still 38 lines of code too many for someone who's used to Big Tech's one-step convenience culture. But we could, in the relatively near future, see programming become so trivial an operation that the inconvenience barrier evaporates.

If programming continues to become more and more simple, and human web-surfing continues to become more and more difficult and/or annoying, programming will eventually cross the convenience threshold. It will be easier to make a custom content-retrieval bot than to play Cloudflare's access games. From that point, the majority of the public could be making their own apps and tools, and content-retrieval bots would inevitably see a meteoric rise in volume. That would necessarily end generalised gatekeeping systems like the one Cloudflare currently has in place.

With some site visits, my bot is unable to gain access. Does that mean I go running to the inaccessible site to add it to my database manually? No. The rejection serves as a nice little Cloudflare filter. I don't want to be visiting sites that are crawling with Cloudflare's spyware. They're welcome to stay out of my database. There's more than enough alternative reading matter coming in from open-access sites.

True, if I were trying to run a commercial search engine, I'd be mad that the bot was constantly facing these roadblocks, and I'd probably be looking for solutions. But the dynamics change when the system is only for personal use. It just becomes a case of "out of sight, out of mind". You don't need to care about pages that present a "go screw yourself" message. So if a system like this became the norm, the result would be a catastrophic drop in reach for Cloudflare-infiltrated sites, which would prompt their admins to ditch Cloudflare.

So the nemesis for Cloudflare could prove to be the very thing it claims it's there to protect against. What would happen if bots reached critical mass? What would happen if most bots were essentially service systems for real human consumption? One suspects that's a question Cloudflare hopes it will never have to answer.


Bob Leggitt.