+++
title = "Can antibots be both efficient and anonymous?"
date = 2025-09-20
description = "todo"
insert_anchor_links = "left"
draft = true

[taxonomies]
tags = ["cryptography"]

[extra]
katex = true
+++

Some people with a lot of money (or, at least, who control a lot of machines) decided to flood the Internet with useless requests, crawling every website without respecting robots.txt or anything else. They even forced some services to shut down, effectively performing a DDoS. All that to train AIs on garbage data, so you can ask a chatbot to quote Wikipedia or StackOverflow instead of using a proper search engine.

The Internet relies on a heap of protocols that only work well when everyone behaves correctly: it stops being efficient when someone gains too much power (bandwidth and IP addresses). Cloud providers do provide the bad guys with enough clouds to make a storm, not to mention the "Internet of Things", which lets botnets run on security cameras, baby monitors and sextoys. One of the most common practices on the Internet is fundamentally altruistic: giving a copy of a file to whoever asks for it, for free (what is commonly called the "Web"). The problem is that answering such a request consumes a machine's resources (energy, computing time, IO time, memory, etc.), resources that can be exhausted if people ask for too much.

## A few solutions

### Proof of intelligence

Captchas ask the user to solve a problem that is (supposedly) difficult for a computer but (supposedly) easy for a human. However, they take time to solve even for humans, they are not accessible to people who cannot see or hear or who have cognitive disabilities, and modern AIs can already solve them.

### Proof of browser

Systems that do not require user input can check whether they are running inside a proper web browser by testing various features. However, they can be fooled by giving the bot's engine more capabilities, at which point it becomes indistinguishable from a browser.

### Proof of work

Proof of work requires solving problems that are slow to solve but fast to check: a few seconds of computing time are needed to answer the challenge, while verifying the answer is almost instantaneous. The difficulty must be well balanced, so that it remains fast enough for a legitimate user but becomes too expensive for a spammer sending thousands of requests per second. However, this no longer seems to frighten spammers: Anubis (an antispam system based on proof of work) has failed to stop some attacks.

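To make the mechanism concrete, here is a minimal sketch of hash-based proof of work in Python. The choices are illustrative assumptions, not the exact scheme used by Anubis: SHA-256, an 8-byte nonce, a difficulty of 20 leading zero bits, and the `solve`/`verify` names are all made up for the example. The client burns CPU searching for a nonce, while the server checks the answer with a single hash.

```python
import hashlib
import os

def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            bits += 8 - byte.bit_length()
            break
    return bits

def solve(challenge: bytes, difficulty: int) -> int:
    """Client side (slow): try nonces until the hash has enough leading zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1

def verify(challenge: bytes, difficulty: int, nonce: int) -> bool:
    """Server side (fast): a single hash checks the submitted answer."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty

challenge = os.urandom(16)           # random challenge issued by the server
nonce = solve(challenge, 20)         # ~2^20 hashes on average: seconds of CPU
assert verify(challenge, 20, nonce)  # checked with a single SHA-256 call
```

Each extra bit of difficulty doubles the expected solving time while verification stays a single hash, which is what makes the balancing act described above possible.
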
### Global monitoring

If you are big enough to have a global, real-time database of traffic per IP address (e.g. CloudFlare, Amazon, Google, etc.), you can detect spammy addresses and stop them immediately. However, such a centralized solution is not acceptable, as it gives too much power to gigantic corporations. Decentralized and anonymous spam databases could be an interesting research subject, but they seem quite complicated.

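As a rough illustration of what that detection looks like on a single machine, here is a naive per-IP sliding-window counter in Python (the `WINDOW` and `MAX_REQUESTS` values and the `is_spammy` helper are made up for the example). What the sketch cannot show is precisely why only a giant with a global view can make it work: a crawler spreading its requests over thousands of addresses stays far below any per-IP threshold that a single server can observe on its own.

```python
import time
from collections import defaultdict, deque

# Hypothetical thresholds: flag an address sending more than
# MAX_REQUESTS requests within any WINDOW-second interval.
WINDOW = 10.0
MAX_REQUESTS = 100

recent: dict[str, deque] = defaultdict(deque)

def is_spammy(ip: str, now: float | None = None) -> bool:
    """Record one request from `ip` and report whether it exceeds the rate limit."""
    now = time.monotonic() if now is None else now
    timestamps = recent[ip]
    timestamps.append(now)
    # Forget requests that have fallen out of the sliding window.
    while timestamps and now - timestamps[0] > WINDOW:
        timestamps.popleft()
    return len(timestamps) > MAX_REQUESTS
```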