PoW anti-crawler middle-proxy https://meso.zoai.re/

Mesozoa

Mesozoa is a small animal living between a reverse-proxy and a server, protecting the server from crawlers by forcing the browser to run a proof of work.

It inspects each request's HTTP headers, then passes the socket directly to the server (zero-copy).

Try it online. (Remove the mesozoa-proof cookie or change your User-Agent to go through the challenge again.)

Why?

Why not Anubis? Because it provides no build instructions and only supports Docker.

Why not use Realm for everything? Because its hook system is useless here: it only allows filtering.

And because it looked like a fun little project.

Install

Build

Install rustup and a nightly Rust toolchain.

cargo build --release

Run

./target/release/mesozoa -c example-config.yaml

Apache config

Note that the reverse-proxy must provide the HTTP header X-Forwarded-For.

Add this to your virtual host:

ProxyPreserveHost On
ProxyRequests Off
ProxyTimeout 600

<Proxy "http://127.0.0.1:8504">
    ProxySet keepalive=Off
</Proxy>

<Location />
    ProxyPass http://127.0.0.1:8504/
</Location>

Note on keepalive: When keepalive is On, connections between Apache and the server are reused, even for requests from different clients. This improves server performance by reducing connection overhead, but it prevents Mesozoa from intercepting HTTP headers. Keepalive must therefore be disabled between Apache and Mesozoa. This does not prevent using keepalive between Apache and the client.

Challenge protocol

Challenge generation

Sent by the server as a cookie.

  • secret: chosen randomly at startup
  • salt: chosen randomly each time
  • timestamp: UNIX time in seconds, 64 bits, big endian
  • ua: User-Agent from request header
  • ip: X-Forwarded-For from request header (client's IP)

set-cookie: mesozoa-challenge=BASE64(salt || timestamp || SHA3-256(secret || salt || timestamp || ip || "/" || ua))

Here || denotes byte concatenation and BASE64 is the URL-safe, unpadded variant.
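The layout above can be sketched in Rust as follows. This is a minimal illustration, not Mesozoa's actual code: the function names are made up for the example, the salt length is left to the caller because it is not specified here, and the hash is passed in as a closure (`sha3_256`) since the standard library has no SHA3. A real build would plug in a SHA3-256 implementation from a crate.

```rust
/// URL-safe Base64 alphabet (RFC 4648, section 5).
const B64_URL: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encodes bytes as URL-safe Base64 without `=` padding.
fn base64_url_unpadded(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() * 4 + 2) / 3);
    for chunk in data.chunks(3) {
        // Pack up to 3 bytes into the top 24 bits of a u32.
        let mut buf = 0u32;
        for (i, &b) in chunk.iter().enumerate() {
            buf |= (b as u32) << (16 - 8 * i);
        }
        // 1 byte -> 2 chars, 2 bytes -> 3 chars, 3 bytes -> 4 chars.
        for i in 0..chunk.len() + 1 {
            let idx = (buf >> (18 - 6 * i)) & 0x3f;
            out.push(B64_URL[idx as usize] as char);
        }
    }
    out
}

/// Bytes fed to the MAC: secret || salt || timestamp || ip || "/" || ua.
fn mac_input(secret: &[u8], salt: &[u8], timestamp: u64, ip: &str, ua: &str) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(secret);
    buf.extend_from_slice(salt);
    buf.extend_from_slice(&timestamp.to_be_bytes()); // 64 bits, big endian
    buf.extend_from_slice(ip.as_bytes());
    buf.push(b'/'); // delimiter between the two variable-length fields
    buf.extend_from_slice(ua.as_bytes());
    buf
}

/// Builds the cookie value BASE64(salt || timestamp || mac).
/// `sha3_256` is a stand-in for a real SHA3-256 implementation (assumption).
fn challenge_cookie(
    secret: &[u8],
    salt: &[u8],
    timestamp: u64,
    ip: &str,
    ua: &str,
    sha3_256: impl Fn(&[u8]) -> [u8; 32],
) -> String {
    let mac = sha3_256(&mac_input(secret, salt, timestamp, ip, ua));
    let mut raw = Vec::with_capacity(salt.len() + 8 + mac.len());
    raw.extend_from_slice(salt);
    raw.extend_from_slice(&timestamp.to_be_bytes());
    raw.extend_from_slice(&mac);
    base64_url_unpadded(&raw)
}
```

Because the salt and timestamp travel in clear inside the cookie, the server can recompute the MAC later from the request's own IP and User-Agent without storing any per-client state.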

Challenge verification

The request must contain both the mesozoa-challenge and mesozoa-proof cookies.

The server checks that the challenge is valid and that its timestamp is not too old.
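A rough sketch of that server-side check, operating on the already Base64-decoded cookie bytes. Everything named here is hypothetical: `SALT_LEN` (16 bytes), the `max_age` parameter and the function names are assumptions for the example, not values taken from Mesozoa. The comparison without early exit is a common precaution for MACs, not something this README states.

```rust
/// Assumed salt length in bytes; the real value is not stated in this README.
const SALT_LEN: usize = 16;

/// Decoded challenge: salt || timestamp (u64, big endian) || 32-byte MAC.
struct Challenge<'a> {
    salt: &'a [u8],
    timestamp: u64,
    mac: &'a [u8],
}

/// Splits the decoded cookie bytes into their three fields.
fn parse_challenge<'a>(raw: &'a [u8]) -> Option<Challenge<'a>> {
    if raw.len() != SALT_LEN + 8 + 32 {
        return None;
    }
    let (salt, rest) = raw.split_at(SALT_LEN);
    let (ts, mac) = rest.split_at(8);
    let mut ts_bytes = [0u8; 8];
    ts_bytes.copy_from_slice(ts);
    Some(Challenge {
        salt,
        timestamp: u64::from_be_bytes(ts_bytes),
        mac,
    })
}

/// True if the timestamp is not in the future and at most `max_age` seconds old.
fn is_fresh(timestamp: u64, now: u64, max_age: u64) -> bool {
    now.checked_sub(timestamp).map_or(false, |age| age <= max_age)
}

/// Compares two MACs without returning early on the first differing byte.
fn mac_matches(expected: &[u8], received: &[u8]) -> bool {
    expected.len() == received.len()
        && expected
            .iter()
            .zip(received)
            .fold(0u8, |acc, (a, b)| acc | (a ^ b))
            == 0
}
```

The server would recompute the MAC from its secret, the parsed salt and timestamp, and the IP and User-Agent of the current request, then pass both values to `mac_matches`.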

cookie: mesozoa-proof=nonce

hash = SHA2-256(nonce || challenge)

The client must find a nonce matching /[0-9a-zA-Z_-]{8}/ such that hash starts with at least the required number of zero bits (in binary representation, MSB-first).
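The proof check can be sketched like this. It is an illustration under assumptions, not the project's code: `difficulty` is a hypothetical name for the required number of zero bits, the nonce is assumed to be exactly 8 characters long, the challenge is assumed to be hashed as the cookie string, and `sha2_256` is a closure standing in for a real SHA2-256 implementation (the standard library has none).

```rust
/// Counts the leading zero bits of a digest, most significant bit first.
fn leading_zero_bits(hash: &[u8]) -> u32 {
    let mut bits = 0;
    for &byte in hash {
        if byte == 0 {
            bits += 8;
        } else {
            bits += byte.leading_zeros();
            break;
        }
    }
    bits
}

/// Checks that a nonce is exactly 8 characters from [0-9a-zA-Z_-].
fn is_valid_nonce(nonce: &str) -> bool {
    nonce.len() == 8
        && nonce
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-')
}

/// Verifies a proof: hash = SHA2-256(nonce || challenge).
/// `sha2_256` is a stand-in for a real SHA2-256 implementation (assumption).
fn verify_proof(
    nonce: &str,
    challenge: &str,
    difficulty: u32,
    sha2_256: impl Fn(&[u8]) -> [u8; 32],
) -> bool {
    if !is_valid_nonce(nonce) {
        return false;
    }
    // The nonce comes first, followed by the challenge.
    let mut input = Vec::with_capacity(nonce.len() + challenge.len());
    input.extend_from_slice(nonce.as_bytes());
    input.extend_from_slice(challenge.as_bytes());
    leading_zero_bits(&sha2_256(&input)) >= difficulty
}
```

If the hash behaves like a random function, a client needs about 2^d attempts on average for a difficulty of d bits, while the server verifies with a single hash. The nonce space holds 64^8 = 2^48 candidates, so a solution is only likely to exist for difficulties well below 48 bits.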

Security

Network handling and HTTP parsing

This implementation uses cheap tricks and regexes, and is probably not fully compliant with the HTTP specs. You should probably not expose it directly to an open network. Please use it behind a safer reverse proxy like Apache or Nginx.

Length-extension attack

SHA3 (used as a MAC in the challenge cookie) is not vulnerable. Values in the hash are either fixed-length, safe, or delimited.

SHA2 (used for PoW) is vulnerable, but the nonce is at the beginning of the hashed data, so this is not a problem.

PoW

I would like a better PoW: memory-bound and ideally non-parallelizable. Cuckoo seems to be a good candidate.

Contribution

Patches and forks are welcome. Send an e-mail to tuxmain ât zettascript ðøt org.

If people are interested, I may switch to a public forge like Codeberg.

The "A" in GNU AGPL means that if you host a publicly available instance of a modified version, then you should also make the modified source code available to users. For example, this can take the form of a link to a repository in the challenge page or in the protected website. As the challenge page's source code is directly distributed by the server, you can modify it independently (unless you add a compiled object, like WASM, in which case you have to publish its source). The configuration file can of course be modified and kept secret.

License

Support me via LiberaPay

No LLM was used to write this program.

GNU AGPL v3, CopyLeft 2025 Pascal Engélibert (why copyleft?)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, version 3 of the License.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.