feat(lib): use new challenge creation flow (#749)

* feat(decaymap): add Delete method

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(lib/challenge): refactor Validate to take ValidateInput

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib): implement store interface

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/store): add metapackage to import all store implementations

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(policy): import all store backends

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib): use new challenge creation flow

Previously Anubis constructed challenge strings from request metadata.
This was a good idea in spirit, but has turned out to be a very bad idea
in practice. This new flow reuses the Store facility to dynamically
create challenge values with completely random data.

This is a fairly big rewrite of how Anubis processes challenges. Right
now it defaults to using the in-memory storage backend, but on-disk
(boltdb) and valkey-based adaptors will come soon.
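
As a rough sketch of the idea (names here are illustrative, not the real lib/store API), a challenge is now minted from random bytes and remembered with a TTL:

```go
package challenge // illustrative package name

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"time"
)

// Store is a cut-down stand-in for the storage interface this PR adds.
type Store interface {
	Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
}

// New issues a challenge backed by completely random data instead of
// deriving it from request metadata.
func New(ctx context.Context, s Store, ttl time.Duration) (string, error) {
	buf := make([]byte, 32) // 256 bits of entropy
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	id := hex.EncodeToString(buf)
	// Remember the challenge so validation can look it up later.
	if err := s.Set(ctx, id, []byte("pending"), ttl); err != nil {
		return "", err
	}
	return id, nil
}
```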

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(decaymap): fix documentation typo

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(lib): fix SA4004

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib/store): make generic storage interface test adaptor

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(decaymap): invert locking process for Delete

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/store): add bbolt store implementation

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: go mod tidy

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(devcontainer): adapt to docker compose, add valkey service

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(lib): make challenges live for 30 minutes by default

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/store): implement valkey backend

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib/store/valkey): disable tests if not using docker

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib/policy/config): ensure valkey stores can be loaded

Signed-off-by: Xe Iaso <me@xeiaso.net>

* Update metadata

check-spelling run (pull_request) for Xe/store-interface

Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>

* chore(devcontainer): remove port forwards because VS Code handles that for you

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(default-config): add a nudge to the storage backends section of the docs

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(docs): listen on 0.0.0.0 for dev container support

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(policy): document storage backends

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update CHANGELOG and internal links

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(admin/policies): don't start a sentence with as

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: fixes found in review

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
Xe Iaso authored 2025-07-04 20:42:28 +00:00, committed by GitHub
parent 506d8817d5, commit dff2176beb
43 changed files with 1539 additions and 140 deletions


@@ -22,6 +22,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add option for custom cookie prefix ([#732](https://github.com/TecharoHQ/anubis/pull/732))
- Add translation for German language ([#741](https://github.com/TecharoHQ/anubis/pull/741))
- Remove the "Success" interstitial after a proof of work challenge is concluded.
- Anubis now has the concept of [storage backends](./admin/policies.mdx#storage-backends). These allow you to change how Anubis stores temporary data (in memory, on the disk, or in Valkey). If you run Anubis in an environment where you have a low amount of memory available for Anubis (eg: less than 64 megabytes), be sure to configure the [`bbolt`](./admin/policies.mdx#bbolt) storage backend.
- The challenge issuance and validation process has been rewritten from scratch. Instead of generating challenge strings from request metadata (under the assumption that the values being compared against are stable), Anubis now generates random data for each challenge. This data is stored in the active [storage backend](./admin/policies.mdx#storage-backends) for up to 30 minutes. Fixes [#564](https://github.com/TecharoHQ/anubis/issues/564), [#746](https://github.com/TecharoHQ/anubis/issues/746), and other similar instances of this issue.
- Add option for forcing a specific language ([#742](https://github.com/TecharoHQ/anubis/pull/742))
- Add translation for Turkish language ([#751](https://github.com/TecharoHQ/anubis/pull/751))
- Allow [Common Crawl](https://commoncrawl.org/) by default so scrapers have less incentive to scrape


@@ -237,6 +237,115 @@ remote_addresses:
Anubis has support for showing imprint / impressum information. This is defined in the `impressum` block of your configuration. See [Imprint / Impressum configuration](./configuration/impressum.mdx) for more information.
## Storage backends
Anubis needs to store temporary data in order to determine if a user is legitimate or not. Administrators should choose a storage backend based on their infrastructure needs. Each backend has its own advantages and disadvantages.
Anubis offers the following storage backends:
- [`memory`](#memory) -- A simple in-memory hashmap
- [`bbolt`](#bbolt) -- An on-disk key/value store backed by [bbolt](https://github.com/etcd-io/bbolt), an embedded key/value database for Go programs
- [`valkey`](#valkey) -- A remote in-memory key/value database backed by [Valkey](https://valkey.io/) (or another database compatible with the [RESP](https://redis.io/docs/latest/develop/reference/protocol-spec/) protocol)
If no storage backend is set in the policy file, Anubis will use the [`memory`](#memory) backend by default. This is equivalent to the following in the policy file:
```yaml
store:
  backend: memory
  parameters: {}
```
### `memory`
The memory backend is an in-memory cache. This backend works best if you run only a single instance of Anubis, or if the environment you run Anubis in has no mutable storage.
| Should I use this backend? | Yes/no |
| :------------------------------------------------------------ | :----- |
| Are you running only one instance of Anubis for this service? | ✅ Yes |
| Does your service get a lot of traffic? | 🚫 No |
| Do you want to store data persistently when Anubis restarts? | 🚫 No |
| Do you run Anubis without mutable filesystem storage? | ✅ Yes |
The biggest downside is that there is currently no limit on how much data can be stored in memory. This will be addressed at a later time.
#### Configuration
The memory backend does not require any configuration to use.
### `bbolt`
An on-disk storage layer powered by [bbolt](https://github.com/etcd-io/bbolt), a high-performance embedded key/value database used by containerd, etcd, Kubernetes, and NATS. This backend works best if you're running Anubis on a single host that serves a lot of traffic.
| Should I use this backend? | Yes/no |
| :------------------------------------------------------------ | :----- |
| Are you running only one instance of Anubis for this service? | ✅ Yes |
| Does your service get a lot of traffic? | ✅ Yes |
| Do you want to store data persistently when Anubis restarts? | ✅ Yes |
| Do you run Anubis without mutable filesystem storage? | 🚫 No |
When Anubis opens a bbolt database, it takes an exclusive lock on that database. While the lock is held, no other instance of Anubis (or any other tool) can read the database. If you run multiple instances of Anubis for different services, give each instance its own `bbolt` database file (see the sketch after the example below).
#### Configuration
The `bbolt` backend takes the following configuration options:
| Name | Type | Example | Description |
| :------- | :----- | :----------------- | :-------------------------------------------------------------------------------------------------------------------------------- |
| `bucket` | string | `anubis` | The bbolt bucket that Anubis should place all its data into. If this is not set, then Anubis will default to the bucket `anubis`. |
| `path` | path | `/data/anubis.bdb` | The filesystem path for the Anubis bbolt database. Anubis requires write access to the folder containing the bbolt database. |
Example:
If you have persistent storage mounted to `/data`, then your store configuration could look like this:
```yaml
store:
  backend: bbolt
  parameters:
    path: /data/anubis.bdb
```
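Because of the exclusive lock described above, each Anubis instance on a host needs its own database file. A sketch for a second service (the path is illustrative) could look like this:
```yaml
store:
  backend: bbolt
  parameters:
    path: /data/anubis-service-b.bdb
```
Pointing two instances at the same `path` will typically make the second instance block (or fail) when it tries to take the lock.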
### `valkey`
[Valkey](https://valkey.io/) is an in-memory key/value store that clients access over the network. This allows multiple instances of Anubis to share information and does not require each instance of Anubis to have persistent filesystem storage.
:::note
You can also use [Redis](http://redis.io/) with Anubis.
:::
This backend is ideal if you are running multiple instances of Anubis in a worker pool (eg: Kubernetes Deployments with a copy of Anubis in each Pod).
| Should I use this backend? | Yes/no |
| :------------------------------------------------------------ | :----- |
| Are you running only one instance of Anubis for this service? | 🚫 No |
| Does your service get a lot of traffic? | ✅ Yes |
| Do you want to store data persistently when Anubis restarts? | ✅ Yes |
| Do you run Anubis without mutable filesystem storage? | ✅ Yes |
| Do you have Redis or Valkey installed? | ✅ Yes |
#### Configuration
The `valkey` backend takes the following configuration options:
| Name | Type | Example | Description |
| :---- | :----- | :---------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------- |
| `url` | string | `redis://valkey:6379/0` | The URL for the instance of Redis or Valkey that Anubis should store data in. This is in the same format as `REDIS_URL` in many cloud providers. |
Example:
If you have an instance of Valkey running with the hostname `valkey.int.techaro.lol`, then your store configuration could look like this:
```yaml
store:
  backend: valkey
  parameters:
    url: "redis://valkey.int.techaro.lol:6379/0"
```
This would have the Valkey client connect to host `valkey.int.techaro.lol` on port `6379` with database `0` (the default database).
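If your Valkey or Redis instance requires authentication, the standard `REDIS_URL` format carries credentials inside the URL. Assuming Anubis hands the URL to its client library unchanged, a sketch with an illustrative password could look like this:
```yaml
store:
  backend: valkey
  parameters:
    url: "redis://:hunter2@valkey.int.techaro.lol:6379/0"
```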
## Risk calculation for downstream services
If your service needs it for risk calculation, Anubis exposes information about the rules that requests match via a few headers:


@@ -2,7 +2,7 @@
title: Why does Anubis use Proof-of-Work?
---
-Anubis uses a [proof of work](https://en.wikipedia.org/wiki/Proof_of_work) in order to validate that clients are genuine. The reason Anubis does this was inspired by [Hashcash](https://en.wikipedia.org/wiki/Hashcash), a suggestion from the early 2000's about extending the email protocol to avoid spam. The idea is that genuine people sending emails will have to do a small math problem that is expensive to compute, but easy to verify such as hashing a string with a given number of leading zeroes. This will have basically no impact on individuals sending a few emails a week, but the company churning out industrial quantities of advertising will be required to do prohibitively expensive computation. This is also how Bitcoin's consensus algorithm works.
+Anubis uses [proof of work](https://en.wikipedia.org/wiki/Proof_of_work) in order to validate that clients are genuine. The reason Anubis does this was inspired by [Hashcash](https://en.wikipedia.org/wiki/Hashcash), a suggestion from the early 2000s about extending the email protocol to avoid spam. The idea is that genuine people sending emails will have to do a small math problem that is expensive to compute but easy to verify, such as hashing a string with a given number of leading zeroes. This will have basically no impact on individuals sending a few emails a week, but the company churning out industrial quantities of advertising will be required to do prohibitively expensive computation. This is also how Bitcoin's consensus algorithm works.
## How Anubis' proof of work scheme works
@@ -21,16 +21,3 @@ const hash = await sha256(`${challenge}${nonce}`);
In order to pass a challenge, the `hash` has to have the right number of leading zeroes (the "difficulty"). When a client requests to pass the challenge, they include the nonce they used. The server then only has to do one sha256 operation: the one that confirms that hashing the challenge together with the client-provided nonce produces the required number of leading zeroes.
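To make the loop concrete, here is a sketch of the client's side of the exchange, reusing the `sha256` helper from the snippet above and assuming it returns a lowercase hex string:
```javascript
// Search for a nonce whose hash has `difficulty` leading zeroes.
// `challenge` and `difficulty` are provided by the server.
let nonce = 0;
let hash = await sha256(`${challenge}${nonce}`);
while (!hash.startsWith("0".repeat(difficulty))) {
  nonce++;
  hash = await sha256(`${challenge}${nonce}`);
}
// Submit `nonce`; the server redoes exactly one sha256 to verify it.
```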
Ultimately, this is a hack whose real purpose is to give a "good enough" placeholder solution so that more time can be spent on fingerprinting and identifying headless browsers (e.g. via how they do font rendering) so that the proof-of-work challenge page doesn't need to be presented to known legitimate users.
-## Challenge format
-Anubis generates challenges based on browser metadata, including but not limited to the following:
-- The contents of your [`Accept-Language` header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Accept-Language)
-- The IP address of your client
-- Your browser's [`User-Agent` string](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/User-Agent)
-- The date of the current week, rooted on Sundays
-- Anubis' ed25519 public signing key for [JSON web tokens](https://jwt.io/) (JWTs)
-- The challenge difficulty
-This is intended to be a random value that is difficult for attackers to forge and guess, but also deterministic enough that it will naturally reset itself.


@@ -4,7 +4,7 @@
"private": true,
"scripts": {
"docusaurus": "docusaurus",
"start": "docusaurus start",
"start": "docusaurus start --host 0.0.0.0",
"build": "docusaurus build",
"swizzle": "docusaurus swizzle",
"deploy": "echo 'use CI' && exit 1",
@@ -45,4 +45,4 @@
"engines": {
"node": ">=18.0"
}
-}
+}