feat(lib): use new challenge creation flow (#749)

* feat(decaymap): add Delete method

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(lib/challenge): refactor Validate to take ValidateInput

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib): implement store interface

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/store): add metapackage to import all store implementations

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(policy): import all store backends

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib): use new challenge creation flow

Previously Anubis constructed challenge strings from request metadata.
This was a good idea in spirit, but has turned out to be a very bad idea
in practice. This new flow reuses the Store facility to dynamically
create challenge values with completely random data.

This is a fairly big rewrite of how Anubis processes challenges. Right
now it defaults to using the in-memory storage backend, but on-disk
(boltdb) and valkey-based adaptors will come soon.
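
As a rough sketch of the idea (names here are illustrative, not the real lib/store API), a challenge is now minted from random bytes and remembered with a TTL:

```go
package challenge // illustrative package name

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"time"
)

// Store is a cut-down stand-in for the storage interface this PR adds.
type Store interface {
	Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
}

// New issues a challenge backed by completely random data instead of
// deriving it from request metadata.
func New(ctx context.Context, s Store, ttl time.Duration) (string, error) {
	buf := make([]byte, 32) // 256 bits of entropy
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	id := hex.EncodeToString(buf)
	// Remember the challenge so validation can look it up later.
	if err := s.Set(ctx, id, []byte("pending"), ttl); err != nil {
		return "", err
	}
	return id, nil
}
```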

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(decaymap): fix documentation typo

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(lib): fix SA4004

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib/store): make generic storage interface test adaptor

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(decaymap): invert locking process for Delete

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/store): add bbolt store implementation

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: go mod tidy

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(devcontainer): adapt to docker compose, add valkey service

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(lib): make challenges live for 30 minutes by default

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/store): implement valkey backend

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib/store/valkey): disable tests if not using docker

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib/policy/config): ensure valkey stores can be loaded

Signed-off-by: Xe Iaso <me@xeiaso.net>

* Update metadata

check-spelling run (pull_request) for Xe/store-interface

Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>

* chore(devcontainer): remove port forwards because VS Code handles that for you

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(default-config): add a nudge to the storage backends section of the docs

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(docs): listen on 0.0.0.0 for dev container support

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(policy): document storage backends

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update CHANGELOG and internal links

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(admin/policies): don't start a sentence with as

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: fixes found in review

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
Xe Iaso authored 2025-07-04 20:42:28 +00:00, committed by GitHub
parent 506d8817d5, commit dff2176beb
43 changed files with 1539 additions and 140 deletions


@@ -22,6 +22,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add option for custom cookie prefix ([#732](https://github.com/TecharoHQ/anubis/pull/732))
- Add translation for German language ([#741](https://github.com/TecharoHQ/anubis/pull/741))
- Remove the "Success" interstitial after a proof of work challenge is concluded.
- Anubis now has the concept of [storage backends](./admin/policies.mdx#storage-backends). These allow you to change how Anubis stores temporary data (in memory, on the disk, or in Valkey). If you run Anubis in an environment where you have a low amount of memory available for Anubis (eg: less than 64 megabytes), be sure to configure the [`bbolt`](./admin/policies.mdx#bbolt) storage backend.
- The challenge issuance and validation process has been rewritten from scratch. Instead of generating challenge strings from request metadata (under the assumption that the values being compared against are stable), Anubis now generates random data for each challenge. This data is stored in the active [storage backend](./admin/policies.mdx#storage-backends) for up to 30 minutes. Fixes [#564](https://github.com/TecharoHQ/anubis/issues/564), [#746](https://github.com/TecharoHQ/anubis/issues/746), and other similar instances of this issue.
- Add option for forcing a specific language ([#742](https://github.com/TecharoHQ/anubis/pull/742))
- Add translation for Turkish language ([#751](https://github.com/TecharoHQ/anubis/pull/751))
- Allow [Common Crawl](https://commoncrawl.org/) by default so scrapers have less incentive to scrape


@@ -237,6 +237,115 @@ remote_addresses:
Anubis has support for showing imprint / impressum information. This is defined in the `impressum` block of your configuration. See [Imprint / Impressum configuration](./configuration/impressum.mdx) for more information.
## Storage backends
Anubis needs to store temporary data in order to determine if a user is legitimate or not. Administrators should choose a storage backend based on their infrastructure needs. Each backend has its own advantages and disadvantages.
Anubis offers the following storage backends:
- [`memory`](#memory) -- A simple in-memory hashmap
- [`bbolt`](#bbolt) -- An on-disk key/value store backed by [bbolt](https://github.com/etcd-io/bbolt), an embedded key/value database for Go programs
- [`valkey`](#valkey) -- A remote in-memory key/value database backed by [Valkey](https://valkey.io/) (or another database compatible with the [RESP](https://redis.io/docs/latest/develop/reference/protocol-spec/) protocol)
If no storage backend is set in the policy file, Anubis will use the [`memory`](#memory) backend by default. This is equivalent to the following in the policy file:
```yaml
store:
  backend: memory
  parameters: {}
```
### `memory`
The memory backend is an in-memory cache. This backend works best if you run only a single instance of Anubis, or if the environment you run Anubis in has no mutable storage.
| Should I use this backend? | Yes/no |
| :------------------------------------------------------------ | :----- |
| Are you running only one instance of Anubis for this service? | ✅ Yes |
| Does your service get a lot of traffic? | 🚫 No |
| Do you want to store data persistently when Anubis restarts? | 🚫 No |
| Do you run Anubis without mutable filesystem storage? | ✅ Yes |
The biggest downside is that there is currently no limit on how much data can be stored in memory. This will be addressed at a later time.
#### Configuration
The memory backend does not require any configuration to use.
### `bbolt`
An on-disk storage layer powered by [bbolt](https://github.com/etcd-io/bbolt), a high-performance embedded key/value database used by containerd, etcd, Kubernetes, and NATS. This backend works best if you're running Anubis on a single host that serves a lot of traffic.
| Should I use this backend? | Yes/no |
| :------------------------------------------------------------ | :----- |
| Are you running only one instance of Anubis for this service? | ✅ Yes |
| Does your service get a lot of traffic? | ✅ Yes |
| Do you want to store data persistently when Anubis restarts? | ✅ Yes |
| Do you run Anubis without mutable filesystem storage? | 🚫 No |
When Anubis opens a bbolt database, it takes an exclusive lock on that database. While the lock is held, no other instance of Anubis (or any other tool) can read the database. If you run multiple instances of Anubis for different services, give each instance its own `bbolt` database file (see the sketch after the example below).
#### Configuration
The `bbolt` backend takes the following configuration options:
| Name | Type | Example | Description |
| :------- | :----- | :----------------- | :-------------------------------------------------------------------------------------------------------------------------------- |
| `bucket` | string | `anubis` | The bbolt bucket that Anubis should place all its data into. If this is not set, then Anubis will default to the bucket `anubis`. |
| `path` | path | `/data/anubis.bdb` | The filesystem path for the Anubis bbolt database. Anubis requires write access to the folder containing the bbolt database. |
Example:
If you have persistent storage mounted to `/data`, then your store configuration could look like this:
```yaml
store:
  backend: bbolt
  parameters:
    path: /data/anubis.bdb
```
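Because of the exclusive lock described above, each Anubis instance on a host needs its own database file. A sketch for a second service (the path is illustrative) could look like this:
```yaml
store:
  backend: bbolt
  parameters:
    path: /data/anubis-service-b.bdb
```
Pointing two instances at the same `path` will typically make the second instance block (or fail) when it tries to take the lock.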
### `valkey`
[Valkey](https://valkey.io/) is an in-memory key/value store that clients access over the network. This allows multiple instances of Anubis to share information and does not require each instance of Anubis to have persistent filesystem storage.
:::note
You can also use [Redis](http://redis.io/) with Anubis.
:::
This backend is ideal if you are running multiple instances of Anubis in a worker pool (eg: Kubernetes Deployments with a copy of Anubis in each Pod).
| Should I use this backend? | Yes/no |
| :------------------------------------------------------------ | :----- |
| Are you running only one instance of Anubis for this service? | 🚫 No |
| Does your service get a lot of traffic? | ✅ Yes |
| Do you want to store data persistently when Anubis restarts? | ✅ Yes |
| Do you run Anubis without mutable filesystem storage? | ✅ Yes |
| Do you have Redis or Valkey installed? | ✅ Yes |
#### Configuration
The `valkey` backend takes the following configuration options:
| Name | Type | Example | Description |
| :---- | :----- | :---------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------- |
| `url` | string | `redis://valkey:6379/0` | The URL for the instance of Redis or Valkey that Anubis should store data in. This is in the same format as `REDIS_URL` in many cloud providers. |
Example:
If you have an instance of Valkey running with the hostname `valkey.int.techaro.lol`, then your store configuration could look like this:
```yaml
store:
  backend: valkey
  parameters:
    url: "redis://valkey.int.techaro.lol:6379/0"
```
This would have the Valkey client connect to host `valkey.int.techaro.lol` on port `6379` with database `0` (the default database).
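If your Valkey or Redis instance requires authentication, the standard `REDIS_URL` format carries credentials inside the URL. Assuming Anubis hands the URL to its client library unchanged, a sketch with an illustrative password could look like this:
```yaml
store:
  backend: valkey
  parameters:
    url: "redis://:hunter2@valkey.int.techaro.lol:6379/0"
```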
## Risk calculation for downstream services
If your service needs it for risk calculation, Anubis exposes information about the rules that requests match via a few headers:


@@ -2,7 +2,7 @@
title: Why does Anubis use Proof-of-Work?
---
-Anubis uses a [proof of work](https://en.wikipedia.org/wiki/Proof_of_work) in order to validate that clients are genuine. The reason Anubis does this was inspired by [Hashcash](https://en.wikipedia.org/wiki/Hashcash), a suggestion from the early 2000's about extending the email protocol to avoid spam. The idea is that genuine people sending emails will have to do a small math problem that is expensive to compute, but easy to verify such as hashing a string with a given number of leading zeroes. This will have basically no impact on individuals sending a few emails a week, but the company churning out industrial quantities of advertising will be required to do prohibitively expensive computation. This is also how Bitcoin's consensus algorithm works.
+Anubis uses [proof of work](https://en.wikipedia.org/wiki/Proof_of_work) in order to validate that clients are genuine. The reason Anubis does this was inspired by [Hashcash](https://en.wikipedia.org/wiki/Hashcash), a suggestion from the early 2000s about extending the email protocol to avoid spam. The idea is that genuine people sending emails will have to do a small math problem that is expensive to compute but easy to verify, such as hashing a string with a given number of leading zeroes. This will have basically no impact on individuals sending a few emails a week, but the company churning out industrial quantities of advertising will be required to do prohibitively expensive computation. This is also how Bitcoin's consensus algorithm works.
## How Anubis' proof of work scheme works
@@ -21,16 +21,3 @@ const hash = await sha256(`${challenge}${nonce}`);
In order to pass a challenge, the `hash` has to have the right number of leading zeroes (the "difficulty"). When a client requests to pass the challenge, they include the nonce they used. The server then only has to do one sha256 operation: the one that confirms that hashing the challenge together with the client-provided nonce produces the required number of leading zeroes.
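To make the loop concrete, here is a sketch of the client's side of the exchange, reusing the `sha256` helper from the snippet above and assuming it returns a lowercase hex string:
```javascript
// Search for a nonce whose hash has `difficulty` leading zeroes.
// `challenge` and `difficulty` are provided by the server.
let nonce = 0;
let hash = await sha256(`${challenge}${nonce}`);
while (!hash.startsWith("0".repeat(difficulty))) {
  nonce++;
  hash = await sha256(`${challenge}${nonce}`);
}
// Submit `nonce`; the server redoes exactly one sha256 to verify it.
```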
Ultimately, this is a hack whose real purpose is to give a "good enough" placeholder solution so that more time can be spent on fingerprinting and identifying headless browsers (e.g. via how they do font rendering) so that the proof-of-work challenge page doesn't need to be presented to known legitimate users.
-## Challenge format
-Anubis generates challenges based on browser metadata, including but not limited to the following:
-- The contents of your [`Accept-Language` header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Accept-Language)
-- The IP address of your client
-- Your browser's [`User-Agent` string](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/User-Agent)
-- The date of the current week, rooted on Sundays
-- Anubis' ed25519 public signing key for [JSON web tokens](https://jwt.io/) (JWTs)
-- The challenge difficulty
-This is intended to be a random value that is difficult for attackers to forge and guess, but also deterministic enough that it will naturally reset itself.


@@ -4,7 +4,7 @@
"private": true,
"scripts": {
"docusaurus": "docusaurus",
"start": "docusaurus start",
"start": "docusaurus start --host 0.0.0.0",
"build": "docusaurus build",
"swizzle": "docusaurus swizzle",
"deploy": "echo 'use CI' && exit 1",
@@ -45,4 +45,4 @@
"engines": {
"node": ">=18.0"
}
-}
+}