Commit graph

71 commits

Author SHA1 Message Date
02b9aebbe5 NUKE
Some checks failed
Docker image builds / build (push) Failing after 4m22s
2026-02-07 14:27:38 +02:00
Timon de Groot
57c0b2b22c
Add IP mapped Perplexity user agents (#1393)
Perplexity has some proper documentation available for their crawlers,
with published IP addresses: https://docs.perplexity.ai/guides/bots.

Signed-off-by: Timon de Groot <timon.degroot@team.blue>
2026-01-15 19:57:31 -05:00
Xe Iaso
6fc2c3c857
docs: document how to import the default config (#1392)
Signed-off-by: Xe Iaso <me@xeiaso.net>
2026-01-08 16:14:52 +00:00
lif
cee7871ef8
fix: update SSL Labs IP addresses (#1377)
Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: Jason Cameron <jason.cameron@stanwith.me>
2026-01-01 23:21:31 -05:00
Xe Iaso
a37068a423
fix(default-config): remove browser detection logic (#1360)
Looks like these rules don't work anymore.

Closes: #1353

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-12-24 02:13:54 +00:00
Xe Iaso
9d9be61c24
fix(default-config): must-accept-rule on browsers only (#1350)
TIL docker clients don't include the Accept header all the time. I would
have thought they did that. Oops.

Closes: #1346

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-12-19 20:42:24 +00:00
The Ninth
00fa939acf
Implement FCrDNS and other DNS features (#1308)
* Implement FCrDNS and other DNS features

* Redesign DNS cache and methods

* Fix DNS cache

* Rename regexSafe arg

* Alter verifyFCrDNS(addr) behaviour

* Remove unused dnsCache field from Server struct

* Upd expressions docs

* Update docs/docs/CHANGELOG.md

Signed-off-by: Xe Iaso <me@xeiaso.net>

* refactor(dns): simplify FCrDNS logging

* docs: clarify verifyFCrDNS behavior

Add a note to the documentation for `verifyFCrDNS` to clarify that it returns true when no PTR records are found for the given IP address.

* fix(dns): Improve FCrDNS error handling and tests

The `VerifyFCrDNS` function previously ignored errors returned from reverse DNS lookups. This could lead to incorrect passes when a DNS failure (other than a simple 'not found') occurred. This change ensures that any error from a reverse lookup will cause the FCrDNS check to fail.

The test suite for FCrDNS has been updated to reflect this change. The mock DNS lookups now simulate both 'not found' errors and other generic DNS errors. The test cases have been updated to ensure that the function behaves correctly in both scenarios, resolving a situation where two test cases were effectively duplicates.

* docs: Update FCrDNS documentation and spelling

Corrected a typo in the `verifyFCrDNS` function documentation.

Additionally, updated the spelling exception list to include new terms and remove redundant entries.

* chore: update spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-11-26 22:24:45 -05:00
Xe Iaso
4ead3ed16e
fix(config): deprecate the report_as field for challenges (#1311)
* fix(config): deprecate the report_as field for challenges

This was a bad idea when it was added and it is irresponsible to
continue to have it. It causes more UX problems than it fixes with
slight of hand.

Closes: #1310
Closes: #1307
Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(policy): use the new logger for config validation messages

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(admin/thresholds): remove this report_as setting

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-11-25 23:25:17 -05:00
Josh Deprez
316905bf1d
Add Renovate to Docker clients (#1267)
Renovate-bot looks at the container APIs directly to learn about new image versions and digests. The [default User-Agent](https://docs.renovatebot.com/self-hosted-configuration/#useragent) is `Renovate/${renovateVersion} (https://github.com/renovatebot/renovate)`
2025-11-12 03:22:00 +00:00
Xe Iaso
49c9333359
fix(data): add services folder to embedded filesystem (#1259)
* fix(data): add services folder to embedded filesystem

Also includes a regression test to ensure this does not happen again.

Assisted-By: GLM 4.6 via Claude Code

* docs: update CHANGELOG

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-11-08 18:08:48 +00:00
Xe Iaso
c7e4cd1032
fix(data/docker-client): allow some more OCI clients through (#1258)
* fix(data/docker-client): allow some more OCI clients through

Signed-off-by: Xe Iaso <me@xeiaso.net>

* Update metadata

check-spelling run (pull_request) for Xe/more-docker-client-programs

Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>

* fix(data/docker-client): add containerd

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
2025-11-08 17:50:56 +00:00
Xe Iaso
b5ead0a68c
fix(data): add ruleset to explicitly allow Docker / OCI clients (#1253)
* fix(data): add ruleset to explicitly allow Docker / OCI clients

Fixes #1252

This is technically a regression as these clients used to work in Anubis
v1.22.0, however it is allowable to make this opt-in as most websites do not
expect to be serving Docker / OCI registry client traffic.

Signed-off-by: Xe Iaso <me@xeiaso.net>

* Update metadata

check-spelling run (pull_request) for Xe/gh-1252/docker-registry-client-fix

Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>

* test(docker-registry): export the right envvars

Signed-off-by: Xe Iaso <me@xeiaso.net>

* ci: add simdjson dependency for homebrew node

Signed-off-by: Xe Iaso <me@xeiaso.net>

* ci: install go/node without homebrew

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test: use right github commit variable

Signed-off-by: Xe Iaso <me@xeiaso.net>

* ci: remove simdjson dependency

Signed-off-by: Xe Iaso <me@xeiaso.net>

* ci: install ko with an action

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: add OCI registry caveat docs

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
2025-11-08 00:17:25 +00:00
Xe Iaso
531e1dd7f4
chore(default-config): remove Tencent Cloud block rule (#1227)
Tencent Cloud's abuse team reached out to me recently and asked for this
rule to be removed. Prior attempts to reach out to them to report
abusive traffic have failed, thus leading to this IP space block as a
last resort to try and maintain uptime for systems administrators.

Unfortunately, it's difficult for Tencent's abuse team to take action if
there is a blanket block like this. Let's see if this doesn't cause too
much grief.
2025-10-31 11:20:04 -04:00
Xe Iaso
c96c229b68
feat(default-config): block tencent cloud by default (#1216)
* feat(default-config): block tencent cloud by default

This is what happens when you don't have an abuse contact.

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: update spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-10-24 19:43:42 +00:00
Xe Iaso
00261d049e
fix(default-config): sometimes browsers don't send Upgrade-Insecure-Requests (#1189)
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-10-13 18:31:14 +00:00
Xe Iaso
ffbbdce3da
feat: default config macro (#1186)
* feat(data): add default-config macro

Closes #1152

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update CHANGELOG

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test: add default-config-macro smoke test

This uses an AI generated python script to diff the contents of the bots
field of the default configuration file and the
data/meta/default-config.yaml file. It emits a patch showing what needs
to be changed.

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-10-13 11:33:16 -04:00
Xe Iaso
c09c86778d
fix(default-config): remove preact challenge (#1184)
* fix(default-config): remove the preact challenge from the default config

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update CHANGELOG

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-10-11 09:22:07 -04:00
Xe Iaso
9c47c180d0
fix(default-config): make the default config far less paranoid (#1179)
* test: add httpdebug tool

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(data/clients/git): more strictly match the git client

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(default-config): make the default config far less paranoid

This uses a variety of heuristics to make sure that clients that claim
to be browsers are more likely to behave like browsers. Most of these
are based on the results of a lot of reverse engineering and data
collection from honeypot servers.

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update CHANGELOG

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
2025-10-11 08:48:12 -04:00
Xe Iaso
0e0847cbeb
feat: add 'proof of React' challenge (#1038)
* feat: add 'proof of React' challenge

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(challenge/preact): use JSX fragments

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(challenge/preact): ensure that the client waits as long as it needs to

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: fix spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(challenges/xeact): add noscript warning

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(challenges/xeact): add default loading message

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(challenges/xeact): make a UI render without JS

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(challenges/xeact): use %s here, not %w

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(test/healthcheck): run asset build

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(challenge/preact): fix build in ci

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
2025-08-29 16:09:27 -04:00
Xe Iaso
b0fa256e3e
fix(default-config): also block alibaba cloud (#1005)
* fix(default-config): also block alibaba cloud

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-08-20 23:01:49 +00:00
Xe Iaso
ee55d857eb
fix(default-config): block Huawei Cloud (#1004)
* fix(default-config): block Huawei Cloud

Closes #978

Huawei Cloud has been egregious about its scraping. All attempts to
contact their abuse team have failed. If you work for Huawei Cloud,
please raise this issue internally and get the scraping to just stop.

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-08-20 22:40:07 +00:00
Dryusdan
237a6a98e2
Bump ai.robots.txt to v1.39 (#982) 2025-08-18 06:52:23 -04:00
Elliot Speck
87651f9506
default pattern fixes (#963)
* feat(checker): allow png/gif/jpg/jpeg/svg favicons as well as ico

* changelog: add updates to keep-internet-working.yaml

* fix(checker): tighten default regex patterns for well-known files

* changelog: add updates to regular expression patterns in keep-internet-working.yaml

---------

Signed-off-by: Elliot Speck <11192354+arcayr@users.noreply.github.com>
2025-08-09 07:40:33 -04:00
Elliot Speck
100005ce70
feat(checker): allow png/gif/jpg/jpeg/svg favicons as well as ico (#961)
* feat(checker): allow png/gif/jpg/jpeg/svg favicons as well as ico

* changelog: add updates to keep-internet-working.yaml

---------

Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
Co-authored-by: Xe Iaso <xe.iaso@techaro.lol>
2025-08-08 16:53:23 +00:00
Xe Iaso
7c80c23e90
docs: remove JSON examples from policy file docs (#945)
* docs: remove JSON examples from policy file docs

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(lib): remove mentions of botPolicies.json in the tests

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update link to challenge methods

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: unbreak links to the challenges category

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-08-03 18:09:26 +00:00
Xe Iaso
bca2e87e80
feat(default-rules): add weight to Custom-AsyncHttpClient (#914)
Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
2025-07-27 00:41:43 +00:00
Marcel Bischoff
5c4d8480e6
Add services folder, add Uptime Robot policy definition (#838)
Uptime Robot is a commonly used service for tracking service
interruptions. Additional policy definitions may be beneficial for
services that do publish their IP addresses in use. The list is
additionally aggregated to slightly shorten it.

Signed-off-by: Marcel Bischoff <marcel@herrbischoff.com>
2025-07-16 09:17:48 -04:00
Xe Iaso
735b2ceb14
fix(default-config): disable system load check by default (#827)
This was causing issues with git clone against highly loaded servers. I
thought that this would be pretty innocuous, but I guess I was wrong.
Oops!

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-07-14 13:06:56 +00:00
Xe Iaso
4ea0add50d
feat(lib/policy/expressions): add system load average to bot expression inputs (#766)
* feat(lib/policy/expressions): add system load average to bot expression inputs

This lets Anubis dynamically react to system load in order to
increase and decrease the required level of scrutiny. High load? More
scrutiny required. Low load? Less scrutiny required.

* docs: spell system correctly

Signed-off-by: Xe Iaso <me@xeiaso.net>

* Update metadata

check-spelling run (pull_request) for Xe/load-average

Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>

* fix(default-config): don't enable low load average feature by default

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
2025-07-06 20:13:50 +00:00
Xe Iaso
dff2176beb
feat(lib): use new challenge creation flow (#749)
* feat(decaymap): add Delete method

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(lib/challenge): refactor Validate to take ValidateInput

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib): implement store interface

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/store): all metapackage to import all store implementations

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(policy): import all store backends

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib): use new challenge creation flow

Previously Anubis constructed challenge strings from request metadata.
This was a good idea in spirit, but has turned out to be a very bad idea
in practice. This new flow reuses the Store facility to dynamically
create challenge values with completely random data.

This is a fairly big rewrite of how Anubis processes challenges. Right
now it defaults to using the in-memory storage backend, but on-disk
(boltdb) and valkey-based adaptors will come soon.

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(decaymap): fix documentation typo

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(lib): fix SA4004

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib/store): make generic storage interface test adaptor

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(decaymap): invert locking process for Delete

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/store): add bbolt store implementation

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: go mod tidy

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(devcontainer): adapt to docker compose, add valkey service

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(lib): make challenges live for 30 minutes by default

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/store): implement valkey backend

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib/store/valkey): disable tests if not using docker

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib/policy/config): ensure valkey stores can be loaded

Signed-off-by: Xe Iaso <me@xeiaso.net>

* Update metadata

check-spelling run (pull_request) for Xe/store-interface

Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>

* chore(devcontainer): remove port forwards because vs code handles that for you

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(default-config): add a nudge to the storage backends section of the docs

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(docs): listen on 0.0.0.0 for dev container support

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(policy): document storage backends

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update CHANGELOG and internal links

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(admin/policies): don't start a sentence with as

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: fixes found in review

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
2025-07-04 20:42:28 +00:00
Xe Iaso
7c0996448a
chore(default-config): allowlist common crawl (#753)
This may seem strange, but allowlisting common crawl means that scrapers
have less incentive to scrape because they can just grab the data from
common crawl instead of scraping it again.
2025-07-04 00:10:45 +00:00
Martin
6aa17532da
fix: Dynamic cookie domain not working (#731)
* Fix cookieDynamicDomain option not being set in Options struct

* Fix using wrong cookie name when using dynamic cookie domains

* Adjust testcases for new cookie option structs

* Add known words to expect.txt and change typo in Zombocom

* Cleanup expect.txt

* Add changes to changelog

* Bump versions of grpc and apimachinery

* Fix testcases and add additional condition for dynamic cookie domain
2025-06-29 15:38:55 -04:00
Xe Iaso
4c74934e9f
fix(default-config): Techaro -> Zombocom
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-22 20:04:40 -04:00
Xe Iaso
5870f7072c
feat: implement imprint/impressum support (#706)
* feat: implement imprint/impressum support

Closes #362

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(docs/anubis): enable an imprint

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: fix the end of the sentence, comment out a default impressum

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: link back to impressum page

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-22 18:09:37 -04:00
Xe Iaso
3c1d95d61e
fix(default-config): off-by-one error in the default thresholds (#701)
I don't know how I missed this in testing.
2025-06-20 11:47:34 -04:00
Xe Iaso
4948036f39
feat: add default OpenGraph tags to configuration file (#694)
* feat(config): opengraph passthrough configuration

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(ogtags): use config.OpenGraph for configuration

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: wire up ogtags config in most of the app

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(ogtags): return default tags if they are supplied

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: make OpenGraph legal so we have some sanity in reviewing

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(lib): use OpenGraph.Enabled

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib): load default config file if one is not specified in spawnAnubis

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(config): fix ST1005

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: document open graph defaults and its new home in the policy file

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(installation): point to weight threshold new home

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: rename default to override

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(default-config): add off-by-default opengraph settings to bot policy file

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(anubis): make build

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib): fix build

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-19 18:00:44 -04:00
Xe Iaso
226cf36bf7
feat(config): custom weight thresholds via CEL (#688)
* feat(config): add Thresholds to the top level config file

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(config): make String() on ExpressionOrList join the component expressions

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(config): ensure unparseable json fails

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(config): if no thresholds are set, use the default thresholds

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(policy): half implement thresholds

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(policy): continue wiring things up

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib): wire up thresholds

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib): handle behavior from legacy configurations

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: document thresholds

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update CHANGELOG, refer to threshold configuration

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(lib): fix build

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(lib): fix U1000

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Co-authored-by: Jason Cameron <git@jasoncameron.dev>
2025-06-18 16:58:31 -04:00
Dryusdan
1d5fa49eb0
Bump ai.robots.txt to v1.37 (#689)
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
2025-06-18 13:30:53 -04:00
hydrargyrum
244f1c505a
fix(geo): correct typo "counties" to "countries" (#678) 2025-06-17 23:50:42 -04:00
Xe Iaso
e3826df3ab
feat: implement a client for Thoth, the IP reputation database for Anubis (#637)
* feat(internal): add Thoth client and simple ASN checker

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(thoth): cached ip to asn checker

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: go mod tidy

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(thoth): minor testing fixups, ensure ASNChecker is Checker

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(thoth): make ASNChecker instances

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(thoth): add GeoIP checker

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(thoth): store a thoth client in a context

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: refactor Checker type to its own package

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(thoth): add thoth mocking package, ignore context deadline exceeded errors

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(thoth): pre-cache private ranges

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(lib/policy/config): enable thoth ASNs and GeoIP checker parsing

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(thoth): refactor to move checker creation to the checker files

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(policy): enable thoth checks

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(thothmock): test helper function for loading a mock thoth instance

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat: wire up Thoth, make thoth checks part of the default config

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(thoth): mend staticcheck errors

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(admin): add Thoth docs

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(policy): update Thoth links in error messages

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update CHANGELOG

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(docs/manifest): enable Thoth

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: add THOTH_INSECURE for contacting Thoth over plain TCP in extreme circumstances

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(thoth): use mock thoth when credentials aren't detected in the environment

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(cmd/anubis): better warnings for half-configured Thoth setups

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(botpolicies): link to Thoth geoip docs

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-16 11:57:32 -04:00
Xe Iaso
c638653172
feat(lib): implement request weight (#621)
* feat(lib): implement request weight

Replaces #608

This is a big one and will be what makes Anubis a generic web
application firewall. This introduces the WEIGH option, allowing
administrators to have facets of request metadata add or remove
"weight", or the level of suspicion. This really makes Anubis weigh
the soul of requests.

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(lib): maintain legacy challenge behavior

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(lib): make weight have dedicated checkers for the hashes

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(data): convert some rules over to weight points

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: document request weight

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(CHANGELOG): spelling error

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: fix links to challenge information

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(policies): fix formatting

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(config): make default weight adjustment 5

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-09 15:25:04 -04:00
Dryusdan
281b6c5c00
Bump ai.robots.txt to v1.34 (#632) 2025-06-08 14:54:47 -04:00
Xe Iaso
51c384eefd
fix(data/bots): bring back ai-robots-txt.yaml
Closes #599

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-01 17:15:00 -04:00
Corry Haines
de7dbfe6d6
Split up AI filtering files (#592)
* Split up AI filtering files

Create aggressive/moderate/permissive policies to allow administrators to choose their AI/LLM stance.

Aggressive policy matches existing default in Anubis.

Removes `Google-Extended` flag from `ai-robots-txt.yaml` as it doesn't exist in requests.

Rename `ai-robots-txt.yaml` to `ai-catchall.yaml` as the file is no longer a copy of the source repo/file.

* chore: spelling

* chore: fix embeds

* chore: fix data includes

* chore: fix file name typo

* chore: Ignore READMEs in configs

* chore(lib/policy/config): go tool goimports -w

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-06-01 20:21:18 +00:00
Corry Haines
0d9ebebff6
Opt-in policies for OpenAI and MistralAI bots (#590)
* Define OpenAI bot ALLOW policies

Allows OpenAI bots to be allowlisted at the choice of the Anubis administrator. None are enabled by default.

* Define MistralAI bot ALLOW policy

* chore: spelling
2025-05-31 16:48:57 -04:00
Corry Haines
68a71c6a99
Add Applebot definition (#589)
* Add Applebot definition

Adds Apple's search indexing bot, and allowlists it by default.

Allowlisted by default because it is equivalent to Googlebot/Bingbot. Remove Applebot from `ai-robots-txt.yaml` for the same reasons.

Remove `Applebot-Extended` from `ai-robots-txt.yaml` as it has no effect.

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-05-31 10:18:32 -04:00
Xe Iaso
cd8a7eb2e2
feat(data): add x-firefox-ai default challenge rule (#580)
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-28 21:08:39 +00:00
Xe Iaso
22c47f40d1
feat(expressions): add randInt function to allow making rules nondeterministic (#578)
This seems counter-intuitive at first glance, but let me cook.

One of the problems with Anubis is that the rule matching is super
deterministic. This means that attackers can figure out what patterns
they are hitting and change things to bypass them.

The randInt function lets you have rulesets behave nondeterministically.
This is a very easy way to hang yourself, but can be great to
psychologically mess with scraper operators. Consider this rule:

```yaml
- name: deny-lightpanda-sometimes
  action: DENY
  expression:
    all:
      - userAgent.matches("LightPanda")
      - randInt(16) >= 4
```

It would match about 75% of the time.

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-28 16:36:27 -04:00
Dryusdan
11081aac08
Bump AI-robots.txt rules to version 1.31 (#538)
* Bump AI-robots.txt rules to version 1.31

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-05-23 16:15:12 +00:00
Dryusdan
9e9982ab5d
feat(apps): Make SASL login work on bookstack with Anubis (#502)
* Make SASL login work on bookstack with Anubis

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-05-16 17:01:34 +00:00