Jason Cameron e0781e4560
feat: add robots2policy CLI to convert robots.txt to Anubis CEL (#657)
* feat: add robots2policy CLI utility to convert robots.txt to Anubis challenge policies

* feat: add documentation for robots2policy CLI tool

* feat: implement crawl delay handling as weight adjustment in Anubis rules

* feat: add various robots.txt and YAML configurations for user agent handling and crawl delays

* test: add comprehensive tests for robots2policy conversion and parsing

* fix: update example URL in usage instructions for robots2policy CLI

* Update metadata

check-spelling run (pull_request) for json/robots2policycli

Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>

* docs: add crawl delay weight adjustment and deny user agents option to robots2policy CLI

* Update cmd/robots2policy/main.go

Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>

* Update cmd/robots2policy/main.go

Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>

* fix(robots2policy): use sigs.k8s.io/yaml

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(config): properly marshal bot policy rules

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(yeetfile): expose robots2policy in libexec

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(yeetfile): put robots2policy in $PATH

Signed-off-by: Xe Iaso <me@xeiaso.net>

* style: reorder imports

* refactor: use preexisting structs in config

* fix: correct flag check in main function

* fix: reorder fields in AnubisRule struct for better alignment

* style: improve alignment of struct fields in AnubisRule and OGTagCache

* fix: add validation for generated Anubis rules from robots.txt

* feat: add batch processing for robots.txt files to generate Anubis CEL policies

* fix: improve usage message and error handling for input file requirement

* refactor: update AnubisRule structure to use ExpressionOrList for improved expression handling

* refactor: reorganize policy definitions in YAML files for consistency and clarity

* fix: correct indentation in blacklist and complex YAML files for consistency

* test: enhance output comparison in robots2policy tests for YAML and JSON formats

* Revert "fix: improve usage message and error handling for input file requirement"

This reverts commit ddcde1f2a326545d3ef2ec32e5e03f55f4f931a8.

* fix: improve usage message and error handling in robots2policy

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

---------

Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-06-14 23:41:00 -04:00
.github feat: add robots2policy CLI to convert robots.txt to Anubis CEL (#657) 2025-06-14 23:41:00 -04:00
.vscode feat(checker): add CEL for matching complicated expressions (#421) 2025-05-03 14:26:54 -04:00
cmd feat: add robots2policy CLI to convert robots.txt to Anubis CEL (#657) 2025-06-14 23:41:00 -04:00
data feat(lib): implement request weight (#621) 2025-06-09 15:25:04 -04:00
decaymap Add periodic cleanup job for DecayMap (#8) (#158) 2025-03-29 23:24:06 -04:00
docs feat: add robots2policy CLI to convert robots.txt to Anubis CEL (#657) 2025-06-14 23:41:00 -04:00
internal feat: add robots2policy CLI to convert robots.txt to Anubis CEL (#657) 2025-06-14 23:41:00 -04:00
lib feat: add robots2policy CLI to convert robots.txt to Anubis CEL (#657) 2025-06-14 23:41:00 -04:00
run Create Anubis OpenRC init.d script (#561) 2025-05-27 01:58:59 +00:00
test test: introduce SSH based CI for non-native test hosts (#644) 2025-06-11 12:50:01 -04:00
var initial import from /x/ monorepo 2025-03-17 19:33:07 -04:00
web Make progress bar styling more compatible (UXP, etc) (#636) 2025-06-09 12:19:38 -04:00
xess chore: go generate 2025-06-08 20:52:22 -04:00
.air.toml feat: add a strip-base-prefix option (#655) 2025-06-12 17:46:08 -04:00
.gitattributes fix(gitattributes): update pattern for generated files (#652) 2025-06-11 21:00:37 +00:00
.gitignore docs: fix edit me links and configuration subcategory (#238) 2025-04-07 17:28:29 -04:00
.ko.yaml Try using ko to build images 2025-03-19 09:10:29 -04:00
anubis.go feat(lib): ensure that clients store cookies (#501) 2025-05-16 13:03:40 -04:00
Brewfile all: do not commit generated JS/CSS to source control (#148) 2025-03-28 14:55:25 -04:00
go.mod feat: add robots2policy CLI to convert robots.txt to Anubis CEL (#657) 2025-06-14 23:41:00 -04:00
go.sum build(deps): bump github.com/cloudflare/circl from 1.6.0 to 1.6.1 (#650) 2025-06-11 16:52:10 +00:00
LICENSE initial import from /x/ monorepo 2025-03-17 19:33:07 -04:00
Makefile ci: add govulncheck (#456) 2025-05-06 14:07:55 +00:00
package-lock.json build(deps-dev): bump esbuild from 0.25.4 to 0.25.5 in the npm group (#600) 2025-06-01 23:38:45 -04:00
package.json build(deps-dev): bump esbuild from 0.25.4 to 0.25.5 in the npm group (#600) 2025-06-01 23:38:45 -04:00
README.md feat(lib/challenge): HTTP meta refresh challenge method (#623) 2025-06-06 21:18:55 -04:00
VERSION v1.19.1 2025-06-01 17:17:24 -04:00
yeetfile.js feat: add robots2policy CLI to convert robots.txt to Anubis CEL (#657) 2025-06-14 23:41:00 -04:00

Anubis

A smiling chibi dark-skinned anthro jackal with brown hair and tall ears looking victorious with a thumbs-up

Sponsors

Anubis is brought to you by sponsors and donors like:

Diamond Tier

Raptor Computing Systems

Gold Tier

Distrust Terminal Trove canine.tools Weblate Uberspace Wildbase

Overview

Anubis is a Web AI Firewall Utility that weighs the soul of your connection using one or more challenges in order to protect upstream resources from scraper bots.

This program is designed to help protect the small internet from the endless storm of requests that flood in from AI companies. Anubis is as lightweight as possible to ensure that everyone can afford to protect the communities closest to them.

Anubis is a bit of a nuclear response. It will block smaller scrapers from your website and may inhibit "good bots" like the Internet Archive. You can configure bot policy definitions to explicitly allowlist them, and we are working on a curated set of "known good" bots to allow for a compromise between discoverability and uptime.
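
As a rough illustration of allowlisting a crawler in a bot policy definition, a policy fragment might look like the sketch below. The rule names and regular expressions are illustrative assumptions, not values shipped with Anubis, and the sketch assumes rules are evaluated from top to bottom. Check the policy documentation for the exact schema before relying on it.

```yaml
# Hypothetical policy fragment: rule names and regexes are assumptions for illustration.
bots:
  # Let requests identifying as the Internet Archive crawler through without a challenge.
  - name: internet-archive-example
    user_agent_regex: archive\.org_bot
    action: ALLOW
  # Challenge anything else that presents a browser-like User-Agent.
  - name: generic-browser-example
    user_agent_regex: Mozilla
    action: CHALLENGE
```

A User-Agent header can be spoofed, so a rule like this trades some protection for discoverability.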

In most cases, you should not need this and can probably get by using Cloudflare to protect a given origin. However, for circumstances where you can't or won't use Cloudflare, Anubis is there for you.

If you want to try this out, connect to anubis.techaro.lol.

Support

If you run into any issues running Anubis, please open an issue. Please include all the information I would need to diagnose your issue.

For live chat, please join the Patreon and ask in the #anubis channel of the Patron Discord.

Star History

Star History Chart

Packaging Status

Packaging status

Contributors

Made with contrib.rocks.