feat: implement a client for Thoth, the IP reputation database for Anubis (#637)
* feat(internal): add Thoth client and simple ASN checker
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(thoth): cached ip to asn checker
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: go mod tidy
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(thoth): minor testing fixups, ensure ASNChecker is Checker
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(thoth): make ASNChecker instances
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(thoth): add GeoIP checker
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(thoth): store a thoth client in a context
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: refactor Checker type to its own package
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(thoth): add thoth mocking package, ignore context deadline exceeded errors
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(thoth): pre-cache private ranges
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(lib/policy/config): enable thoth ASNs and GeoIP checker parsing
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(thoth): refactor to move checker creation to the checker files
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(policy): enable thoth checks
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(thothmock): test helper function for loading a mock thoth instance
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat: wire up Thoth, make thoth checks part of the default config
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(thoth): mend staticcheck errors
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(admin): add Thoth docs
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(policy): update Thoth links in error messages
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: update CHANGELOG
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(docs/manifest): enable Thoth
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: add THOTH_INSECURE for contacting Thoth over plain TCP in extreme circumstances
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(thoth): use mock thoth when credentials aren't detected in the environment
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(cmd/anubis): better warnings for half-configured Thoth setups
  Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(botpolicies): link to Thoth geoip docs
  Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
parent 823d1be5d1
commit e3826df3ab
39 changed files with 1101 additions and 82 deletions
@ -23,6 +23,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Optimized the OGTags subsystem with reduced allocations and runtime per request by up to 66%
- Add `--strip-base-prefix` flag/envvar to strip the base prefix from request paths when forwarding to target servers
- Add `robots2policy` CLI utility to convert robots.txt files to Anubis challenge policies using CEL expressions ([#409](https://github.com/TecharoHQ/anubis/issues/409))
- Implement GeoIP and ASN based checks via [Thoth](https://anubis.techaro.lol/docs/admin/thoth) ([#206](https://github.com/TecharoHQ/anubis/issues/206))

## v1.19.1: Jenomis cen Lexentale - Echo 1

81
docs/docs/admin/thoth.mdx
Normal file
@ -0,0 +1,81 @@
# Thoth-based advanced checks

Status: Beta

Anubis instances are normally isolated. Each Anubis instance has its own configuration and exists in roughly its own world without any long term memory between requests. As threats, workarounds, and AI scraper toolchains evolve, administrators will need a way to get more up to date information faster than Anubis' release cycle.

Thus, Thoth is being created. Thoth is the reputation database for Anubis. Thoth feeds information to Anubis so that it can make better decisions about which traffic is innocuous and which traffic is suspicious.

:::note

Thoth is hosted by [Techaro](https://techaro.lol). Thoth is a paid service. Thoth is opt-in and requires manual intervention (including payment) to use. The code that powers Thoth is currently closed source.

To get access to Thoth, please subscribe [on GitHub Sponsors](https://github.com/sponsors/Xe) and [email Xe](mailto:xe@techaro.lol). This will be self-service soon.

:::

## Implementation

Thoth is a web service that listens over [gRPC](https://grpc.io/). Thoth's API is documented in protocol buffer definitions in the GitHub repo [TecharoHQ/thoth-proto](https://github.com/TecharoHQ/thoth-proto).

Thoth is designed to be _informative_, not _authoritative_. Thoth cannot and will not arbitrarily block requests, origins, or other traffic. Thoth is there to inform Anubis and influence the weight of requests so that upstream resources can be protected. Additionally, Anubis aggressively caches data from Thoth such that over time Anubis will not need to request data very often. This makes the fast path for repeat visitors even faster and reduces the amount of data that Thoth is exposed to.

## Thoth features

Thoth is currently in active development. Currently, Thoth provides the following features to Anubis:

- BGP Autonomous System Number (ASN) based filtering
- GeoIP location based filtering

### ASN-based filtering

When companies link their backbone infrastructure to the Internet, they do so via a [BGP Autonomous System](<https://en.wikipedia.org/wiki/Autonomous_system_(Internet)>), denoted by a number (the Autonomous System Number or ASN). Every publicly routed IP address on the Internet belongs to a prefix announced by an autonomous system, and that mapping does not change very frequently.

Anubis uses Thoth to match IP addresses to BGP Autonomous Systems so that you can either issue arbitrary challenges to individual internet service providers (such as Cloudflare or Huawei Cloud) or, at the administrator's explicit instruction, block them altogether. For example, here's how you add 10 weight points to requests from Cloudflare, Huawei Cloud, and Alibaba Cloud:

```yaml
- name: aggressive-asns-without-functional-abuse-contact
  action: WEIGH
  asns:
    match:
      - 13335 # Cloudflare
      - 136907 # Huawei Cloud
      - 45102 # Alibaba Cloud
  weight:
    adjust: 10
```

You can look up details for [AS13335](https://bgp.tools/as/13335) or any of these other top offenders on [bgp.tools](https://bgp.tools).
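
The text above also mentions blocking a network outright at the administrator's explicit instruction. A minimal sketch of such a rule, assuming the `asns` matcher combines with Anubis' `DENY` action the same way it combines with `WEIGH`. The rule name is hypothetical and AS64496 is a number reserved for documentation, so substitute the ASN you actually intend to block:

```yaml
# Hypothetical rule: the name and ASN are placeholders, not recommendations.
- name: deny-example-asn
  action: DENY
  asns:
    match:
      - 64496 # reserved for documentation use (RFC 5398)
```

A `WEIGH` rule is likely the gentler starting point, since it only influences how suspicious a request looks rather than rejecting it.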

### GeoIP-based filtering

In extreme cases, an administrator may have to take action against an entire country. This is not an ideal circumstance, but sometimes reality forces their hands and the administrators just want to sleep at night.

Anubis uses Thoth to look up the geographic location registered to an IP address. This lookup is not the best and will get better with time, but you ship what you can so you can make it better for next time.

For example, to add 10 weight points to requests from Brazil and China:

```yaml
- name: countries-with-aggressive-scrapers
  action: WEIGH
  geoip:
    countries:
      - BR
      - CN
  weight:
    adjust: 10
```

Use this with care.
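
Both kinds of rules are entries in the same list of bot rules. As a rough sketch, assuming the rules live under the top-level `bots` key of the Anubis policy file and that the `geoip` matcher's key is spelled `countries`. The rule names are hypothetical and the values are reused from the examples above:

```yaml
bots:
  # Hypothetical rule name; adds weight to requests from one network.
  - name: example-asn-weight
    action: WEIGH
    asns:
      match:
        - 13335 # Cloudflare
    weight:
      adjust: 10
  # Hypothetical rule name; adds weight to requests that geolocate to one country.
  - name: example-country-weight
    action: WEIGH
    geoip:
      countries:
        - BR
    weight:
      adjust: 10
```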

## Work-in-progress features

This section is a bit aspirational: it describes where Thoth will end up rather than features you can use today.

In general, a lot of Thoth features are focused on taking the same Anubis you know and love and making it better, smarter, and less paranoid. These include:

- Private rulesets for advanced patterns, current known exploits, and other recognition tactics that need to be kept cloak and dagger for operational security reasons
- Private challenge implementations via WebAssembly, including advanced browser detection logic
- Reputation querying so that Thoth can arbitrarily influence the weight of requests based on the net aggregate pass rate so that the most common browsers can get through with no challenge issued at all
- APIs for trusted administrators to report abusive request fingerprints so that Anubis can react to threats as they evolve
- A way for Anubis to periodically report the pass rate per ASN and other fingerprints so that methodology can be improved
6
docs/manifest/1password.yaml
Normal file
@ -0,0 +1,6 @@
apiVersion: onepassword.com/v1
kind: OnePasswordItem
metadata:
  name: anubis-docs-thoth
spec:
  itemPath: "vaults/lc5zo4zjz3if3mkeuhufjmgmui/items/pwguumqcmtxvqbeb7y4gj7l36i"
@ -68,3 +68,6 @@ spec:
      - ALL
  seccompProfile:
    type: RuntimeDefault
envFrom:
  - secretRef:
      name: anubis-docs-thoth
@ -1,7 +1,9 @@
resources:
  - 1password.yaml
  - deployment.yaml
  - ingress.yaml
  - onionservice.yaml
  - poddisruptionbudget.yaml
  - service.yaml

configMapGenerator:
9
docs/manifest/poddisruptionbudget.yaml
Normal file
@ -0,0 +1,9 @@
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: anubis-docs
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: anubis-docs