nuke/data/bots/aggressive-brazilian-scrapers.yaml
Xe Iaso 865d513e35
feat(checker): add CEL for matching complicated expressions (#421)
* feat(lib/policy): add support for CEL checkers

This adds the ability for administrators to use Common Expression
Language[0] (CEL) for more advanced check logic than Anubis previously
offered.

These can be as simple as:

```yaml
- name: allow-api-routes
  action: ALLOW
  expression:
    and:
    - '!(method == "HEAD" || method == "GET")'
    - path.startsWith("/api/")
```

or get as complicated as:

```yaml
- name: allow-git-clients
  action: ALLOW
  expression:
    and:
    - userAgent.startsWith("git/") || userAgent.contains("libgit") || userAgent.startsWith("go-git") || userAgent.startsWith("JGit/") || userAgent.startsWith("JGit-")
    - >
      "Git-Protocol" in headers && headers["Git-Protocol"] == "version=2"
```

Internally these are compiled and evaluated with cel-go[1]. This also
leaves room for extensibility should that be desired in the future. This
will intersect with #338 and eventually intersect with TLS fingerprints
as in #337.

[0]: https://cel.dev/
[1]: https://github.com/google/cel-go

Signed-off-by: Xe Iaso <me@xeiaso.net>

* feat(data/apps): add API route allow rule for non-HEAD/GET

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: document expression syntax

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix: fixes in review

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-03 14:26:54 -04:00

28 lines
No EOL
965 B
YAML

- name: deny-aggressive-brazilian-scrapers
action: DENY
expression:
any:
# Internet Explorer should be out of support
- userAgent.contains("MSIE")
# Trident is the Internet Explorer browser engine
- userAgent.contains("Trident")
# Opera is a fork of chrome now
- userAgent.contains("Presto")
# Windows CE is discontinued
- userAgent.contains("Windows CE")
# Windows 95 is discontinued
- userAgent.contains("Windows 95")
# Windows 98 is discontinued
- userAgent.contains("Windows 98")
# Windows 9.x is discontinued
- userAgent.contains("Win 9x")
# Amazon does not have an Alexa Toolbar.
- userAgent.contains("Alexa Toolbar")
- name: challenge-aggressive-brazilian-scrapers
action: CHALLENGE
expression:
any:
# This is not released, even Windows 11 calls itself Windows 10
- userAgent.contains("Windows NT 11.0")
# iPods are not in common use
- userAgent.contains("iPod")