feat(expressions): add segments function to break path into segments (#916)

Signed-off-by: Xe Iaso <me@xeiaso.net>
This commit is contained in:
Xe Iaso 2025-07-25 16:21:08 -04:00 committed by GitHub
parent bf42014ac3
commit a735770c93
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
4 changed files with 266 additions and 80 deletions

View file

@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
<!-- This changes the project to: -->
- The [Thoth client](https://anubis.techaro.lol/docs/admin/thoth) is now public in the repo instead of being an internal package.
- The [`segments`](./admin/configuration/expressions.mdx#segments) function was added for splitting a path into its slash-separated segments.
## v1.21.3: Minfilia Warde - Echo 3

View file

@ -232,6 +232,39 @@ This is best applied when doing explicit block rules, eg:
It seems counter-intuitive to allow known bad clients through sometimes, but this allows you to confuse attackers by making Anubis' behavior random. Adjust the thresholds and numbers as facts and circumstances demand.
### `segments`
Available in `bot` expressions.
```ts
function segments(path: string): string[];
```
`segments` returns the number of slash-separated path segments, ignoring the leading slash. Here is what it will return with some common paths:
| Input | Output |
| :----------------------- | :--------------------- |
| `segments("/")` | `[""]` |
| `segments("/foo/bar")` | `["foo", "bar"] ` |
| `segments("/users/xe/")` | `["users", "xe", ""] ` |
:::note
If the path ends with a `/`, then the last element of the result will be an empty string. This is because `/users/xe` and `/users/xe/` are semantically different paths.
:::
This is useful if you want to write rules that allow requests that have no query parameters only if they have less than two path segments:
```yaml
- name: two-path-segments-no-query
action: ALLOW
expression:
all:
- size(query) == 0
- size(segments(path)) < 2
```
## Life advice
Expressions are very powerful. This is a benefit and a burden. If you are not careful with your expression targeting, you will be liable to get yourself into trouble. If you are at all in doubt, throw a `CHALLENGE` over a `DENY`. Legitimate users can easily work around a `CHALLENGE` result with a [proof of work challenge](../../design/why-proof-of-work.mdx). Bots are less likely to be able to do this.