Perplexity has some proper documentation available for their crawlers, with published IP addresses: https://docs.perplexity.ai/guides/bots.

Signed-off-by: Timon de Groot <timon.degroot@team.blue>
9 lines
497 B
YAML
# Blocks all AI/LLM bots used for training or unknown/undocumented purposes.
# Permits user agents with explicitly documented non-training use, and published IP allowlists.
- import: (data)/bots/ai-catchall.yaml
- import: (data)/crawlers/ai-training.yaml
- import: (data)/crawlers/openai-searchbot.yaml
- import: (data)/crawlers/perplexitybot.yaml
- import: (data)/clients/openai-chatgpt-user.yaml
- import: (data)/clients/mistral-mistralai-user.yaml
- import: (data)/clients/perplexity-user.yaml