## Anubis has the ability to let you import snippets of configuration into the main
## configuration file. This allows you to break up your config into smaller parts
## that get logically assembled into one big file.
##
## Of note, a bot rule can either have inline bot configuration or import a
## bot config snippet. You cannot do both in a single bot rule.
##
## Import paths can either be prefixed with (data) to import from the common/shared
## rules in the data folder in the Anubis source tree or will point to absolute/relative
## paths in your filesystem. If you don't have access to the Anubis source tree, check
## /usr/share/docs/anubis/data or in the tarball you extracted Anubis from.

bots:
  - import: (data)/crawlers/commoncrawl.yaml
  # Pathological bots to deny
  - # This correlates to data/bots/deny-pathological.yaml in the source tree
    # https://github.com/TecharoHQ/anubis/blob/main/data/bots/deny-pathological.yaml
    import: (data)/bots/_deny-pathological.yaml
  - import: (data)/bots/aggressive-brazilian-scrapers.yaml

  # Aggressively block AI/LLM related bots/agents by default
  - import: (data)/meta/ai-block-aggressive.yaml

  # Consider replacing the aggressive AI policy with more selective policies:
  # - import: (data)/meta/ai-block-moderate.yaml
  # - import: (data)/meta/ai-block-permissive.yaml

  # Search engine crawlers to allow, defaults to:
  #   - Google (so they don't try to bypass Anubis)
  #   - Apple
  #   - Bing
  #   - DuckDuckGo
  #   - Qwant
  #   - The Internet Archive
  #   - Kagi
  #   - Marginalia
  #   - Mojeek
  - import: (data)/crawlers/_allow-good.yaml
  # Challenge Firefox AI previews
  - import: (data)/clients/x-firefox-ai.yaml

  # Allow common "keeping the internet working" routes (well-known, favicon, robots.txt)
  - import: (data)/common/keep-internet-working.yaml

  # # Punish any bot with "bot" in the user-agent string
  # # This is known to have a high false-positive rate, use at your own risk
  # - name: generic-bot-catchall
  #   user_agent_regex: (?i:bot|crawler)
  #   action: CHALLENGE
  #   challenge:
  #     difficulty: 16 # impossible
  #     report_as: 4 # lie to the operator
  #     algorithm: slow # intentionally waste CPU cycles and time

  - name: rss-feed-blog
    action: ALLOW
    expression:
      any:
        - path.startsWith("/blog/atom.")
        - path.startsWith("/blog/rss.")

  # Generic catchall rule
  - name: generic-browser
    user_agent_regex: >-
      Mozilla|Opera
    action: CHALLENGE
    challenge:
      difficulty: 1 # Number of seconds to wait before refreshing the page
      report_as: 4 # Unused by this challenge method
      algorithm: metarefresh # Specify a non-JS challenge method

dnsbl: false

impressum:
  footer: |
    This website is hosted by Techaro. If you have any complaints or notes about the service, please contact contact@techaro.lol and we will assist you as soon as possible.
  page:
    title: Privacy Policy
    body: |

      Last updated: June 2025

      Information that is gathered from visitors

      In common with other websites, log files are stored on the web server saving details such as the visitor's IP address, browser type, referring page and time of visit.

      Cookies may be used to remember visitor preferences when interacting with the website.

      Where registration is required, the visitor's email and a username will be stored on the server.

      How the Information is used

      The information is used to enhance the visitor's experience when using the website to display personalised content and possibly advertising.

      E-mail addresses will not be sold, rented or leased to 3rd parties.

      E-mail may be sent to inform you of news of our services or offers by us or our affiliates.

      Visitor Options

      If you have subscribed to one of our services, you may unsubscribe by following the instructions which are included in the e-mail that you receive.

      You may be able to block cookies via your browser settings but this may prevent you from accessing certain features of the website.

      Cookies

      Cookies are small digital signature files that are stored by your web browser that allow your preferences to be recorded when visiting the website. They may also be used to track your return visits to the website.

      3rd party advertising companies may also use cookies for tracking purposes.

      Techaro Anubis

      This website uses a service called Anubis to filter malicious traffic. Anubis requires the use of browser cookies to ensure that web clients are running conformant software. Anubis also may report the following data to Techaro to improve service quality:

      This data is processed and stored for the legitimate interest of combatting abusive web clients. This data is encrypted at rest as much as possible and is only decrypted in memory for the purposes of fulfilling requests.

# By default, send HTTP 200 back to clients that either get issued a challenge
# or a denial. This seems weird, but this is load-bearing due to the fact that
# the most aggressive scraper bots seem to really, really, want an HTTP 200 and
# will stop sending requests once they get it.
status_codes:
  CHALLENGE: 200
  DENY: 200

store:
  backend: bbolt
  parameters:
    path: /xe/data/anubis/data.bdb
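
## Illustrative sketch only, not part of the active configuration: how a local
## snippet could be imported by relative path, as described in the import notes
## at the top of this file. The file name ./local/allow-healthcheck.yaml, the
## rule name, and the /healthz path are hypothetical. This assumes a snippet
## file holds a plain YAML list of bot rules, using the same fields as the
## inline rules in the bots list.
##
## Contents of ./local/allow-healthcheck.yaml (hypothetical):
##
##   - name: allow-healthcheck
##     action: ALLOW
##     expression:
##       any:
##         - path.startsWith("/healthz")
##
## Entry in the bots list that would pull it in:
##
##   - import: ./local/allow-healthcheck.yaml
##
## A rule that sets both import and inline fields such as name or action is
## not allowed, so keep the import entry on its own.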