robots metadata
You can declare robots metadata to tell bots how to deal with your webpage.
Bots that browse the World Wide Web are also known as crawlers or spiders. Search engine spiders may be the most important ones: Googlebot feeds Google’s search engine, Mouse crawlers feed Mouse search engines, and there are many more. But robots metadata can also affect archiving bots and other bots.
To speak to bots, assign a comma-separated list of directives to the robots metadata name, as in the example below. Whether a bot observes a certain directive, and whether it understands any directives at all, is beyond your and Aneamal’s control. It depends entirely on the bot.
You can find a list of known directives below. The list is not exhaustive, since any bot operator could define new directives to listen to. This is also why the Aneamal Translator cannot check the validity of your directives. So mind your spelling.
Example
@ robots: nofollow, noindex
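Since the Aneamal Translator does not validate directives, a small check outside of Aneamal can catch typos before you publish. The following Python sketch is only an illustration and not part of Aneamal: the names KNOWN_DIRECTIVES, split_directives and unknown_directives are hypothetical, the set merely repeats the directives named on this page, and comparing names case-insensitively is an assumption. A directive reported as unknown is not necessarily wrong, since bot operators may define their own.

```python
# Directives named on this page. Bots may understand others, so treat a
# miss as a hint to double-check the spelling, not as an error.
KNOWN_DIRECTIVES = {
    "noindex", "nofollow", "none", "noarchive", "nosnippet",
    "noimageindex", "notranslate", "index", "follow", "all",
    "max-snippet", "max-image-preview", "max-video-preview",
}


def split_directives(value: str) -> list[str]:
    """Split a comma-separated robots value into trimmed directives."""
    return [part.strip() for part in value.split(",") if part.strip()]


def unknown_directives(value: str) -> list[str]:
    """Return the directives whose names are not in KNOWN_DIRECTIVES."""
    unknown = []
    for directive in split_directives(value):
        # Parameterized directives such as max-snippet:50 are compared by name.
        name = directive.split(":", 1)[0].strip().lower()
        if name not in KNOWN_DIRECTIVES:
            unknown.append(directive)
    return unknown


print(unknown_directives("nofollow, noindex"))  # []
print(unknown_directives("nofolow, noindex"))   # ['nofolow']
```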
Known directives
Restrictive directives
noindex
- Bots shall not add the webpage to a search engine database, so that it does not show up in search results. This directive can be a good choice, for example, for a webpage with information that you are legally required to publish but which is not the topic of your website. Mind that a bot can only read this directive when it visits the webpage. You can use a robots.txt file to tell bots not to visit a webpage in the first place.
nofollow
- Bots shall not use the links on the webpage to discover other webpages. This directive is useful, for example, if your webpage compiles a list of links to webpages that you plan to create but have not yet created, links to paywalled content or links to demon-possessed websites that you want to caution your readers against. Bots could still arrive at the linked pages on other paths than through your links. They should, however, refrain from interpreting the links in your webpage as a sign that the linked webpages are awesome. Remember that this directive affects all links in the webpage. You can mark individual links as unendorsed to add a nofollow signal only to chosen links.
none
- This is short for noindex, nofollow. It is recommended to avoid the short form, though, because some bots do not recognize it.
noarchive
- Bots shall not archive a copy of the webpage; search engines shall not offer users a saved copy of the webpage.
nosnippet
- Search engines shall not show a text snippet from the webpage in their search results. A snippet is a short extract or description.
noimageindex
- Bots shall not add images which are linked in this webpage to a search engine database; they shall not be shown in search results.
notranslate
- Search engines shall not offer to translate this webpage for users in the search results. Note that users do not get to interact with your webpage directly if the search engine displays a translation instead.
See the list of directives in Google's meta tags specification for more rules that Googlebot and some other bots adhere to, for example max-snippet, max-image-preview and max-video-preview, which take a setting as a parameter and can be used to refine further how your content appears in search results.
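Google's specification writes parameterized directives as a name and a setting joined by a colon, for instance max-snippet:50 or max-image-preview:large. The Python sketch below merely composes such a list for illustration. The function build_robots_value is hypothetical, the chosen settings are arbitrary, and it is an assumption that the resulting value is passed on unchanged when used in a robots metadata declaration. Whether a bot honours the settings depends on the bot.

```python
def build_robots_value(flags=(), limits=None):
    """Join plain directives and name:setting directives into one list."""
    parts = list(flags)
    for name, setting in (limits or {}).items():
        # Parameterized directives use the form name:setting.
        parts.append(f"{name}:{setting}")
    return ", ".join(parts)


value = build_robots_value(
    flags=["noarchive"],
    limits={"max-snippet": 50, "max-image-preview": "large"},
)
print(f"@ robots: {value}")
# @ robots: noarchive, max-snippet:50, max-image-preview:large
```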
Permissive directives
The following directives do not modify the behavior of bots, because bots assume permission by default unless explicitly restricted. You can still add these directives as a note to yourself, even though they are functionally equivalent to a robots metadata declaration with an empty value. A use case is when you want to allow bot activity for a specific file within a directory for which you have generally restricted bot activity with a @meta.nml file.
index
- Bots are welcome to include the webpage in search services.
follow
- Bots are welcome to follow links in the webpage to find other webpages.
all
- Bots are welcome to do their bot business unrestricted.
For developers
If robots metadata is declared, the Aneamal Translator adds an HTML meta element with the name robots to the HTML output. Hence the example @ robots: nofollow, noindex becomes:
<meta name='robots' content='nofollow, noindex'>
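The mapping from the metadata value to the meta element can be sketched in a few lines of Python. This is not the Aneamal Translator's code: the function robots_meta is hypothetical, and trimming and escaping the value are assumptions about what a careful implementation would do.

```python
from html import escape


def robots_meta(value: str) -> str:
    """Build a robots meta element for the given metadata value."""
    # quote=True also escapes single quotes, so the value stays inside '...'.
    content = escape(value.strip(), quote=True)
    return f"<meta name='robots' content='{content}'>"


print(robots_meta("nofollow, noindex"))
# <meta name='robots' content='nofollow, noindex'>
```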