Robots API
Controls robots.txt generation and meta robots directives for search engine crawlers.
Since: 2.1.0 (robots.txt), 5.7.0 (meta robots)
Source: wp-includes/robots-template.php, wp-includes/functions.php, wp-includes/query.php
Components
| Component | Description |
|---|---|
| functions.md | Core robots functions |
| hooks.md | Actions and filters |
Two Systems
WordPress has two separate robots mechanisms:
1. robots.txt File
Virtual file served at /robots.txt. Generated dynamically via do_robots().
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
2. Meta Robots Tag
HTML meta tag in <head> controlling per-page indexing. Generated via wp_robots().
<meta name='robots' content='noindex, nofollow' />
Default Behavior
robots.txt
- Always disallows /wp-admin/
- Always allows /wp-admin/admin-ajax.php
- Customizable via the robots_txt filter
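A minimal sketch of customizing the output through the robots_txt filter, which passes the generated text and the site's public setting to each callback. The function name `example_robots_txt_rules` and the `/private/` path are hypothetical, chosen only for illustration.

```php
<?php
/**
 * Hypothetical callback: appends one extra rule to the generated robots.txt.
 *
 * @param string $output The robots.txt content generated so far.
 * @param mixed  $public Value of the 'blog_public' option (truthy when the site is public).
 * @return string Possibly extended robots.txt content.
 */
function example_robots_txt_rules( $output, $public ) {
	// Leave the output alone when search engines are discouraged.
	if ( ! $public ) {
		return $output;
	}

	// '/private/' is an illustrative path, not a WordPress default.
	return rtrim( $output, "\n" ) . "\nDisallow: /private/\n";
}

// Register only when running inside WordPress, so the file also loads standalone.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'robots_txt', 'example_robots_txt_rules', 10, 2 );
}
```

On a public site this would add a `Disallow: /private/` line after the default rules; on a non-public site the output is returned unchanged.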
Meta Robots (Default Filters)
| Filter Callback | Condition | Directives Added |
|---|---|---|
| wp_robots_noindex | Site not public | noindex, nofollow |
| wp_robots_noindex_embeds | Embed requests | noindex, plus follow (nofollow if the site is not public) |
| wp_robots_noindex_search | Search results | noindex, plus follow (nofollow if the site is not public) |
| wp_robots_max_image_preview_large | Site is public | max-image-preview:large |
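Custom callbacks can be attached to the same wp_robots filter as the defaults listed above. The following is a sketch only: the function name `example_robots_limit_snippet`, the `max-snippet` limit of 50, and the priority of 20 are all assumptions made for illustration.

```php
<?php
/**
 * Hypothetical callback: adjusts directives collected by the wp_robots filter.
 *
 * @param array $robots Associative array of directives.
 * @return array Modified directives.
 */
function example_robots_limit_snippet( array $robots ) {
	// Skip pages already marked noindex; a snippet limit is moot there.
	if ( ! empty( $robots['noindex'] ) ) {
		return $robots;
	}

	// String value, rendered as "max-snippet:50". The limit is illustrative.
	$robots['max-snippet'] = '50';

	return $robots;
}

// Register only when running inside WordPress, so the file also loads standalone.
if ( function_exists( 'add_filter' ) ) {
	// Priority 20 is an assumption: late enough to see directives
	// added by callbacks registered at the default priority of 10.
	add_filter( 'wp_robots', 'example_robots_limit_snippet', 20 );
}
```

Keeping the callback free of side effects (array in, array out) makes it easy to exercise outside of a full WordPress request.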
Request Flow
Request: /robots.txt
└── is_robots() returns true
└── template-loader.php fires 'do_robots' action
└── do_robots() outputs robots.txt content
Request: Any page
└── wp_head action
└── wp_robots()
└── apply_filters('wp_robots', [])
└── Each callback adds directives
└── Output: <meta name='robots' content='...' />
Directive Format
The wp_robots filter receives and returns an associative array keyed by directive name:
// Boolean directive (no value)
$robots['noindex'] = true; // Outputs: noindex
// String directive (with value)
$robots['max-image-preview'] = 'large'; // Outputs: max-image-preview:large
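To make the mapping from array to tag concrete, the sketch below mimics the serialization in plain PHP. It is an approximation for illustration, not WordPress's own code: `example_render_robots` is a hypothetical name, and `htmlspecialchars()` stands in for WordPress's attribute escaping.

```php
<?php
/**
 * Hypothetical helper: turns a directives array into a robots meta tag string.
 *
 * @param array $robots Associative array of directives.
 * @return string Meta tag markup, or an empty string when no directive is active.
 */
function example_render_robots( array $robots ) {
	$parts = array();

	foreach ( $robots as $directive => $value ) {
		if ( is_string( $value ) ) {
			// String directive: name and value joined by a colon.
			$parts[] = $directive . ':' . $value;
		} elseif ( $value ) {
			// Truthy non-string directive: the name alone.
			$parts[] = $directive;
		}
		// Falsy values such as false are skipped entirely.
	}

	if ( empty( $parts ) ) {
		return '';
	}

	$content = htmlspecialchars( implode( ', ', $parts ), ENT_QUOTES );

	return "<meta name='robots' content='" . $content . "' />";
}

echo example_render_robots(
	array(
		'noindex'           => true,
		'max-image-preview' => 'large',
		'nofollow'          => false,
	)
);
```

With the sample array, the helper prints `<meta name='robots' content='noindex, max-image-preview:large' />`; the `nofollow` entry is dropped because its value is false.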