AI Content Signals
AI Content Signals lets you declare, in a machine-readable way, how AI systems may use your content: for search indexing, real-time AI answers (RAG), or model training. It started with Cloudflare’s Content Signals in robots.txt and now expresses the same preferences across several surfaces, so more crawlers and tools can read them:
- robots.txt — Cloudflare Content Signals (search / ai-input / ai-train)
- HTTP header — X-Robots-Tag: noai, noimageai
- HTML meta — robots noai, noimageai
- /.well-known/tdmrep.json — W3C Text and Data Mining Reservation Protocol
- EU Directive 2019/790 rights reservation
Every surface is opt-in: your robots.txt output stays unchanged until you enable Content Signals.
The signals you control
You set three preferences, and AI Content Signals expresses each one in the right format on every surface you enable:
- search — allow or deny search indexing and traditional search results (links and short snippets)
- ai-input — allow or deny using your content for real-time AI answers (RAG, grounding, AI Overviews)
- ai-train — allow or deny using your content to train or fine-tune AI models
These three come from Cloudflare’s Content Signals vocabulary, written to your robots.txt. When you deny AI training, the same opt-out is also emitted on any other surface you enable: noai, noimageai in an HTML meta tag and an X-Robots-Tag header, plus a tdm-reservation in your TDMRep manifest. Declaring the same preference in several places means a crawler that ignores one signal may still honor another.
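As a rough illustration of how the three preferences could map onto each surface, here is a minimal Python sketch. It is not the plugin's code (the plugin runs inside WordPress), and the function names, the `Allow: /` line, and the `"/*"` location pattern in the TDMRep entry are assumptions made for the example; the exact output the plugin writes may differ.

```python
import json


def robots_content_signal(search: bool, ai_input: bool, ai_train: bool) -> str:
    """Build a robots.txt group with a Content-Signal line for all crawlers."""
    def yn(allowed: bool) -> str:
        return "yes" if allowed else "no"

    signal = (
        f"search={yn(search)}, "
        f"ai-input={yn(ai_input)}, "
        f"ai-train={yn(ai_train)}"
    )
    # "Allow: /" is an assumption for this sketch, not confirmed plugin output.
    return f"User-agent: *\nContent-Signal: {signal}\nAllow: /"


def training_opt_out_surfaces(ai_train: bool) -> dict:
    """Return the extra opt-out declarations emitted when AI training is denied.

    An empty dict means training is allowed, so nothing extra is declared.
    """
    if ai_train:
        return {}
    return {
        "html_meta": '<meta name="robots" content="noai, noimageai">',
        "http_header": ("X-Robots-Tag", "noai, noimageai"),
        # Assumed site-wide pattern; a real manifest may list several locations.
        "tdmrep_json": json.dumps(
            [{"location": "/*", "tdm-reservation": 1}], indent=2
        ),
    }


if __name__ == "__main__":
    print(robots_content_signal(search=True, ai_input=True, ai_train=False))
    for name, value in training_opt_out_surfaces(ai_train=False).items():
        print(name, "->", value)
```

Keeping the robots.txt line separate from the other surfaces mirrors the text above: the three-value vocabulary lives in robots.txt, while the other surfaces only carry the training opt-out.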
Key Features
- Easy-to-use settings page in WordPress admin
- Set global defaults for all crawlers
- Configure specific settings for individual AI bots (GPTBot, ClaudeBot, PerplexityBot, etc.)
- Add custom bot User-Agents
- Supports both physical and virtual robots.txt files
- Option to create physical robots.txt with basic WordPress rules
- Preview generated Content Signals before applying
- Export and import settings as JSON for easy migration between sites
- Optional legal text with EU Directive reference
- Developer-friendly: filter hook to extend the predefined bots list
- Works with existing robots.txt from SEO plugins
- Automatic sitemap detection and inclusion
- Optional extra output surfaces: HTML robots meta tag (noai, noimageai), X-Robots-Tag header, and a W3C TDMRep file at /.well-known/tdmrep.json
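The combination of global defaults and per-bot settings listed above can be pictured as a simple merge. The sketch below is a hypothetical Python model of that idea, not the plugin's implementation; `DEFAULTS`, `OVERRIDES`, and the function names are invented for illustration.

```python
# Hypothetical settings: global defaults plus per-bot overrides.
DEFAULTS = {"search": True, "ai-input": True, "ai-train": False}
OVERRIDES = {
    "GPTBot": {"ai-input": False},
    "ClaudeBot": {"search": False, "ai-input": False},
}


def effective_signals(bot: str, defaults: dict, overrides: dict) -> dict:
    """Merge a bot's overrides on top of the global defaults."""
    return {**defaults, **overrides.get(bot, {})}


def render_group(user_agent: str, signals: dict) -> str:
    """Render one robots.txt group carrying a Content-Signal line."""
    line = ", ".join(f"{key}={'yes' if ok else 'no'}" for key, ok in signals.items())
    return f"User-agent: {user_agent}\nContent-Signal: {line}"


def render_robots(defaults: dict, overrides: dict) -> str:
    """Render the wildcard group first, then one group per overridden bot."""
    groups = [render_group("*", defaults)]
    for bot in sorted(overrides):
        groups.append(render_group(bot, effective_signals(bot, defaults, overrides)))
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(render_robots(DEFAULTS, OVERRIDES))
```

Each bot-specific group repeats the full set of signals because a crawler that finds a group matching its own User-Agent applies that group rather than the wildcard one.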
Supported Bots
The plugin includes predefined settings for 28 major AI crawlers:
- OpenAI GPTBot, OAI-SearchBot, and ChatGPT-User
- Anthropic ClaudeBot, Claude-Web, and anthropic-ai
- Perplexity Bot and Perplexity-User
- Google Extended (Gemini) and GoogleOther
- Amazon Bot
- Apple Extended
- Meta/Facebook Bot and meta-externalagent
- DuckDuckGo DuckAssistBot
- Allen Institute AI2Bot
- Mistral AI
- ByteDance Bytespider
- DeepSeek AI
- xAI Grok
- Huawei Pangu
- Common Crawl, Cohere AI, Diffbot, You.com Bot, and more
Important Notice
Content Signals is a declarative standard – it expresses your preferences but does not technically enforce them. AI companies are not legally required to respect these signals, though the plugin can add optional legal text referencing EU Directive 2019/790.
The IETF AI Preferences (AIPREF) Working Group is currently developing a formal standard based on similar concepts. This plugin implements the current Cloudflare Content Signals specification and will be updated as standards evolve.
This plugin works best when combined with other protection measures like traditional robots.txt rules and server-level bot management.
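Because the signals are declarative, they only take effect when a crawler chooses to read them. The following Python sketch shows what a cooperating crawler might do with a robots.txt that contains Content-Signal lines. The parser is a simplified illustration written for this example (the function name and sample file are assumptions), not a complete robots.txt implementation.

```python
def parse_content_signals(robots_txt: str) -> dict:
    """Return {user_agent: {signal: bool}} for groups with a Content-Signal line."""
    result = {}
    current_agents = []
    in_rules = False  # True once a non-User-agent line has been seen in a group

    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()

        if field == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new group.
                current_agents = []
                in_rules = False
            current_agents.append(value)
            continue

        in_rules = True
        if field != "content-signal":
            continue

        signals = {}
        for pair in value.split(","):
            key, sep, val = pair.partition("=")
            if sep:
                signals[key.strip().lower()] = val.strip().lower() == "yes"
        for agent in current_agents:
            result.setdefault(agent, {}).update(signals)

    return result


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Content-Signal: search=yes, ai-train=no\n"
        "Allow: /\n"
        "\n"
        "User-agent: GPTBot\n"
        "Content-Signal: ai-train=no, ai-input=no\n"
    )
    print(parse_content_signals(sample))
```

The parser records only the signals that are present in the file; a crawler that never runs code like this is unaffected, which is why server-level controls remain useful alongside the declarations.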
Support
Need help or have suggestions?
Love the plugin? Please leave us a 5-star review and help spread the word!
About AyudaWP.com
We are specialists in WordPress security, SEO, and performance optimization plugins. We create tools that solve real problems for WordPress site owners while maintaining the highest coding standards and accessibility requirements.
