What is the Agent-to-Web Framework (A2WF)?
The Problem
As AI agents increasingly interact with websites — browsing products, comparing prices, booking appointments, filling forms — website operators face a critical gap: there is no standardized way to control what these agents may and may not do.
MCP defines how agents connect to tools. A2A defines how agents talk to each other. But neither asks: “What is this agent allowed to do on THIS website?”
robots.txt only controls crawling: a path is either allowed or disallowed for a given crawler. It has no way to express AI agent actions such as purchasing, booking, or data extraction.
The result is an all-or-nothing choice: allow everything and risk abuse, or block everything and lose AI-driven discovery. There is no middle ground.
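The robots.txt limitation can be seen with Python's standard-library parser. The sketch below is only an illustration: the robots.txt content, the agent name ExampleAgent, and the URLs are made up. The one question the parser can answer is whether a URL may be fetched; it has no concept of what an agent does once it is on the page.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content (not taken from any real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Allow: /products/
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Answer the only question robots.txt supports: may this URL be fetched?"""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleAgent" is a hypothetical agent name.
    print(can_crawl("ExampleAgent", "https://example.com/products/42"))
    print(can_crawl("ExampleAgent", "https://example.com/checkout/"))
```

Nothing in this format can say "search is fine, but purchasing needs a human", which is the gap described above.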
The Solution
A2WF fills this gap with a machine-readable JSON file, siteai.json, that any website can host to declare its AI agent access policy. The file states:
- What agents are ALLOWED to do — read catalogs, search, check availability
- What agents MUST NOT do — bulk scrape, fake reviews, unauthorized transactions
- What requires HUMAN VERIFICATION — checkout, booking, contact forms
- How agents must IDENTIFY themselves — name, operator, purpose
- What RATE LIMITS are enforced — per minute, per hour, per action
The file is served by the website itself, for example: https://example.com/siteai.json
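As a rough illustration of the policy areas listed above, the sketch below parses a small policy and maps a requested action to one of three outcomes. Every field name in the JSON (permissions, allowed, forbidden, human_verification, agent_identification, rate_limits), the action names, and the decide function are assumptions made for this example; they are not taken from the A2WF specification. Denying unknown actions by default is also an assumption.

```python
import json

# Hypothetical siteai.json content. All keys and action names are
# illustrative assumptions, not the actual A2WF schema.
SITEAI_JSON = """
{
  "permissions": {
    "allowed": ["read_catalog", "search", "check_availability"],
    "forbidden": ["bulk_scrape", "post_review", "automated_transaction"],
    "human_verification": ["checkout", "booking", "contact_form"]
  },
  "agent_identification": {"required": ["name", "operator", "purpose"]},
  "rate_limits": {"per_minute": 30, "per_hour": 500}
}
"""


def decide(policy: dict, action: str, agent: dict) -> str:
    """Return "allow", "ask_human", or "deny" for a requested action."""
    # An agent that does not supply every required identity field is refused.
    required = policy.get("agent_identification", {}).get("required", [])
    if any(not agent.get(field) for field in required):
        return "deny"

    permissions = policy.get("permissions", {})
    if action in permissions.get("forbidden", []):
        return "deny"
    if action in permissions.get("human_verification", []):
        return "ask_human"
    if action in permissions.get("allowed", []):
        return "allow"
    # Assumption: actions the policy does not mention are denied.
    return "deny"


if __name__ == "__main__":
    policy = json.loads(SITEAI_JSON)
    agent = {"name": "ExampleAgent", "operator": "Example Corp",
             "purpose": "price comparison"}
    for action in ("search", "checkout", "bulk_scrape"):
        print(action, "->", decide(policy, action, agent))
```

The rate_limits values are only carried in the document here; enforcing them would need request counting on the server or in the agent, which this sketch leaves out.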
The Analogy
“No photography. Service animals welcome. Maximum 2 items in fitting room.”
Rules at a store entrance don’t prevent entry — they make clear what’s acceptable. siteai.json does the same for AI agents.
Design Principles
- Trivially easy to create. If you can write robots.txt, you can create siteai.json.
- Complementary. Works alongside MCP, A2A, robots.txt, Schema.org.
- Website operator’s perspective. The first standard built from the site owner’s view.
- Legally referenceable. Integrates with ToS and regulatory frameworks.
- Progressively extensible. Start with a static file, grow to dynamic APIs.
Relationship to Schema.org
A2WF uses Schema.org vocabulary where applicable. The @context field links to schema.org, and the identity object uses Schema.org types like WebSite. This means:
- siteai.json files are valid JSON-LD documents — existing tooling can process them
- The vocabulary is familiar — developers who use structured data already know Schema.org
- A2WF extends Schema.org rather than reinventing existing concepts
- Future-proof — as Schema.org evolves, A2WF benefits automatically
However, A2WF introduces concepts that Schema.org does not cover: granular per-action permissions, agent identification requirements, human-in-the-loop verification, scraping policies, and legal framework declarations. These are A2WF-specific extensions that complement the Schema.org foundation.
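To make the split between the Schema.org foundation and the A2WF-specific parts concrete, here is a minimal sketch of a document skeleton. The @context field and the identity object with a WebSite type come from the description above; the exact @context value, the permissions key, and both helper functions are assumptions for illustration only.

```python
import json

# Hypothetical skeleton. "@context" and "identity" follow the text above;
# the "permissions" key stands in for the A2WF-specific extensions and its
# name and contents are assumed.
DOC = {
    "@context": "https://schema.org",
    "identity": {
        "@type": "WebSite",  # Schema.org type
        "name": "Example Shop",  # Schema.org property
        "url": "https://example.com",  # Schema.org property
    },
    "permissions": {"human_verification": ["checkout"]},
}


def uses_schema_org(doc: dict) -> bool:
    """Check whether @context (a string or a list of strings) names schema.org."""
    context = doc.get("@context", "")
    contexts = context if isinstance(context, list) else [context]
    return any(isinstance(c, str) and "schema.org" in c for c in contexts)


def identity_type(doc: dict):
    """Return the Schema.org type declared on the identity object, if any."""
    return doc.get("identity", {}).get("@type")


if __name__ == "__main__":
    print(json.dumps(DOC, indent=2))
    print(uses_schema_org(DOC), identity_type(DOC))
```

A consumer that only understands Schema.org could still read the identity object, while an A2WF-aware agent would additionally interpret the extension keys.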
Who Should Use A2WF?
- E-commerce — Product discovery without bulk scraping
- Healthcare — Protect patient data, allow availability checks
- Finance — Public product info, no automated transactions
- News & Media — Reference headlines, block reproduction
- Agent developers — Clear, parseable rules
- Regulators — Machine-enforceable AI governance