Open Standard v1.0
A2WF defines what AI agents can find and do on your website, and what's strictly forbidden.
AI agents get detailed information and strict rules from siteai.json. No more guessing, no more gray zones!
The difference between hoping AI agents behave and having proof when they don't.
| ❌ Without A2WF | ✅ With A2WF |
|---|---|
| Scraping is unwanted | Scraping is provably forbidden |
| Auto account creation just happens | Account creation explicitly prohibited & documented |
| Agents log in uncontrolled | Login rules machine-readable & dated |
| Bulk data extraction, no proof possible | Bulk extraction dated & forbidden in policy |
| Price monitoring by competitors | Price scraping policy-violating & provable |
| In court: "We didn't want this" | In court: "Here is the dated, machine-readable document" |
Without a machine-readable policy, you cannot prove an AI agent violated your rules. Your Terms of Service are written in natural language on a subpage. No agent reads them, no judge can prove the agent could have read them.
Today, your only option is robots.txt which says "please don't crawl." It's a suggestion. Agents ignore it.
A2WF says: "This behavior is forbidden under our documented policy, in these jurisdictions, and here are the legal grounds."
That turns a suggestion into evidence.
AI agents are visiting your website every day. They scrape your content, try to log in, create fake accounts, solve your CAPTCHAs, extract your pricing, and harvest your data for model training. They don't read your Terms of Service. They don't ask for permission. And right now, you have no way to prove they broke your rules.
Real threats that websites face today and how A2WF gives you provable, enforceable protection.
AI agents extract your entire product catalog, pricing, and content in minutes. Without A2WF: legal gray zone. With A2WF: provably forbidden.
Bots create hundreds of fake accounts on your platform. You block IPs, but you can't prove it was forbidden. A2WF makes it explicit: `"createAccount": { "allowed": false }`.
Agents log in automatically and act on behalf of users. Your terms say "forbidden," but no agent reads terms. With A2WF, they do.
Your search function is abused as an API, thousands of queries per hour. Rate limits in A2WF are machine-readable and enforceable.
Competitors let agents monitor your prices in real time. "priceMonitoring": false, one entry that counts in a dispute.
LLM providers scrape your content for training data. "trainingDataUsage": false, clear, dated, provable.
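Each of the scenarios above maps to an entry in siteai.json. The following sketch shows how such entries might look together; note that the `rateLimit` object and its `requests`/`per` fields are illustrative names of our own, not confirmed by the v1.0 excerpt, while `createAccount`, `priceMonitoring`, and `trainingDataUsage` come straight from the spec example.

```json
{
  "permissions": {
    "action": {
      "search": {
        "allowed": true,
        "rateLimit": { "requests": 60, "per": "hour" }
      },
      "createAccount": { "allowed": false }
    }
  },
  "scraping": {
    "priceMonitoring": false,
    "trainingDataUsage": false
  }
}
```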
Place an A2WF policy file on your web server. AI agents read it before interacting: a clear, enforceable rulebook for the AI era.
Get Started

```json
{
  "@context": "https://schema.org",
  "specVersion": "1.0",
  "identity": {
    "@type": "WebSite",
    "name": "Example Store",
    "category": "e-commerce"
  },
  "permissions": {
    "read": {
      "productCatalog": { "allowed": true },
      "pricing": { "allowed": true }
    },
    "action": {
      "search": { "allowed": true },
      "checkout": {
        "allowed": true,
        "humanVerification": true
      },
      "createAccount": { "allowed": false },
      "submitReview": { "allowed": false }
    },
    "data": {
      "customerRecords": { "allowed": false },
      "paymentInfo": { "allowed": false }
    }
  },
  "scraping": {
    "bulkDataExtraction": false,
    "priceMonitoring": false,
    "trainingDataUsage": false
  }
}
```
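A well-behaved agent can consume such a policy with a few lines of code. The sketch below parses an abridged copy of the siteai.json example and checks an action before attempting it. The `is_action_allowed` helper and its deny-by-default behavior are our own illustration, not part of the A2WF spec:

```python
import json

# Abridged A2WF policy (from the siteai.json example above).
POLICY = """
{
  "specVersion": "1.0",
  "permissions": {
    "action": {
      "checkout": { "allowed": true, "humanVerification": true },
      "createAccount": { "allowed": false }
    }
  },
  "scraping": { "bulkDataExtraction": false }
}
"""

def is_action_allowed(policy: dict, action: str) -> bool:
    """Return True only if the policy explicitly allows the action.

    Actions missing from the policy default to False (deny by default),
    a design choice of this sketch, not mandated by the spec.
    """
    entry = policy.get("permissions", {}).get("action", {}).get(action, {})
    return entry.get("allowed", False) is True

policy = json.loads(POLICY)
print(is_action_allowed(policy, "checkout"))       # True
print(is_action_allowed(policy, "createAccount"))  # False
```

Deny-by-default keeps the agent conservative: anything the site owner did not explicitly permit is treated as forbidden.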
Modern AI agents don't just crawl, they interact, transact, and extract. A2WF is the governance layer the web has been missing.
robots.txt can only say "crawl this" or "don't crawl this". No concept of actions, authentication, or rate limits.
Define what agents can read, what actions they can take, what data they can access, and under what conditions.
Websites can't describe themselves in a structured way that agents understand.
Name, category, language, contact, business hours, everything an agent needs to interact intelligently.
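An identity block carrying that information might look like the sketch below. Only `@type`, `name`, and `category` appear in the v1.0 example above; the remaining fields borrow schema.org-style names (`inLanguage`, `contactPoint`, `openingHours`) as plausible illustrations, not confirmed spec fields:

```json
{
  "identity": {
    "@type": "WebSite",
    "name": "Example Store",
    "category": "e-commerce",
    "inLanguage": "en",
    "contactPoint": { "email": "support@example.com" },
    "openingHours": "Mo-Fr 09:00-18:00"
  }
}
```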
Create your siteai.json in minutes. Machine-readable. Legally relevant. Open standard.