Enter your URL and we'll analyze your website to generate a ready-to-use siteai.json file.
siteai.json file: https://yoursite.com/siteai.json
robots.txt:
SiteAI: https://yoursite.com/siteai.json
<head>:
<link rel="siteai" type="application/json" href="/siteai.json">
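An agent can follow either hint to locate the file. The sketch below shows one way to do that in Python; it is not a reference implementation. The function names are hypothetical, matching the `SiteAI:` key case-insensitively is an assumption, and the inputs are passed in as strings so nothing is fetched over the network.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


def from_robots_txt(robots_txt: str) -> Optional[str]:
    """Return the URL on the first 'SiteAI:' line, or None if there is none."""
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        # Assumption: the directive name is matched case-insensitively.
        if sep and key.strip().lower() == "siteai":
            return value.strip() or None
    return None


class _LinkFinder(HTMLParser):
    """Remembers the href of the first <link rel="siteai"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and rel == "siteai" and self.href is None:
            self.href = attributes.get("href")


def from_html_head(html: str, page_url: str) -> Optional[str]:
    """Resolve the <link rel="siteai"> href against the page URL."""
    finder = _LinkFinder()
    finder.feed(html)
    return urljoin(page_url, finder.href) if finder.href else None
```

Trying robots.txt first and falling back to the `<head>` link is one reasonable order; the text above does not prescribe one.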
First time using siteai.json? Don't worry — here are the most common scenarios with step-by-step instructions.
Does the order of fields matter? No. The members of a JSON object are unordered, so identity can go at the top or the bottom and it makes no difference.
Do I need all sections? No. Only specVersion, identity, and permissions are required. Everything else is optional.
What happens if an agent ignores my policy? Like robots.txt, siteai.json is advisory: well-behaved agents comply voluntarily, and nothing in the file stops the others. To enforce the policy, use server-side controls such as rate limiting, WAF rules, or User-Agent blocking.
Can I test my file? Upload to your site root and visit https://yoursite.com/siteai.json — it should render as valid JSON.
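Beyond opening the URL in a browser, you can check the file with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and treating the top level as a JSON object is an assumption based on the named sections above. The only rule it checks is the one stated in this FAQ, that specVersion, identity, and permissions must be present.

```python
import json
from typing import List

# The three sections this FAQ names as required.
REQUIRED_SECTIONS = ("specVersion", "identity", "permissions")


def missing_sections(raw: str) -> List[str]:
    """Parse raw JSON text and list the required sections that are absent."""
    policy = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(policy, dict):
        # Assumption: the file must be a JSON object at the top level.
        raise ValueError("siteai.json must contain a JSON object")
    return [name for name in REQUIRED_SECTIONS if name not in policy]
```

An empty list means all required sections are present. Because the check looks sections up by name, their order in the file does not affect the result.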
Where: ⚡ Action Permissions → 📝 Contact Form
What to do: set Contact Form to not allowed and add a note explaining why.
Result in JSON:
"permissions": {
  "action": {
    "submitContactForm": {
      "allowed": false,
      "note": "Contact form submissions require a real human."
    }
  }
}
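On the agent side, a policy like this could be consulted before attempting an action. The sketch below is illustrative only: the function name is hypothetical, and denying actions that the file does not list is an assumption, since the text above does not say how unlisted actions should be treated.

```python
from typing import Tuple


def action_decision(policy: dict, action: str) -> Tuple[bool, str]:
    """Return (allowed, note) for an action under permissions.action."""
    entry = policy.get("permissions", {}).get("action", {}).get(action)
    if entry is None:
        # Assumption: an action that is not listed is treated as denied.
        return False, "action not listed in policy"
    return bool(entry.get("allowed", False)), entry.get("note", "")


policy = {
    "permissions": {
        "action": {
            "submitContactForm": {
                "allowed": False,
                "note": "Contact form submissions require a real human.",
            }
        }
    }
}
allowed, note = action_decision(policy, "submitContactForm")
```

Returning the note alongside the decision lets the agent show the site's own explanation to its user.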
Where: ⚡ Action Permissions → 🔍 Search
What to do: allow Search and set the rate limit to 20.
Result in JSON:
"permissions": {
  "action": {
    "search": {
      "allowed": true,
      "rateLimit": 20
    }
  }
}
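A cooperative agent could throttle itself to stay under this limit. The snippet above does not state the unit of rateLimit, so the sketch below assumes "requests per 60-second window" purely for illustration; the class name and the window length are hypothetical.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allows at most `limit` calls within any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window = window_seconds  # assumption: unit is not given in the policy
        self._stamps = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have fallen out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


limiter = SlidingWindowLimiter(limit=20)
results = [limiter.allow(now=float(second)) for second in range(21)]
```

The timestamp is passed in explicitly so the behavior is easy to test; in real use you might pass `time.monotonic()`.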
Where: ⚡ Action Permissions → 💳 Checkout
What to do: allow Checkout, require human verification, and choose redirect-to-browser as the method required for checkout.
What this means: An AI agent can add items and proceed to checkout, but the human user must confirm the final purchase (e.g., after being redirected to a browser window).
"permissions": {
  "action": {
    "checkout": {
      "allowed": true,
      "humanVerification": true,
      "note": "Final purchase requires human confirmation."
    }
  }
},
"humanVerification": {
  "methods": ["redirect-to-browser"],
  "requiredFor": ["checkout"]
}
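To make the hand-off concrete, here is a hedged sketch of how an agent might decide its next step from these two sections. The function name and the returned strings are hypothetical, and picking the first listed method is an assumption.

```python
def next_step(policy: dict, action: str) -> str:
    """Return 'deny', 'proceed', or 'handoff:<method>' for an action."""
    entry = policy.get("permissions", {}).get("action", {}).get(action, {})
    if not entry.get("allowed", False):
        return "deny"
    verification = policy.get("humanVerification", {})
    needs_human = bool(entry.get("humanVerification", False)) or (
        action in verification.get("requiredFor", [])
    )
    if not needs_human:
        return "proceed"
    methods = verification.get("methods", [])
    # Assumption: the first listed method is the preferred one.
    return f"handoff:{methods[0]}" if methods else "handoff:unspecified"


policy = {
    "permissions": {
        "action": {
            "checkout": {
                "allowed": True,
                "humanVerification": True,
                "note": "Final purchase requires human confirmation.",
            }
        }
    },
    "humanVerification": {
        "methods": ["redirect-to-browser"],
        "requiredFor": ["checkout"],
    },
}
step = next_step(policy, "checkout")
```

With the policy above, the agent would stop before the purchase and hand control to the user through the listed method.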
Where: 🔗 Discovery Links
What to do: enter the URL of your OpenAPI description, e.g. https://yoursite.com/api/v1/openapi.json
Why? This tells AI agents where to find your structured API, so they can interact with your service programmatically instead of scraping your website.
"discovery": {
  "robotsTxt": "https://yoursite.com/robots.txt",
  "openApi": "https://yoursite.com/api/v1/openapi.json",
  "mcpEndpoint": "https://yoursite.com/.well-known/mcp.json",
  "schemaOrg": true
}
Where: 🤖 Agent Identification
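What to do: require a User-Agent header, list the required fields (agentName, agentOperator, agentPurpose), and disallow anonymous agents.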
What this means: Well-behaved AI agents will send a User-Agent header identifying who they are, who operates them, and what they're trying to do. You can then decide per-agent what's allowed.
"agentIdentification": {
  "requireUserAgent": true,
  "requiredFields": ["agentName", "agentOperator", "agentPurpose"],
  "allowAnonymousAgents": false
}
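If you want to act on this policy server-side, you need to read those fields out of the header. The text above does not define how they are encoded, so the sketch below assumes a hypothetical `key=value; key=value` layout; the function names and the permissive defaults for missing keys are also assumptions.

```python
from typing import Dict, Optional


def parse_agent_fields(user_agent: str) -> Dict[str, str]:
    """Parse a hypothetical 'key=value; key=value' User-Agent string."""
    fields = {}
    for part in user_agent.split(";"):
        key, sep, value = part.partition("=")
        if sep and key.strip() and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def accept_agent(policy: dict, user_agent: Optional[str]) -> bool:
    """Decide whether a request satisfies the agentIdentification rules."""
    rules = policy.get("agentIdentification", {})
    if not user_agent and rules.get("requireUserAgent", False):
        return False
    fields = parse_agent_fields(user_agent or "")
    required = rules.get("requiredFields", [])
    identified = bool(user_agent) and all(name in fields for name in required)
    if identified:
        return True
    # Unidentified agents pass only when anonymous access is allowed.
    return bool(rules.get("allowAnonymousAgents", True))


policy = {
    "agentIdentification": {
        "requireUserAgent": True,
        "requiredFields": ["agentName", "agentOperator", "agentPurpose"],
        "allowAnonymousAgents": False,
    }
}
```

A check like this only tells you what the agent claims about itself; it does not verify that the claim is true.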
Where: 🕷️ Scraping Policies
What to do: turn off every scraping permission.
What this means: You're explicitly telling AI companies that your content must not be used for model training, reproduced elsewhere, bulk-scraped, or mined for price monitoring or competitive analysis. This is the digital equivalent of "All Rights Reserved" for AI.
"scraping": {
  "bulkDataExtraction": false,
  "contentReproduction": false,
  "trainingDataUsage": false,
  "priceMonitoring": false,
  "competitiveAnalysis": false
}
Where: 🔒 Data Protection
What to do: leave every data permission turned OFF.
When would you turn one ON? Only if you have a trusted internal agent (e.g., your own CRM bot) that needs order history. In that case, combine with Agent Identification to whitelist specific agents.
"permissions": {
  "data": {
    "customerRecords": { "allowed": false },
    "orderHistory": { "allowed": false },
    "paymentInfo": { "allowed": false },
    "internalAnalytics": { "allowed": false },
    "employeeData": { "allowed": false }
  }
}
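The snippet does not show how a whitelist of trusted agents would be expressed, so the sketch below keeps it in a separate server-side mapping. That mapping, the agent name `internal-crm-bot`, and the function name are all hypothetical. The point it illustrates is that a category turned OFF in the policy stays closed to everyone, and a category turned ON can still be narrowed to named agents.

```python
from typing import Dict, Optional, Set


def may_read(
    policy: dict,
    category: str,
    agent_name: Optional[str],
    trusted_agents: Dict[str, Set[str]],
) -> bool:
    """Check a data category against the policy and a server-side allowlist."""
    entry = policy.get("permissions", {}).get("data", {}).get(category, {})
    if not entry.get("allowed", False):
        return False  # OFF in the policy: nobody gets access
    allowlist = trusted_agents.get(category)
    # ON with no allowlist means open to all; otherwise only listed agents.
    return allowlist is None or agent_name in allowlist


# Hypothetical allowlist kept in server configuration, not in siteai.json.
trusted = {"orderHistory": {"internal-crm-bot"}}
locked = {"permissions": {"data": {"orderHistory": {"allowed": False}}}}
opened = {"permissions": {"data": {"orderHistory": {"allowed": True}}}}
```

The agent name would come from whatever identification scheme you use, and it is only as trustworthy as that scheme.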
Where: ⚖️ Legal
What to do: choose a risk classification:
minimal: Blog, portfolio (most websites)
limited: E-commerce, SaaS
high: Healthcare, finance, education
"legal": {
  "termsUrl": "https://yoursite.com/legal/ai-terms",
  "euAiActCompliance": {
    "transparencyRequired": true,
    "riskClassification": "limited",
    "humanOversightMandatory": false
  }
}
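A small check can catch a mistyped classification before you publish. This sketch assumes that minimal, limited, and high are the only valid values, because those are the ones visible here; the function name is hypothetical.

```python
from typing import List

# Assumption: only the three levels named in the text are valid.
RISK_LEVELS = ("minimal", "limited", "high")


def risk_problems(policy: dict) -> List[str]:
    """Return a list of problems with legal.euAiActCompliance.riskClassification."""
    compliance = policy.get("legal", {}).get("euAiActCompliance", {})
    risk = compliance.get("riskClassification")
    if risk not in RISK_LEVELS:
        return [f"unknown riskClassification: {risk!r}"]
    return []


policy = {
    "legal": {
        "termsUrl": "https://yoursite.com/legal/ai-terms",
        "euAiActCompliance": {
            "transparencyRequired": True,
            "riskClassification": "limited",
            "humanOversightMandatory": False,
        },
    }
}
problems = risk_problems(policy)
```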
On the left you edit the policy in a structured form; on the right you see and edit the actual JSON structure.