A2WF Specification — Version 1.0

Status: Public Draft v1.0 (Core)
Version: 1.0
Date: 2026-03-18
Author: Wolfgang Wimmer / SSC Software Sales Consulting
Feedback: github.com/a2wf/spec/issues
License: MIT

1. Introduction

1.1. Abstract

This document defines the siteai.json format, Version 1.0, as part of the Agent-to-Web Framework (A2WF). It provides a machine-readable policy format for website operators to:

The format complements existing web standards like robots.txt, sitemap.xml, MCP (Model Context Protocol), A2A (Agent-to-Agent Protocol), and in-page Schema.org markup. It leverages Schema.org vocabulary where appropriate and introduces specific structures for AI agent governance that no existing standard provides.

Note: Optional site description extensions (keySections, mainContact, publisher, company, services, etc.) are defined in the companion document “A2WF Site Description Extensions v1.0” and are not part of this core specification.

1.2. Problem Statement

AI agents increasingly interact with websites — browsing products, comparing prices, booking appointments, filling forms, extracting data. Website operators face a critical gap:

No AI Agent Access Governance — No standard exists that gives the website operator a machine-readable way to declare:

Current agent-side standards (MCP, A2A, enterprise IAM) govern agents from the agent operator’s perspective. A2WF fills the gap by providing governance from the website operator’s perspective.

1.3. Relationship to Existing Standards

Standard      Purpose                    Perspective         Granularity
robots.txt    Crawl permissions          Website (binary)    Allow/disallow per path
sitemap.xml   URL listing                Content             URLs only
Schema.org    Structured data            Content (in-page)   Entity descriptions
MCP           Agent-to-tool connection   Agent side          Agent capabilities
A2A           Agent-to-agent comms       Agent side          Skills & coordination
llms.txt      Content guide for LLMs     Content             Curated page list
siteai.json   Site governance            WEBSITE OWNER       Per-action permissions

1.4. Conventions

The keywords “REQUIRED”, “MUST”, “MUST NOT”, “SHOULD”, “RECOMMENDED”, “MAY”, and “OPTIONAL” are to be interpreted as described in RFC 2119. A siteai.json file is a JSON document (RFC 8259), UTF-8 encoded.

2. File Location and Discovery

AI agents MUST attempt discovery in this order:

  1. Root URL (preferred): https://{domain}/siteai.json
  2. robots.txt directive: SiteAI: https://example.com/siteai.json
  3. HTML link element: <link rel="siteai" type="application/json" href="/siteai.json">
  4. Well-Known URI: https://{domain}/.well-known/siteai.json
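
The discovery order above can be sketched as a function that builds the ordered list of URLs an agent would try. This is a minimal illustration, not a normative algorithm: the function names are hypothetical, the case-insensitive match on the SiteAI directive is an assumption, and fetching is left out so the logic stays self-contained.

```python
from html.parser import HTMLParser
from typing import List, Optional
from urllib.parse import urljoin


def siteai_from_robots(robots_txt: str) -> Optional[str]:
    """Return the URL of the first 'SiteAI:' line in a robots.txt body."""
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        # Assumption: the directive name is matched case-insensitively.
        if sep and key.strip().lower() == "siteai" and value.strip():
            return value.strip()
    return None


class _LinkFinder(HTMLParser):
    """Records the href of the first <link rel="siteai"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if tag == "link" and self.href is None and "siteai" in rel_tokens:
            self.href = attr.get("href")


def siteai_from_html(html: str, page_url: str) -> Optional[str]:
    """Resolve the <link rel="siteai"> href against the page URL."""
    finder = _LinkFinder()
    finder.feed(html)
    return urljoin(page_url, finder.href) if finder.href else None


def discovery_candidates(domain: str, robots_txt: str = "", html: str = "") -> List[str]:
    """Candidate siteai.json URLs, in the order listed in Section 2."""
    base = f"https://{domain}"
    candidates = [
        f"{base}/siteai.json",                  # 1. root URL
        siteai_from_robots(robots_txt),         # 2. robots.txt directive
        siteai_from_html(html, base + "/"),     # 3. HTML link element
        f"{base}/.well-known/siteai.json",      # 4. well-known URI
    ]
    ordered: List[str] = []
    for url in candidates:
        if url and url not in ordered:          # skip missing and duplicate entries
            ordered.append(url)
    return ordered
```

An agent would request each candidate in turn and stop at the first one that returns a valid siteai.json document.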

File Serving Requirements

3. Format Specification — Required Elements

3.1. Top-Level Structure

REQUIRED: specVersion (“1.0”), identity, permissions

RECOMMENDED: @context, agentIdentification, scraping

OPTIONAL: defaults, humanVerification, legal, discovery, metadata

Consumers MUST ignore any unrecognized keys (forward compatibility).
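
As an illustration of the key classes above, a consumer might check a parsed document as follows. The function name and the shape of the returned report are hypothetical; only the key names come from this section.

```python
from typing import Dict, List

REQUIRED_KEYS = ("specVersion", "identity", "permissions")
RECOMMENDED_KEYS = ("@context", "agentIdentification", "scraping")
OPTIONAL_KEYS = ("defaults", "humanVerification", "legal", "discovery", "metadata")


def check_top_level(doc: dict) -> Dict[str, List[str]]:
    """Report missing keys and list the unrecognized keys that are ignored."""
    known = set(REQUIRED_KEYS + RECOMMENDED_KEYS + OPTIONAL_KEYS)
    return {
        # A document missing any of these is not a valid siteai.json file.
        "missing_required": [k for k in REQUIRED_KEYS if k not in doc],
        # Absence of these is allowed but worth a warning.
        "missing_recommended": [k for k in RECOMMENDED_KEYS if k not in doc],
        # Unrecognized keys are skipped, never treated as errors.
        "ignored": sorted(k for k in doc if k not in known),
    }
```

Reporting unknown keys instead of rejecting them keeps the consumer forward-compatible with later v1.x additions.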

@context (RECOMMENDED)

The root object SHOULD include "@context": "https://schema.org". This enables interoperability with Schema.org vocabulary and JSON-LD processing tools.

3.2. identity Object (REQUIRED)

Provides core identifying and contextual information about the website.

3.3. permissions Object (REQUIRED)

Three sub-objects: read, action, data.

Permission Properties

Each permission is an object with:

Read Permissions (passive)

productCatalog, pricing, availability, openingHours, contactInfo, reviews, faq, companyInfo

Action Permissions (active)

search, addToCart, checkout, createAccount, submitReview, submitContactForm, bookAppointment, cancelOrder, requestRefund

Data Permissions (sensitive)

customerRecords, orderHistory, paymentInfo, internalAnalytics, employeeData
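
A consumer could resolve a single permission roughly as sketched below. The property names (allowed, rateLimit, humanVerification) are taken from the complete example in Section 10. Treating an undeclared permission as denied is an assumption of this sketch, not something this section states, and the function name is hypothetical.

```python
from typing import Any, Dict


def lookup_permission(policy: dict, group: str, name: str) -> Dict[str, Any]:
    """Resolve permissions.<group>.<name>, where group is read, action, or data."""
    entry = policy.get("permissions", {}).get(group, {}).get(name)
    if not isinstance(entry, dict):
        # Assumption: an undeclared permission is treated as not allowed.
        return {"declared": False, "allowed": False,
                "rateLimit": None, "humanVerification": False}
    return {
        "declared": True,
        "allowed": entry.get("allowed") is True,
        "rateLimit": entry.get("rateLimit"),
        "humanVerification": entry.get("humanVerification") is True,
    }


# Usage with a fragment of the Section 10 example:
policy = {
    "permissions": {
        "read": {"productCatalog": {"allowed": True, "rateLimit": 60}},
        "action": {"checkout": {"allowed": True, "humanVerification": True}},
        "data": {"paymentInfo": {"allowed": False}},
    }
}
checkout = lookup_permission(policy, "action", "checkout")
```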

3.5. agentIdentification Object (RECOMMENDED)

3.6. scraping Object (RECOMMENDED)

4. Optional Governance Extensions

4.1. defaults Object

4.2. humanVerification Object

4.3. legal Object

4.4. discovery Object

4.5. metadata Object

5. Enforcement

5.1. Voluntary Compliance

Like robots.txt, A2WF relies primarily on voluntary compliance by reputable AI agents. Major agent vendors are expected to respect published policies as part of responsible AI deployment.

5.2. Technical Enforcement

Website operators MAY enforce policies through HTTP 403 responses, rate limiting, WAF rules, and User-Agent-based blocking.
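
On the server side, the 403 and rate-limiting measures could be combined as in the following sketch. The class and method names are hypothetical. Two assumptions are made: rateLimit is interpreted as requests per minute per agent, and HTTP 429 (Too Many Requests) is chosen here for rate-limit violations, which this specification does not prescribe.

```python
from collections import defaultdict, deque


class PolicyEnforcer:
    """In-memory sketch: maps an agent request to an HTTP status code."""

    WINDOW_SECONDS = 60  # assumption: rateLimit counts requests per minute

    def __init__(self, policy: dict) -> None:
        self.policy = policy
        self.hits = defaultdict(deque)  # (agent, group, name) -> request timestamps

    def status_for(self, agent: str, group: str, name: str, now: float) -> int:
        entry = self.policy.get("permissions", {}).get(group, {}).get(name, {})
        if entry.get("allowed") is not True:
            return 403  # denied or undeclared permission
        window = self.hits[(agent, group, name)]
        # Drop timestamps that have left the sliding window.
        while window and now - window[0] >= self.WINDOW_SECONDS:
            window.popleft()
        limit = entry.get("rateLimit")
        if limit is not None and len(window) >= limit:
            return 429  # over the declared rate limit
        window.append(now)
        return 200
```

A production deployment would more likely apply such rules at the WAF or reverse proxy, with shared state across servers.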

5.3. Legal Enforcement

The legal.termsUrl field enables legal enforcement by linking to machine-readable policies. Courts have established precedent that violating machine-readable access policies can constitute unauthorized access. The EU AI Act (in force since August 2024, with most obligations applying from August 2026) requires transparency and risk management for AI systems.

5.4. Audit and Logging

Website operators SHOULD log agent access patterns and compare them against declared policies.
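
One way to compare logged access against the declared policy is sketched below. The log record shape (agent, permission group, permission name) and the function name are hypothetical; real logs would first need to be mapped from URLs and methods to permission names.

```python
from collections import Counter
from typing import Iterable, Tuple


def audit(policy: dict, access_log: Iterable[Tuple[str, str, str]]) -> Counter:
    """Count logged accesses to permissions that the policy does not allow."""
    violations: Counter = Counter()
    for agent, group, name in access_log:
        entry = policy.get("permissions", {}).get(group, {}).get(name, {})
        # Assumption: undeclared permissions count as not allowed.
        if entry.get("allowed") is not True:
            violations[(agent, f"{group}.{name}")] += 1
    return violations
```

The resulting counts can feed the technical measures of Section 5.2, for example by blocking agents that repeatedly access denied permissions.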

6. Security Considerations

7. Versioning and Extensibility

The specVersion field identifies the specification version. Major versions (2.0, 3.0) MAY introduce breaking changes. Minor updates within v1.x remain backward-compatible. Consumers MUST ignore unrecognized keys.
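
A consumer written against v1.x could gate on the major version as sketched here. The function name is hypothetical, and the sketch assumes specVersion is a "major.minor" string as in the complete example.

```python
def is_supported(spec_version: object, supported_major: int = 1) -> bool:
    """Accept any minor version within the supported major version."""
    major, _, _ = str(spec_version).partition(".")
    # Minor versions within the same major are backward-compatible;
    # a different major version may contain breaking changes.
    return major.isdigit() and int(major) == supported_major
```

With this check, "1.0" and a later "1.3" are both accepted, while "2.0" is rejected until the consumer is updated.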

Future extensions may include: dynamic policy endpoints, signed policies, industry-specific profiles, and agent capability matching.

8. Schema.org Alignment

siteai.json Field        Schema.org Equivalent
@context                 JSON-LD context
identity.@type           schema:WebSite
identity.name            schema:WebSite.name
identity.description     schema:WebSite.description
identity.inLanguage      schema:WebSite.inLanguage
identity.domain          schema:WebSite.url
legal.termsUrl           schema:WebSite.publishingPrinciples
permissions.*            A2WF extension
scraping.*               A2WF extension
agentIdentification.*    A2WF extension
humanVerification.*      A2WF extension

9. File Ecosystem

File                     Purpose                              Since
/robots.txt              Crawl permissions                    1994
/sitemap.xml             URL listing for search engines       2005
/llms.txt                Content guide for LLMs               2024
/.well-known/mcp.json    MCP server discovery                 2024
/siteai.json             AI agent access governance (A2WF)    2025

10. Complete Example

{
  "@context": "https://schema.org",
  "specVersion": "1.0",
  "identity": {
    "@type": "WebSite",
    "domain": "https://www.example-store.com",
    "name": "Example Online Store",
    "description": "Premium widgets and gadgets",
    "purpose": "E-commerce store selling premium widgets to EU consumers.",
    "inLanguage": "en",
    "category": "e-commerce",
    "jurisdiction": "EU",
    "applicableLaw": ["EU AI Act", "GDPR"],
    "contact": "ai-policy@example-store.com"
  },
  "defaults": {
    "agentAccess": "restricted",
    "requireIdentification": true,
    "maxRequestsPerMinute": 30,
    "respectRobotsTxt": true
  },
  "permissions": {
    "read": {
      "productCatalog": { "allowed": true, "rateLimit": 60 },
      "pricing": { "allowed": true },
      "availability": { "allowed": true, "rateLimit": 30 },
      "reviews": { "allowed": true, "rateLimit": 20 },
      "faq": { "allowed": true }
    },
    "action": {
      "search": { "allowed": true, "rateLimit": 20 },
      "addToCart": { "allowed": true },
      "checkout": {
        "allowed": true,
        "humanVerification": true,
        "note": "Final purchase requires human confirmation."
      },
      "createAccount": { "allowed": false },
      "submitReview": { "allowed": false }
    },
    "data": {
      "customerRecords": { "allowed": false },
      "paymentInfo": { "allowed": false },
      "internalAnalytics": { "allowed": false }
    }
  },
  "scraping": {
    "bulkDataExtraction": false,
    "priceMonitoring": false,
    "trainingDataUsage": false
  },
  "agentIdentification": {
    "requireUserAgent": true,
    "requiredFields": ["agentName", "agentOperator"],
    "allowAnonymousAgents": false
  },
  "humanVerification": {
    "methods": ["redirect-to-browser"],
    "requiredFor": ["checkout"]
  },
  "discovery": {
    "robotsTxt": "https://www.example-store.com/robots.txt",
    "llmsTxt": "https://www.example-store.com/llms.txt",
    "schemaOrg": true
  },
  "legal": {
    "termsUrl": "https://www.example-store.com/legal/ai-terms",
    "euAiActCompliance": {
      "transparencyRequired": true,
      "riskClassification": "limited",
      "humanOversightMandatory": false
    }
  },
  "metadata": {
    "author": "Example Store Legal Team",
    "lastUpdated": "2026-03-18"
  }
}

11. References

Full specification, JSON Schema, and examples: github.com/a2wf/spec