Three Representations
We tested three ways to represent the same web page to an AI agent:
Raw HTML: what agents receive today. Full markup with every CSS class, ARIA attribute, and layout wrapper.
Markdown: a cleaned text extraction. It is smaller, but it loses interactive elements, page structure, and action metadata.
SOM (Semantic Object Model): a structured representation that preserves meaning, hierarchy, and interactivity while stripping presentation. This is the format Plasmate produces.
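To make the difference concrete, here is a minimal sketch of one page fragment in all three forms. The fragment itself is invented for illustration, and the structured form is only SOM-like: its field names (`role`, `label`, `action`, `children`) are assumptions, not Plasmate's actual schema.

```python
import json

# Hypothetical fragment, not taken from the benchmark pages.
raw_html = (
    '<div class="card__footer flex items-center px-4">'
    '<button class="btn btn-primary text-sm" type="submit" '
    'aria-label="Add to cart">Add to cart</button>'
    '</div>'
)

# A text extraction keeps the visible words but not the fact that
# they belong to a clickable control.
markdown = "Add to cart"

# Illustrative structured form. Field names are assumed for this sketch
# and are NOT Plasmate's real SOM schema.
som_like = {
    "role": "region",
    "children": [
        {
            "role": "button",        # what the element is
            "label": "Add to cart",  # what it means to a user
            "action": "submit",      # what activating it does
        }
    ],
}

for name, rep in [
    ("raw_html", raw_html),
    ("markdown", markdown),
    ("som_like", json.dumps(som_like)),
]:
    print(f"{name}: {rep}")
```

The point of the sketch is what each form retains, not its size: the structured form keeps the role and action while dropping class names and wrapper markup, and the Markdown form keeps neither.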
Token Results
Structured representations cut input tokens to about a quarter of the raw HTML count, a 4x reduction. This translates directly to cost and latency savings: every token the model doesn't process is money not spent and time not wasted.
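As a rough illustration of how a 4x reduction turns into cost, the arithmetic can be sketched as follows. Only the 4x factor comes from the results above. The page size and the price per million input tokens are hypothetical placeholders, so substitute your own figures.

```python
def input_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of sending `tokens` input tokens at a flat per-million price."""
    return tokens / 1_000_000 * usd_per_million_tokens


REDUCTION_FACTOR = 4            # from the token results above
html_tokens = 40_000            # hypothetical raw-HTML page size
usd_per_million_tokens = 3.0    # hypothetical input price

structured_tokens = html_tokens // REDUCTION_FACTOR

html_cost = input_cost(html_tokens, usd_per_million_tokens)
structured_cost = input_cost(structured_tokens, usd_per_million_tokens)
saved = html_cost - structured_cost

print(f"raw HTML:   {html_tokens} tokens, ${html_cost:.4f} per page")
print(f"structured: {structured_tokens} tokens, ${structured_cost:.4f} per page")
print(f"saved:      ${saved:.4f} per page ({saved / html_cost:.0%})")
```

Because input cost scales linearly with token count under flat pricing, a 4x reduction removes three quarters of the input bill whatever the actual price is.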
Accuracy by Category
The biggest gains are in navigation (+26 points) and interactive element identification (+35 points), precisely the categories where presentation markup creates the most noise. Adversarial resistance also improves markedly, because structured formats strip the deceptive markup that confuses agents.