How to Write JSON-LD Structured Data for AI Search From Scratch
By Digital Strategy Force
JSON-LD structured data is the primary language AI search engines use to resolve entity authority and citation confidence. This guide walks through every layer — from field anatomy to nesting strategy to validation — so you can write markup that gets your content cited, not just indexed.
JSON-LD Anatomy: What Every Field Actually Does
Every morning, hundreds of millions of people fire questions at AI assistants that have never opened a search results page in their lives. The answers those assistants give come from structured signals embedded in web pages — and JSON-LD is the language those signals are written in. This Digital Strategy Force guide breaks down what each field is communicating and why getting it wrong can actively reduce your chances of being cited by an AI model.
JSON-LD stands for JavaScript Object Notation for Linked Data. It sits inside a <script type="application/ld+json"> tag, usually in your page's <head>, completely decoupled from the visual HTML. That separation is architecturally significant: AI crawlers can parse your structured data without rendering your page, which means even a JavaScript-heavy site can communicate entity signals cleanly if the JSON-LD block is correct and present in the initial HTML response. The format is Google's recommended implementation method, as documented in the Google Search Central structured data guide, and its adoption has grown steadily across the web.
Every JSON-LD block has three core parts: @context, @type, and the properties that describe the entity. The @context field tells the parser which vocabulary you are using, nearly always https://schema.org. The @type field specifies what kind of entity this is: Article, Organization, FAQPage, Product, and so on. Every other property is a description of that entity, and the richer those descriptions are, the more clearly the AI model can place your content in its knowledge graph.
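To make the anatomy concrete, here is a minimal sketch in Python that builds one small block and wraps it in its script tag. The date and the helper name `to_script_tag` are assumptions chosen for illustration; only the key names come from Schema.org.

```python
import json

# Illustrative values only; the publication date is an assumed placeholder.
article = {
    "@context": "https://schema.org",  # vocabulary the keys below belong to
    "@type": "Article",                # kind of entity being described
    "headline": "How to Write JSON-LD Structured Data for AI Search From Scratch",
    "datePublished": "2025-01-15",     # ISO 8601 date (assumed)
    "author": {"@type": "Organization", "name": "Digital Strategy Force"},
}


def to_script_tag(data: dict) -> str:
    """Serialize a JSON-LD object and wrap it in the script tag that carries it."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_script_tag(article))
```

Everything after the first two keys is an ordinary property of the entity, so extending the block is a matter of adding more key-value pairs to the dictionary.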
JSON-LD has won the format war decisively. Microdata and RDFa have both declined in adoption, while JSON-LD climbed from 34% of pages in 2022 to 41% in 2024 according to the 2024 Web Almanac by HTTP Archive — the fastest growth rate of any structured data format measured. The implication for anyone writing structured data from scratch is clear: JSON-LD is the dialect AI systems have been trained on, and choosing any other format means speaking a language the machines are forgetting.
Choosing the Right Schema Type
The single most common mistake in structured data implementation is mismatching the schema type to the actual content. Marking a service page as an Article confuses AI parsers that expect article-specific properties like datePublished and author. Marking a FAQ section as a WebPage wastes the opportunity to surface individual question-answer pairs directly in AI responses. Every content type has a schema type that fits it best, and Digital Strategy Force's implementation process always starts with this mapping exercise before a single property is written.
Schema.org defines over 800 types and 1,400 properties, but in practice, fewer than twenty types account for the vast majority of AI-relevant structured data. The question is not which types exist — it is which types correspond to your specific content and what properties within those types carry the most signal weight for AI systems. Choosing the closest match at the leaf level of the type hierarchy always outperforms choosing a broader parent type. Auditing your existing structured data for AI readiness is the prerequisite step before any new implementation.
| Content Type | Recommended @type | Critical Properties | AI Signal Priority |
|---|---|---|---|
| Blog post / editorial | Article | headline, author, datePublished, publisher | Very High |
| FAQ page / FAQ section | FAQPage | mainEntity (Question items), name, acceptedAnswer | Very High |
| Homepage / brand page | Organization | name, url, logo, sameAs, knowsAbout | Very High |
| How-to guide / tutorial | HowTo | name, step, totalTime, tool | High |
| Product / service page | Product / Service | name, description, offers, review | High |
| Person / author bio | Person | name, jobTitle, worksFor, sameAs | High |
| Event | Event | name, startDate, location, organizer | Medium |
Writing Valid JSON-LD From Scratch
Validity is binary in structured data: either the parser can read your JSON without errors, or it cannot. A single misplaced comma, an unclosed brace, or a string value where a URL is expected can silently invalidate an entire block. The good news is that JSON-LD follows strict but learnable syntax rules, and once you understand them, writing correct markup from scratch takes less time than debugging broken markup generated by a plugin.
Every JSON-LD block must be wrapped in a <script type="application/ld+json"> tag. Inside that tag is a JSON object: curly braces containing key-value pairs. Keys and string values must be quoted with double quotes. Arrays use square brackets. Nested objects use additional curly braces. By convention, @context and @type appear first for readability, although JSON object key order carries no meaning for parsers. Trailing commas after the last property in any object or array are not allowed. These constraints are not unique to JSON-LD; they are standard JSON rules, which means any JSON linter can catch syntax errors before you publish.
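Because these are plain JSON rules, a few lines of Python are enough for a first syntax check. The sketch below uses the standard `json` module. The function name `check_jsonld_syntax` is hypothetical, and the check for `@context` and `@type` is a simplification that assumes a single top-level entity rather than an array or an `@graph` container.

```python
import json


def check_jsonld_syntax(raw: str) -> list[str]:
    """Return a list of problems found in a raw JSON-LD string (empty means none found)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    # Simplification: assumes one entity per block, so both keys are expected.
    return [f"missing {key}" for key in ("@context", "@type") if key not in data]


good = '{"@context": "https://schema.org", "@type": "Article", "headline": "Example"}'
# Same block with a trailing comma after the last property.
bad = '{"@context": "https://schema.org", "@type": "Article", "headline": "Example",}'

print(check_jsonld_syntax(good))
print(check_jsonld_syntax(bad))
```

A check like this only confirms that the block parses; whether the properties are the right ones for the type is a separate question that the validators discussed later address.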
"The difference between JSON-LD that gets your content cited and JSON-LD that gets ignored is not complexity — it's precision. AI parsers reward specificity at every property level, not volume of markup."
— Digital Strategy Force, Technical Architecture Division
Code Templates by Page Type
The following templates represent the minimum viable JSON-LD for each major content type. Each property shown is either required for validation or carries significant weight in AI entity resolution. Properties marked as optional but recommended should be included whenever the data is available.
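As one possible shape for such templates, the Python sketch below generates blocks for three of the page types from the table above, using the critical properties listed there. The function names, the example.com URLs, and all sample values are assumptions for illustration, not a canonical set.

```python
import json

CONTEXT = "https://schema.org"


def article_template(headline, author_name, date_published, publisher_name):
    """Blog post or editorial page."""
    return {
        "@context": CONTEXT,
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date string
        "publisher": {"@type": "Organization", "name": publisher_name},
    }


def faq_template(pairs):
    """FAQ page; pairs is a list of (question, answer) tuples copied from the visible text."""
    return {
        "@context": CONTEXT,
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def organization_template(name, url, logo_url, same_as):
    """Homepage or brand page; same_as is a list of external profile URLs."""
    return {
        "@context": CONTEXT,
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo_url,
        "sameAs": list(same_as),
    }


# Hypothetical usage with placeholder values.
faq = faq_template([
    ("Does JSON-LD directly affect AI search rankings?",
     "It helps AI systems parse and trust your content."),
])
org = organization_template(
    "Example Brand",
    "https://www.example.com/",
    "https://www.example.com/logo.png",
    ["https://www.linkedin.com/company/example-brand"],
)
print(json.dumps(faq, indent=2))
print(json.dumps(org, indent=2))
```

Generating blocks from functions rather than hand-editing them keeps the required keys in one place, which makes it harder to publish a block with a property missing.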
Placement and Injection Strategies
Where JSON-LD lives in your HTML matters more than most guides acknowledge. The spec permits placement in either the <head> or the <body>, but AI crawlers operating under tight time budgets consistently parse head-located blocks first. Placing your primary entity schema — particularly Organization and Article types — in the document head ensures they are captured even if the crawler terminates before finishing the body.
For dynamic sites running React, Vue, or Angular, the injection strategy depends on your rendering architecture. Server-side rendered frameworks like Next.js and Nuxt allow you to inject JSON-LD directly into the head at build time or request time, which is ideal. Client-side-only apps must pre-render structured data or inject it via server middleware — JavaScript-executed JSON-LD that arrives after the initial HTML response is frequently missed by AI crawlers. This is one of the core arguments for static site generation in content-heavy contexts, and it is an architectural consideration Digital Strategy Force evaluates during every technical audit.
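The server-side approach can be sketched independently of any framework: serialize the block on the server and place it in the head of the HTML before the response is sent. The Python below is a minimal illustration under that assumption. The function names are hypothetical, and replacing `<` with its `\u003c` escape is a common precaution so that a string value containing `</script>` cannot close the tag early.

```python
import json
import re

HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)


def jsonld_script(data: dict) -> str:
    """Serialize a block for embedding in HTML; \\u003c is a valid JSON escape for '<'."""
    payload = json.dumps(data, ensure_ascii=False).replace("<", "\\u003c")
    return f'<script type="application/ld+json">{payload}</script>'


def inject_into_head(html: str, data: dict) -> str:
    """Insert the JSON-LD script just before the first closing head tag."""
    match = HEAD_CLOSE.search(html)
    if match is None:
        raise ValueError("no closing </head> tag found in the HTML")
    return html[:match.start()] + jsonld_script(data) + html[match.start():]


# Hypothetical page shell and block.
page = "<html><head><title>Guide</title></head><body><h1>Guide</h1></body></html>"
block = {"@context": "https://schema.org", "@type": "Organization", "name": "Example Brand"}
print(inject_into_head(page, block))
```

Because the script is part of the HTML string itself, a crawler that never executes JavaScript still receives it with the first response.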
Multiple JSON-LD blocks on a single page are fully valid. A blog post might carry an Article block, a FAQPage block for the Q&A section, and a BreadcrumbList block for navigation. Each block is evaluated independently. The only constraint is consistency: properties in your JSON-LD must match the visible content on the page. AI systems cross-reference structured claims against page content, and discrepancies are flagged as trust signals against the source.
Placement priority:
- 1. Document <head>: always first
- 2. Before closing </body>: acceptable
- 3. Inline at content section: avoid
- ✗ Injected after DOM load: critical fail

Rendering method:
- ✓ Static HTML: optimal
- ✓ SSR (Next.js, Nuxt): optimal
- ~ WordPress plugins: review output
- ✗ Client-side SPA only: rebuild required

Blocks per page:
- ✓ 1–3 blocks: standard range
- ✓ 4–5 blocks: fine with purpose
- ~ 6–8 blocks: review for conflicts
- ✗ Duplicate blocks that describe the same entity with conflicting values: avoid
Entity Nesting and Graph Construction
Flat JSON-LD tells AI systems what something is called. Nested JSON-LD tells them who made it, when, where it exists, what it relates to, and how confident the publisher is in that information. The difference between a flat Article block and a properly nested one with embedded Person and Organization entities is the difference between an anonymous data point and a citable source with verified provenance.
Nesting works by placing a complete entity object as the value of a property rather than a string. The author property of an Article, for example, can take a string ("John Smith") or a full Person object with name, url, jobTitle, and sameAs links. The nested form enables AI systems to build a graph edge between your article and an identifiable human entity, which dramatically increases the model's confidence in attributing the content to a credible source. This is the mechanism behind Entity Salience Engineering: How to Make AI Models Prioritize Your Brand — the practice of systematically building graph edges that AI models can traverse.
The @id property is particularly powerful for graph construction. When you assign a unique URL identifier to an entity — such as using your homepage URL as the @id for your Organization — other JSON-LD blocks across your site can reference that same @id instead of rewriting the full object. This creates a web of internally consistent entity references that AI parsers can map as a coherent knowledge graph rather than a collection of isolated data points.
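The nesting and `@id` mechanics described above can be sketched as follows. All names and example.com URLs are hypothetical. The sketch defines the Organization once, using the homepage URL as its `@id`, references it from a Person and an Article, and then checks that every reference resolves to a node defined in the same `@graph`.

```python
import json

ORG_ID = "https://www.example.com/"  # hypothetical homepage URL used as the Organization @id
AUTHOR_ID = "https://www.example.com/authors/jane-doe"  # hypothetical author page

organization = {"@type": "Organization", "@id": ORG_ID, "name": "Example Brand", "url": ORG_ID}
author = {"@type": "Person", "@id": AUTHOR_ID, "name": "Jane Doe", "worksFor": {"@id": ORG_ID}}
article = {
    "@type": "Article",
    "headline": "Example headline",
    "author": {"@id": AUTHOR_ID},    # reference by @id instead of repeating the Person
    "publisher": {"@id": ORG_ID},    # same Organization node as worksFor above
}
graph = {"@context": "https://schema.org", "@graph": [organization, author, article]}


def defined_ids(doc):
    """Ids of nodes that carry real properties, not just a bare reference."""
    return {node["@id"] for node in doc["@graph"] if "@id" in node and len(node) > 1}


def referenced_ids(value):
    """Recursively collect ids from reference-only objects such as {"@id": "..."}."""
    found = set()
    if isinstance(value, dict):
        if set(value) == {"@id"}:
            found.add(value["@id"])
        else:
            for child in value.values():
                found |= referenced_ids(child)
    elif isinstance(value, list):
        for child in value:
            found |= referenced_ids(child)
    return found


dangling = referenced_ids(graph) - defined_ids(graph)
print(json.dumps(graph, indent=2))
print("dangling references:", sorted(dangling))
```

The check only looks within one document. When a block on another page references the Organization `@id`, the full definition is expected to live on the page that identifier points to.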
| Property | Flat (string) | Nested (object) | AI Graph Benefit |
|---|---|---|---|
| author | "John Smith" | Person { name, url, sameAs } | Connects article to named entity node |
| publisher | "Brand Name" | Organization { name, url, logo } | Establishes brand provenance for content |
| image | "https://…/img.jpg" | ImageObject { url, width, height } | Enables rich snippet image display |
| offers | "$99/mo" | Offer { price, priceCurrency, availability } | Machine-readable pricing for AI summaries |
| address | "123 Main St, City" | PostalAddress { streetAddress, addressLocality, addressCountry } | Local entity resolution for maps/voice |
Validation, Debugging, and Testing
Publishing JSON-LD without validating it first is equivalent to deploying code without running tests. Errors that look minor in isolation — a misquoted URL, a missing required property, a type value that Schema.org does not recognize — produce silent failures that result in your structured data being ignored entirely by AI crawlers. Validation is not optional. It is the last gate before any JSON-LD block goes live.
Google's Rich Results Test is the primary validation tool. Paste your URL or raw JSON-LD code and it will identify syntax errors, missing required properties, and type mismatches. For properties not yet in Google's documentation, the Schema.org Markup Validator provides the authoritative check against the full vocabulary. Run both tools on every implementation, because each checks slightly different aspects of validity.
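Before reaching for the online validators, a local pre-check can catch the most basic problems. The Python sketch below is an assumption-laden illustration, not a substitute for either tool: it pulls JSON-LD blocks out of an HTML string with the standard library parser and reports which of the critical properties from the schema type table are missing. The class name, function name, and `CRITICAL` map are hypothetical.

```python
import json
from html.parser import HTMLParser

# Subset of the "critical properties" table; extend as needed.
CRITICAL = {
    "Article": ["headline", "author", "datePublished", "publisher"],
    "Organization": ["name", "url", "logo", "sameAs"],
}


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every application/ld+json script in a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def audit_page(html: str) -> list[tuple[str, list[str]]]:
    """Return one (type, issues) pair per JSON-LD block found in the HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    findings = []
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            findings.append(("unparseable", [exc.msg]))
            continue
        kind = data.get("@type") if isinstance(data, dict) else None
        if not isinstance(kind, str):
            findings.append(("unknown", ["no single string @type at top level"]))
            continue
        missing = [prop for prop in CRITICAL.get(kind, []) if prop not in data]
        findings.append((kind, missing))
    return findings


sample_html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Example", "author": "Jane Doe", "datePublished": "2025-01-15"}
</script>
</head><body></body></html>"""

print(audit_page(sample_html))
```

A block that passes this check can still fail the Rich Results Test, which applies Google's own requirements per feature, so treat it as a filter for obvious mistakes only.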
Beyond syntax validation, semantic testing matters as much. Take your published page URL and directly query AI assistants (ChatGPT with browsing enabled, Perplexity, Google's AI Overviews) with questions your content should answer. If the model cites your page, that is a sign your entity signals are being picked up. If it cites competitors for questions you've answered in depth, your entity signals need strengthening. This manual spot-test cycle is part of Digital Strategy Force's ongoing monitoring protocol for every client's content portfolio.
Advanced Signals for AI Citation
Once your JSON-LD is syntactically valid and semantically complete, the next layer is building the external corroboration signals that AI models use to resolve authority. Structured data alone does not make a source citable; it makes it parseable. Citability comes from the convergence of parseable structure, content depth, and cross-web entity consistency. The Google Search Central documentation reports that structured data implementations have driven measurable engagement lifts: the Food Network saw a 35% increase in visits after converting 80% of its pages to enable search features, and Rakuten found users spent 1.5 times longer on pages that carried structured data.
The sameAs property is the highest-value signal in the Organization schema for AI systems. It creates verifiable links between your JSON-LD entity and external identity nodes — Wikipedia, Wikidata, LinkedIn, Crunchbase, and industry directories. When an AI model encounters the same organization name across multiple authoritative sources that all point back to the same sameAs URL, the confidence score for that entity climbs significantly. Building and maintaining these cross-platform connections is a core practice in Cross-Platform Entity Consistency: Unifying Your Brand Across AI Models.
The speakable property marks sections of your content as optimized for voice-based AI delivery. This is increasingly relevant as voice assistants and voice-activated AI search grow. When you mark a concise, answer-formatted paragraph with a SpeakableSpecification, you are explicitly telling AI systems that this passage is pre-formatted for spoken delivery — clear, complete in one to three sentences, and accurate without visual context. It is one of the highest-leverage additions to content that already has strong FAQPage schema, and its implementation is detailed in depth in the guide on implementing speakable schema for voice-activated AI.
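As a rough illustration, the Python sketch below attaches a SpeakableSpecification to a WebPage block and applies a naive length check to the passage it points at. The CSS selector, URL, summary text, and helper names are assumptions, and the sentence counter is a simple heuristic rather than real sentence segmentation.

```python
import json
import re

# Hypothetical page block; the selector must match an element in your own markup.
page = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "name": "How to Write JSON-LD Structured Data for AI Search From Scratch",
    "url": "https://www.example.com/jsonld-guide",
    "speakable": {
        "@type": "SpeakableSpecification",
        "cssSelector": [".answer-summary"],
    },
}


def sentence_count(text: str) -> int:
    """Naive count: split on whitespace that follows '.', '!' or '?'."""
    parts = [part for part in re.split(r"(?<=[.!?])\s+", text.strip()) if part]
    return len(parts)


def is_speakable_length(text: str) -> bool:
    """True when the passage is one to three sentences long."""
    return 1 <= sentence_count(text) <= 3


summary = "JSON-LD is a structured data format. It lives in a script tag."
print(json.dumps(page, indent=2))
print(sentence_count(summary), is_speakable_length(summary))
```

Keeping the check next to the block makes it easy to flag passages that have grown too long for spoken delivery whenever the page copy changes.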
Frequently Asked Questions
Does JSON-LD directly affect AI search rankings?
JSON-LD does not influence traditional search rankings in the way keywords and backlinks do. What it does is help AI systems parse and trust your content. When an AI model resolves entity information, pages with correctly implemented structured data are consistently resolved with higher confidence, which increases their probability of being cited in AI-generated responses. The indirect citation effect is measurable and significant.
How many schema types should a single page use?
There is no hard cap, but every block should map to real content on the page. A typical blog post might carry Article, BreadcrumbList, and FAQPage — three blocks, each serving a distinct purpose. Adding blocks that do not correspond to visible content creates the kind of discrepancy that AI parsers flag as a trust issue. Quality and accuracy matter far more than quantity.
What happens if my JSON-LD has errors?
Minor errors — like an unrecognized optional property — typically cause that property to be ignored while the rest of the block is parsed normally. Critical errors — like a syntax-breaking missing brace or an invalid @type value — cause the entire block to be discarded. Google Search Console's Enhancement reports surface these errors after indexing, but catching them with the Rich Results Test before publishing avoids the indexing delay entirely.
Should I use structured data on every page of my site?
At minimum, every page should carry a WebPage or WebSite type. Beyond that baseline, the answer is yes with intention: every page that has a content type with a clear schema match — articles, FAQs, products, how-tos, events — should be marked up. Utility pages, login pages, and admin pages do not need structured data because they contain no entity information worth communicating to AI systems.
Is JSON-LD definitively better than Microdata or RDFa?
For new implementations, yes. Google explicitly recommends JSON-LD in all its structured data documentation, and its separation from HTML makes it easier to maintain, update, and debug without risking accidental changes to visible content. Microdata requires embedding attributes directly in HTML tags, which creates tight coupling between content and markup. RDFa has similar coupling problems and more complex attribute syntax. For sites already running Microdata, migration to JSON-LD on a rolling basis is the recommended path.
How often should JSON-LD blocks be reviewed and updated?
Every time the underlying page content changes significantly, the JSON-LD should be reviewed. The dateModified property should reflect actual updates rather than being set once at publication and forgotten. Beyond content-triggered reviews, a quarterly audit of all structured data across the site catches deprecated properties (Schema.org supersedes and retires terms over time) and aligns your markup with any changes in AI platform preferences. Digital Strategy Force runs this audit as a standing component of every AEO engagement.
Next Steps
- ▶ Run Google's Rich Results Test on your five highest-traffic pages and document every error and warning before writing a single new line of markup.
- ▶ Add a nested Person or Organization object to every Article block that currently uses a flat string for author or publisher, then re-validate.
- ▶ Build a site-wide Organization block with at least three sameAs links to external identity sources (LinkedIn, Crunchbase, or industry directories) and deploy it on every page via a shared template.
- ▶ Convert each FAQ section of your cornerstone articles to a proper FAQPage block with complete Question and acceptedAnswer pairs that mirror the visible question text exactly.
- ▶ After deploying each new block, spot-test within 72 hours by querying Perplexity and ChatGPT with questions your page answers, and document citation rate changes to build an internal performance baseline.
Need a complete structured data strategy built for AI-first search visibility, not just technical compliance? Explore Digital Strategy Force's Answer Engine Optimization services and see how full-spectrum AEO — from JSON-LD architecture to entity graph construction — translates into measurable citation growth.
