SEO is often considered the snake oil of the web. How many times have you scrolled through attention-grabbing headlines on know how to improve your SEO? Everyone and their uncle seems to have some “magic” cure to land high in search results and turn impressions into conversions. Sifting through so much noise on the topic can cause us to miss true gems that might be right under our nose.
We’re going to look at one such gem in this article: structured data.
There’s a checklist of SEO must-haves that we know are needed when working on a site. It includes things like a strong <title>
, a long list of <meta>
tags, and using descriptive alt
attributes on images (which is a double win for accessibility). Running a cursory check on any site using Lighthouse will flag up turn up even more tips and best practices to squeeze the most SEO out of the content.
Search engines are getting smarter, however, and starting to move past the algorithmic scraping techniques of yesteryear. Google, Amazon, and Microsoft are all known to be investing a considerable amount in machine learning, and with that, they need clean data to feed their search AI.
That’s where the the concept of schemas comes into play. In fact, it’s funding from Google and Microsoft — along with Yahoo and Yandex — that led to the establishment of schema.org, a website and community to push their format — more commonly referred to as structured data —forward so that they and other search engines can help surface content in more useful and engaging ways.
So, what is structured data?
Structured data describes the content of digital documents (i.e. websites, emails, etc). It’s used all over the web and, much like <meta
> tags, is an invisible layer of information that search engines use to read the content.
Structured data comes in three flavors: Microdata, RDFa and JSON-LD. Microdata and RDF are both injected directly into the HTML elements of a document, peppering each relevant element of a page with machine readable pointers. For example, an example of using Microdata attributes on a product, taken straight from the schema.org docs:
<div itemscope itemtype="http://schema.org/Product">
<span itemprop="name">Kenmore White 17" Microwave</span>
<img itemprop="image" src="kenmore-microwave-17in.jpg" alt='Kenmore 17" Microwave' />
<div itemprop="aggregateRating"
itemscope itemtype="http://schema.org/AggregateRating">
Rated <span itemprop="ratingValue">3.5</span>/5
based on <span itemprop="reviewCount">11</span> customer reviews
</div>
<div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
<!--price is 1000, a number, with locale-specific thousands separator
and decimal mark, and the $ character is marked up with the
machine-readable code "USD" -->
<span itemprop="priceCurrency" content="USD">$</span><span
itemprop="price" content="1000.00">1,000.00</span>
<link itemprop="availability" href="http://schema.org/InStock" />In stock
</div>
Product description:
<span itemprop="description">0.7 cubic feet countertop microwave.
Has six preset cooking categories and convenience features like
Add-A-Minute and Child Lock.</span>
Customer reviews:
<div itemprop="review" itemscope itemtype="http://schema.org/Review">
<span itemprop="name">Not a happy camper</span> -
by <span itemprop="author">Ellie</span>,
<meta itemprop="datePublished" content="2011-04-01">April 1, 2011
<div itemprop="reviewRating" itemscope itemtype="http://schema.org/Rating">
<meta itemprop="worstRating" content = "1">
<span itemprop="ratingValue">1</span>/
<span itemprop="bestRating">5</span>stars
</div>
<span itemprop="description">The lamp burned out and now I have to replace
it. </span>
</div>
<div itemprop="review" itemscope itemtype="http://schema.org/Review">
<span itemprop="name">Value purchase</span> -
by <span itemprop="author">Lucas</span>,
<meta itemprop="datePublished" content="2011-03-25">March 25, 2011
<div itemprop="reviewRating" itemscope itemtype="http://schema.org/Rating">
<meta itemprop="worstRating" content = "1"/>
<span itemprop="ratingValue">4</span>/
<span itemprop="bestRating">5</span>stars
</div>
<span itemprop="description">Great microwave for the price. It is small and
fits in my apartment.</span>
</div>
<!-- etc. -->
</div>
If that seems like bloated markup, it kinda is. But it’s certainly beneficial if you prefer to consolidate all of your data in one place.
JSON-LD, on the other hand, usually sits in a <script>
tag and describes the same properties in a single block of data. Again, from the docs:
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Product",
"aggregateRating": {
"@type": "AggregateRating",
"ratingValue": "3.5",
"reviewCount": "11"
},
"description": "0.7 cubic feet countertop microwave. Has six preset cooking categories and convenience features like Add-A-Minute and Child Lock.",
"name": "Kenmore White 17\" Microwave",
"image": "kenmore-microwave-17in.jpg",
"offers": {
"@type": "Offer",
"availability": "http://schema.org/InStock",
"price": "55.00",
"priceCurrency": "USD"
},
"review": [
{
"@type": "Review",
"author": "Ellie",
"datePublished": "2011-04-01",
"description": "The lamp burned out and now I have to replace it.",
"name": "Not a happy camper",
"reviewRating": {
"@type": "Rating",
"bestRating": "5",
"ratingValue": "1",
"worstRating": "1"
}
},
{
"@type": "Review",
"author": "Lucas",
"datePublished": "2011-03-25",
"description": "Great microwave for the price. It is small and fits in my apartment.",
"name": "Value purchase",
"reviewRating": {
"@type": "Rating",
"bestRating": "5",
"ratingValue": "4",
"worstRating": "1"
}
}
]
}
</script>
This is my personal preference, as it is treated as a little external instruction manual for your content, much like JavaScript for scripts, and CSS for your styles, all happily self-contained. JSON-LD can become essential for certain types of schema, where the content of the page is different from the content of the structured data (for example, check out the speakable
property, currently in beta).
A welcome introduction to the implementation of JSON-LD on the web is Google’s allowance of fetching structured data from an external source, rather than forcing inline scripting, which was previously frustratingly impossible. This can be done either by the developer, or in Google Tag Manager.
What structured data means to you
Beyond making life easier for search engine crawlers to read your pages? Two words: Rich snippets. Rich snippets are highly visual modules that tend to sit at the top of the search engine, in what is sometimes termed as “Position 0” in the results — displayed above the first search result. Here’s an example of a simple search for “blueberry pie” in Google as an example:
Even the first result is a rich snippet! As you can see, using structured data is your ticket to get into a rich snippet on a search results page. And, not to spur FOMO or anything, but any site not showing up in a rich snippet is already at risk of dropping into “below the fold” territory. Notice how the second organic result barely makes the cut.
Fear not, dear developers! Adding and testing structured data to a website is aq simple and relatively painless process. Once you get the hang of it, you’ll be adding it to every possible location you can imagine, even emails.
It is worth noting that structured data is not the only way to get into rich snippets. Search engines can sometimes determine enough from your HTML to display some snippets, but utilizing it will push the odds in your favor. Plus, using structured data puts the power of how your content is displayed in your hands, rather than letting Google or the like determine it for you.
Types of structured data
Structured data is more than recipes. Here’s a full list of the types of structured data Google supports. (Spoiler alert: it’s almost any kind of content.)
- Article
- Book (limited support)
- Breadcrumb
- Carousel
- Course
- COVID-19 announcements (beta)
- Critic review (limited support)
- Dataset
- Employer aggregate rating
- Estimated salary
- Event
- Fact check
- FAQ
- How-to
- Image license metadata (beta)
- Job posting
- Local business
- Logo
- Movie
- Product
- Q&A
- Recipe
- Review snippet
- Sitelinks searchbox
- Software app
- Speakable (beta)
- Subscription and paywalled content
- Video
Yep, lots of options here! But with those come lots of opportunity to enhance a site’s content and leverage these search engine features.
Using structured data
The easiest way to find the right structured data for your project is to look through Google’s search catalogue. Advanced users may like to browse what’s on schema.org, but I’ll warn you that it is a scary rabbit hole to crawl through.
Let’s start with a fairly simple example: the Logo
logo data type. It’s simple because all we really need is a website URL and the source URL for an image, along with some basic details to help search engine’s know they are looking at a logo. Here’s our JSON-LD:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "Example",
"url": "http://www.example.com",
"logo": "http://www.example.com/images/logo.png"
}
</script>
First off, we have the <script> tag itself, telling search engines that it’s about to consume some JSON-LD.
From there, we have five properties:
@context
: This is included on all structured data objects, no matter what type it is. It’s what tells search engines that the JSON-LD contains data that is defined by schema.org specifications.@type
: This is the reference type for the object. It’s used to identify what type of content we’re working with. In this case, it’s “Organization” which has a whole bunch of sub-properties that follow.name
: This is the sub-property that contains the organization’s name.url
: This is the sub-property that contains the organization’s web address.logo
: This is the sub-property that contains the file path for the organization’s logo image file. For Google to consider this, it must be at least 112⨉112px and in JPG, PNG, or GIF format. Sorry, no SVG at the moment.
A page can have multiple structured data types. That means it’s possible to mix and match content.
Testing structured data
See, dropping structured data into a page isn’t that tough, right? Once we have it, though, we should probably check to see if it actually works.
Google, Bing, and Yandex (login required) all have testing tools available. Google even has one specifically for validating structured data in email. In most cases, simply drop in the website URL and the tool will spin up a test and show which object it recognizes, the properties it sees, and any errors or warning to look into.
The next step is to confirm that the structured data is accessible on your live site through Google Search Console. You may need to set up an account and verify your site in order to use a particular search engine’s console, but checking data is — yet again — as simple as dropping in a site URL and using the inspection tools to check that the site is indeed live and sending data when it is accessed by the search engine.
If the structured data is implemented correctly, it will display. In Google’s case, it’s located in the “Enhancements” section with a big ol’ checkmark next to it.
But wait! I did all that and nothing’s happening… what gives?
As with all search engine optimizations, there are no guarantees or time scales, when it comes to how or when structured data is used. It might take a good while before rich snippets take hold for your content — days, weeks, or even months! I know, it stinks to be left in the dark like that. It is unfortunately a waiting game.
Hopefully, this gives you a good idea of what structured data is and how it can be used to leverage features that search engines have made to spotlight content has it.
There’s absolutely no shortage of advice, tips, and tricks for helping optimize a site for search engines. While so much of it is concerned with what’s contained in the <head> or how content is written, there are practical things that developers can do to make an impact. Structured data is definitely one of those things and worth exploring to get the most value from content.
The world is your oyster with structured data. And, sure, while search engine only support a selection of the schema.org vocabulary, they are constantly evolving and extending that support. Why not start small by adding structured data to an email link in a newsletter? Or perhaps you’re into trying something different, like defining a sitelinks search box (which is very meta but very cool). Or, hey, add a recipe for Pinterest. Blueberry pie, anyone?