The following information covers how search crawlers index websites, and which on-site factors those crawlers look for when indexing your website.
I would think of this as the closest form of SEO to web development, as many of these changes and checks need to be performed on the server side, by dev teams.
The following Google Sheet includes a list of checks that should be performed for website performance benchmarks and audits. Please use at will!
The following headers each correspond to a category of checks found in the document above. Each category speaks to a broad set of checks depending on need.
This category is more inclined to favor website crawlers, but one key understanding is worth stating here: if you’re optimizing for website crawlers, you are also optimizing for humans – after all, the robots lead us to the watering holes, as it were.
Specifically, this section refers to server-side optimizations and files (and processes) which can make your website “more discoverable” by website indexation crawlers (like the one Google uses, or the one Bing uses, or the one… you get it).
When you create a website (and you probably will), you might be disappointed with how slowly it can move to gain traffic, even when the content of the site is well crafted, well designed, and proper according to webmaster guidelines.
The “Findability” section of this report shows strong methods for overcoming this slow spin-up: sitemap management and declaration, robots.txt file inclusions, sitemap and robots submissions to indexers, etc.
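As a concrete sketch (example.com is a placeholder domain), a minimal robots.txt served at the root of your domain can both set crawl rules and declare your sitemap – the Sitemap directive is how most crawlers discover it without a manual submission:

```txt
# robots.txt – served at https://example.com/robots.txt (placeholder domain)
User-agent: *
Disallow: /admin/    # keep private sections out of the index
Allow: /

# Declare the sitemap so crawlers can find it on their own
Sitemap: https://example.com/sitemap.xml
```

Submitting that same sitemap URL directly in Google Search Console and Bing Webmaster Tools speeds up the initial discovery even further.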
Duplication is a bit of a doozy, and it will ALWAYS sneak up on you. This is an issue that is easy to avoid on smaller domains, but much harder for eCommerce-focused businesses or large enterprise websites where content may overlap messily (or worse – intentionally).
Again – this will be solved on the server side mostly, with a few “on-page” settings worth mentioning.
When you have duplicate content (whether duplicated from other pages on your own website, or copy/pasted content from somewhere else), indexers will determine which one is the “original” via metadata and posting date/time. Once the original is identified, the other will be deprecated (sometimes even dropped from indexation).
As a result, your entire domain may incur penalties based on duplication of third party content, or your main page (which was duplicated, whoops!) might be deprioritized in favor of a different page on your website with far less value.
It is up to you (or your webmaster) to identify these issues, correct them, and resubmit for consideration by Google. I know everyone is excited about AI and the “revolutionary change” it will bring, but this needs to be handled by a human with value assessment skills, or by a complex set of rules for an AI (which… costs quite a lot of money for development).
The duplication section of this report shows clear and simple ways of avoiding duplicate content, especially using “canonicals”, which allow you to manually demarcate which pages are the “OG” on your domains!
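For illustration (the URLs are placeholders), a canonical is just a link tag in the `<head>` of the duplicate page pointing at the version you want indexed as the original:

```html
<!-- On https://example.com/shoes?sort=price (a duplicate/variant page) -->
<head>
  <!-- Tell crawlers the "OG" version of this content lives here -->
  <link rel="canonical" href="https://example.com/shoes" />
</head>
```

Self-referencing canonicals (where a page points at itself) are also a safe default, since they preempt URL-parameter duplicates you didn’t anticipate.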
This is a kick in the shins for most of us, because it calls us all back to our research writing classes from a time when research was a dirty word (in school).
Information Architecture and Website Architecture are two sides of the same coin, in that they both pay into the claw machine that is Google. Utilizing both sides just gives you more leverage in your pocket (like running the same quarter twice).
So then, what exactly are we talking about here?
Website Architecture is the planned structure of a website and its interlinked pages, in which the internal links between pages create reinforcement, and pages create pillars to support the overall structure.
Imagine building a house – it would be a really unfortunate decision to put your front door in such a place that causes your guests to walk through the only bathroom!
This is easiest to plan in a visual way, with lines drawn to connect directories and pages as blocks. We also like the “tree view”, in which each page is just a labelled dot (better on computers, hard to draw freehand).
The more planning up front, the less revision down the road when “scalability” becomes the word of the day.
Information Architecture refers to a similar structure, but within each individual page. This is what calls us back to research writing – this always ends up looking like a research paper outline in which you create headings and subheadings, and then flesh out your argument with information.
As an example, leverage keywords in your H1, H2, H3, H4, and as a part of image names, alt text, etc. Each external link to a learning resource is a “source citation” (no MLA here!).
Your page outlines will start to look like this:
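As a hypothetical sketch (the headings and topic are invented), that outline rendered in HTML headings looks like a research paper outline:

```html
<h1>Technical SEO: The Complete Guide</h1>    <!-- one H1: the thesis -->
  <h2>Findability</h2>                         <!-- major argument -->
    <h3>Sitemaps</h3>                          <!-- supporting point -->
    <h3>Robots.txt</h3>
  <h2>Duplication</h2>
    <h3>Canonical Tags</h3>
<!-- images carry descriptive filenames and alt text as "evidence" -->
<img src="sitemap-diagram.png" alt="Diagram of a sitemap tree structure" />
```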
This structure allows Google (and other website crawlers) to make sense of complex information on a page, in the same way that research paper formatting calls attention to central points and arguments.
This is a short check – and it seems stupidly simple. You’d be amazed how few people consider this “helpful”, even when they have an obvious problem that this technique would solve.
Be. Specific.
When you create a page (or heck, when you buy a domain!) you should be intending to use that page or site for a specific purpose.
So – use the domain and the URLs that you create to help reinforce your topical relevance!
Example: I am currently writing on a domain with “marketing” in the address, and the slug includes my topic of conversation – “technical SEO”. No words that are heavily unrelated to our topic here!
This will be separated into two categories, as internal and external links serve (slightly) different end goals – and it’s important to know the distinction.
Broadly – linking is what connects different pages on the internet – without using both internal links and external links, your website performance will struggle.
I could split this up again, but let’s not. Basically there are two forms of links on a page (or on a domain, still true):
Inbound External Links are what SEOs call “Backlinks” – these are the lifeblood of search engine activity for SEOs and index sites like Google.
Inbound External Links (Backlinks) provide the end-page (destination) with a trust signal – it means someone out there on the internet said “this is the resource I trust for this information”. Google likes this a lot for domain and page authority scoring.
Outbound External Links are links which direct people off of your domain to authorities on whatever topic you may be discussing (Google for Search, Yahoo Finance for Investing, etc). These would be good for building your website’s authority with citations for research or conversation.
Outbound External Links (Citations) provide the external end-page (destination) with a trust signal, but it also reminds Google that you use authoritative sources during your production of content and website development. This is an authority signal for your domain, and the domain you link to.
I should mention a couple of things that can change how this works;
These link attributes (or page attributes) tell crawlers not to follow links, or not to pass along positive ranking signals from your outbound external links.
For many years (the last 10 or so, following Google’s link-farm era), the “best practice” commonly shared was to use ‘nofollow’ on every link on the site, regardless of destination.
Google has explicitly stated a different use case – apply ‘nofollow’ only to links you can’t vouch for (with ‘sponsored’ and ‘ugc’ for paid placements and user-generated content), and otherwise let links pass their signals unaltered.
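In markup, these are simply `rel` attributes on the anchor tag (the URLs below are placeholders):

```html
<!-- Default: a normal link passes ranking signals – use for sources you vouch for -->
<a href="https://example.org/research">Trusted source</a>

<!-- nofollow: don't pass ranking signals – use for links you can't vouch for -->
<a href="https://example.com/unvetted" rel="nofollow">Unvetted link</a>

<!-- sponsored / ugc: paid placements and user-generated content -->
<a href="https://example.com/ad" rel="sponsored">Paid placement</a>
<a href="https://example.com/forum-post" rel="ugc">User comment link</a>
```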
This is largely due to a shift in Google’s priority – we have enough content on the internet – we need to start prioritizing well created, well sourced, and authoritative content.
This all relates back to the rollout of natural language processing (BERT) and the E-E-A-T model for content production, which Google has been pushing hard for since the Mobile First Indexing implementation in 2019.
Internal Linking works a bit differently, but the broad strokes are the same – there are outbound and inbound links. For Internal links, both inbound and outbound will be from your domain (i.e. you’re linking from your homepage to your services page).
Internal Linking provides a roadmap for website crawlers about page content priorities on the site – if one page has 1/50th of the links that another page has, Google will treat the page with fewer links as less important, because the concentration of internal links is literally how you communicate the intended value of the pages on your website.
The largest point here is; make sure any page that is included in your Top Navigation menu is actually deserving of that importance within your business model. And if it is, make sure your dropdowns and top navigation structure provides the easiest possible navigation experience.
This is a relatively new shift (since 2020) away from heavily technical Page Speed reporting. We used to dig through massive reports about code priority and page timings – this has since been simplified into something called Core Web Vitals.
Core Web Vitals is a scoring rubric which Google has implemented for measuring how well your website performs for people – things like:

- Largest Contentful Paint (LCP) – how quickly the main content renders
- Interaction to Next Paint (INP, which replaced First Input Delay) – how quickly the page responds to input
- Cumulative Layout Shift (CLS) – how much the layout jumps around while loading
These metrics refer to aspects of visual design and interactive design which allow Google to make general assessments about how pleased website visitors will be with your website’s build. To be clear – Google measures this based on actual user interactions – testing is never 1:1.
Broadly, this comes down to a content delivery network (CDN), for which there are many third party providers with excellent track records.
My personal favorite is NitroPack (used on this site) because they are actively in partnership with Google over cutting-edge Core Web Vitals automated optimizers. Others would be Hummingbird, WP Rocket, WP Smush, etc.
This is the bread and butter activity for most SEOs on the planet Earth, as it correlates directly to some of the things we’ve already discussed, and provides the practical “application” of some of these concepts.
On-page SEO includes the following subcategories:
This is the practical application category, in which knowledge of linking and architecture become handy!
“Should a link be included? Should my heading be X or Y? Do I need images or video to illustrate my point more clearly?”
There is a lot of good content out on the web to help learn about On-Page SEO, and ranking factors for content generation and planning – we recommend the AHREFs course.
Page Title (60 Characters including spaces)
Meta Descriptions (155 characters with spaces)
Headers (make sure H1 exists)
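Put together in markup (the title and description text below are invented placeholders), those three checks look like this:

```html
<head>
  <!-- Page title: aim for ~60 characters including spaces -->
  <title>Technical SEO Checklist: Findability &amp; Duplication</title>
  <!-- Meta description: aim for ~155 characters with spaces -->
  <meta name="description"
        content="A practical technical SEO checklist covering sitemaps, robots.txt, canonicals, and site architecture for developers and webmasters." />
</head>
<body>
  <!-- Make sure exactly one H1 exists on the page -->
  <h1>Technical SEO Checklist</h1>
</body>
```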
There are three main formats: JSON-LD, Microdata, and RDFa.
Of those three types, Google has explicitly stated that they prefer websites to deploy JSON-LD instead of the other two formats.
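For illustration, here is a minimal JSON-LD block for an article (the names, dates, and URLs are placeholders), dropped into the page inside a script tag:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Technical SEO Checklist",
  "author": { "@type": "Person", "name": "Jane Doe" },
  "datePublished": "2023-01-15",
  "image": "https://example.com/images/technical-seo.png"
}
</script>
```

Because JSON-LD lives in its own script block, it doesn’t tangle with your visible HTML the way Microdata and RDFa attributes do – one reason it’s the preferred format.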
This is related to something that I call Entity Management – Google and other Tech companies have been working together to create and standardize data markup for web indexation.
Rather than relying purely on links and keywords (which can be gamed as we’ve seen), the future of the internet looks like Wikipedia – one unified database (in terms of structure for ease of access).
Entity management is a big topic – but suffice it to say that it’s here to stay, and it provides most of the “easy” search information that people access in their day-to-day lives!
More information about schema markup, data unification, and standardization is available on the schema.org website.
This is where keywords will still always have their place – in the written information on your website. This is especially powerful if you run websites that use content to gain traffic and monetize based on impressions for ad networks – but I digress.
Knowing how to leverage all of the technical pieces of content will put your website and business miles ahead of competitors.
Each page can include everything noted in this section (title, description, URLs, internal and external links, documents, resources, downloadable content, schema, etc.), and blogs can become internet traffic institutions with a bit of elbow grease and the right planning & agility in production.
Throw in comprehensive topic coverage, FAQs for each article, and links to relevant “further reading” articles, and you’re already building a strong content base (which is also what this article is for)!
Images and videos (and most other rich media, particularly visual imagery) are still a work in progress in terms of “high quality readability” by search engines, but they’re learning fast, and we’re helping!
Part of the strategy is to associate data points (hooks) with surface level positions in the video timeline with which to associate meaning and “readable value”.
So, when you add chapters to your video on YouTube – congrats, you’re adding hooks that makes search engines parse your video files more effectively!
It also helps with ADA compliance for readability for humans with visual impairments or other difficulties with reading webpages!
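On YouTube, those hooks are just timestamped lines in the video description (the timestamps and titles below are made up); the first chapter must start at 0:00 for chapters to activate:

```txt
0:00 Introduction
1:45 What is Technical SEO?
5:30 Sitemaps and robots.txt
9:10 Canonical tags explained
```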
Other “hooks” to add to rich media for enhanced “advanced” SEO:
This is the most powerful piece of information in this report; have you set up Google Search Console for your domain?
I don’t care if it uses DNS for verification (best), or if it uses a HTML tag, or a Tag Manager tag, or whatever else.
As long as you have access to the free webmaster tools from Google in order to better advocate for your domain within the search index, you are in an excellent position to drive meaningful growth online.
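As an example, the HTML tag verification method is a single meta tag in your homepage’s `<head>` (the content value below is a placeholder – Search Console generates yours during setup):

```html
<head>
  <!-- Verification token issued by Google Search Console (placeholder value) -->
  <meta name="google-site-verification" content="EXAMPLE-TOKEN-1234567890" />
</head>
```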
If you or someone you know is suffering from reading this – or otherwise having to engage with such inane internet jargon – you may be sitting on an opportunity gold mine. Contact us to find out more about your business growth opportunities!