JavaScript SEO Guide: All you need to know about fundamentals, issues

Do you think there’s a chance that JavaScript issues are preventing your website or some of your content from appearing in Google Search?

JavaScript is an essential part of the web platform, offering many capabilities that turn the web into a powerful application platform. Making your JavaScript-powered web application discoverable in Google Search can help you attract new users and re-engage existing ones as they search for the content your app provides. While Google Search runs JavaScript with an evergreen version of Chromium, there are a few things you can do to optimise how your pages perform in search.

This document explains how Google Search handles JavaScript, as well as recommended practices for optimising JavaScript web applications for Google Search.

How does Googlebot handle JavaScript?

Googlebot goes through three stages when it comes to JavaScript web apps:

- Crawling
- Rendering
- Indexing

A page is crawled, rendered, and indexed by Googlebot.

Googlebot queues pages for both crawling and rendering. It is not immediately obvious when a page is waiting for crawling and when it is waiting for rendering.

When Googlebot fetches a URL from the crawling queue, it first checks whether you allow crawling by reading the robots.txt file. If the URL is marked as disallowed, Googlebot skips the HTTP request and skips the URL entirely.

Googlebot then parses the response for further URLs in the href attribute of HTML links and adds them to the crawl queue. If you don’t want a link to be discovered, use the nofollow mechanism.

It’s fine to use JavaScript to inject links into the DOM, as long as such links follow the guidelines for crawlable links.
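For example, a link injected with JavaScript is still discoverable as long as it ends up as a regular <a> element with a resolvable URL in its href attribute. A minimal sketch (the URL and the container selector are placeholders):

// Crawlable: a real <a> element with a URL in its href attribute.
const link = document.createElement('a');
link.href = '/products'; // placeholder URL
link.textContent = 'Our products';
document.querySelector('nav').appendChild(link); // assumes the page has a <nav> element

// Not crawlable: no href attribute, navigation only happens in a click handler.
const fakeLink = document.createElement('span');
fakeLink.textContent = 'Our products';
fakeLink.addEventListener('click', () => { window.location.href = '/products'; });
document.querySelector('nav').appendChild(fakeLink);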

For traditional websites or server-side generated pages where the HTML in the HTTP response contains all content, crawling a URL and parsing the HTML response works effectively. Some JavaScript sites may utilise the app shell paradigm, in which the initial HTML does not include the real content and Googlebot must first execute JavaScript before seeing the page content generated by JavaScript.

Unless a robots meta tag or header tells Googlebot not to index the page, Googlebot queues all pages for rendering. A page may sit in this queue for a few seconds, but it can take much longer. Once Googlebot’s resources allow it, a headless Chromium renders the page and executes the JavaScript. Googlebot then parses the rendered HTML for links again and adds the URLs it discovers to the crawl queue. Googlebot also uses the rendered HTML to index the page.

Remember that server-side rendering or pre-rendering is still a good idea, because it makes your website faster for users and crawlers, and not all bots can run JavaScript.

Use unique titles and snippets to describe your page

Unique, descriptive titles and helpful meta descriptions make it easier for users to identify the best result for their goal, and our guidelines explain what makes good titles and descriptions.

The meta description and title may both be set or changed using JavaScript.
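For illustration, here is a minimal sketch that sets both with JavaScript (the text values are placeholders):

// Set the document title.
document.title = 'Red widgets | Example Store'; // placeholder title

// Set the meta description, creating the tag if it does not exist yet.
let metaDescription = document.querySelector('meta[name="description"]');
if (!metaDescription) {
  metaDescription = document.createElement('meta');
  metaDescription.name = 'description';
  document.head.appendChild(metaDescription);
}
metaDescription.content = 'Browse our selection of red widgets.'; // placeholder description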

Google Search might show a different title or description based on the user’s query. This happens when the title or description has low relevance for the page content, or when we find alternatives in the page that better match the search query. Learn more about why the search result title might differ from the page’s <title> tag.

Create code that is compatible

Many APIs are available in browsers, and JavaScript is a rapidly developing language. When it comes to APIs and JavaScript capabilities, Googlebot has certain restrictions. Follow our JavaScript troubleshooting recommendations to ensure your code is compatible with Googlebot.

We recommend using differential serving and polyfills if you feature-detect a missing browser API that you need. Since some browser features cannot be polyfilled, we recommend that you check the polyfill documentation for potential limitations.
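As an illustration, here is a minimal feature-detection sketch that loads a polyfill only when a needed API is missing (the script path is a placeholder; check the polyfill’s documentation for its limitations):

// Load an IntersectionObserver polyfill only in browsers that lack the API.
if (!('IntersectionObserver' in window)) {
  const script = document.createElement('script');
  script.src = '/scripts/intersection-observer-polyfill.js'; // placeholder path to your polyfill bundle
  document.head.appendChild(script);
}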

Use informative HTTP status codes 

When Googlebot crawls a page, it looks at the HTTP status code to find out whether something went wrong.

Use a relevant status code to inform Googlebot if a page can’t be crawled or indexed, such as 404 for a page that couldn’t be found or 401 for pages that need login. HTTP status codes can be used to inform Googlebot that a page has moved to a new URL, allowing the index to be updated accordingly.

For a full list of HTTP status codes and how they affect Google Search, see Google’s documentation on HTTP status codes.
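As an illustration, here is a minimal sketch of returning a 404 status code for a missing product page, assuming a Node.js server with Express (the route, port, and helper functions are placeholders):

const express = require('express');
const app = express();

app.get('/products/:id', (req, res) => {
  const product = findProduct(req.params.id); // placeholder lookup function
  if (!product) {
    // Tell Googlebot (and users) that this page does not exist.
    return res.status(404).send('Product not found');
  }
  res.send(renderProductPage(product)); // placeholder rendering function
});

app.listen(3000);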

Avoid soft 404 errors in single-page applications

Routing is frequently done as client-side routing in client-side rendered single-page applications. Using meaningful HTTP status codes in this situation may be difficult or impractical. Use one of the following techniques to avoid soft 404 errors when utilising client-side rendering and routing:

  • Use a JavaScript redirect to a URL that receives a 404 HTTP status code from the server (for example, /not-found).
  • Using JavaScript, add a <meta name="robots" content="noindex"> tag to error pages.

For the redirect technique, here is some sample code:

fetch(`/api/products/${productId}`)
.then(response => response.json())
.then(product => {
  if(product.exists) {
    showProductDetails(product); // shows the product information on the page
  } else {
    // this product does not exist, so this is an error page.
    window.location.href = '/not-found'; // redirect to 404 page on the server.
  }
})

Here’s some sample code for using the noindex tag:

fetch(`/api/products/${productId}`)
.then(response => response.json())
.then(product => {
  if(product.exists) {
    showProductDetails(product); // shows the product information on the page
  } else {
    // this product does not exist, so this is an error page.
    // Note: This example assumes there is no other meta robots tag present in the HTML.
    const metaRobots = document.createElement('meta');
    metaRobots.name = 'robots';
    metaRobots.content = 'noindex';
    document.head.appendChild(metaRobots);
  }
}) 

Instead of fragments, use the History API

Googlebot only considers URLs in the href attribute of HTML links when looking for links on your pages.

Use the History API to implement routing between different views of your web app in single-page apps with client-side routing. To make sure that Googlebot can discover your links, avoid using fragments to load different page content. The following example is bad practice, because Googlebot will not crawl the links:

<nav>
  <ul>
    <li><a href="#/products">Our products</a></li>
    <li><a href="#/services">Our services</a></li>
  </ul>
</nav>

<h1>Welcome to example.com!</h1>
<div id="placeholder">
  <p>Learn more about <a href="#/products">our products</a> and <a href="#/services">our services</a></p>
</div>
<script>
window.addEventListener('hashchange', function goToPage() {
  // this function loads different content based on the current URL fragment
  const pageToLoad = window.location.hash.slice(1); // URL fragment
  document.getElementById('placeholder').innerHTML = load(pageToLoad);
});
</script>

Instead, you may use the History API to ensure that link URLs are visible to Googlebot:

<nav>
  <ul>
    <li><a href="/products">Our products</a></li>
    <li><a href="/services">Our services</a></li>
  </ul>
</nav>

<h1>Welcome to example.com!</h1>
<div id="placeholder">
  <p>Learn more about <a href="/products">our products</a> and <a href="/services">our services</a></p>
</div>
<script>
function goToPage(event) {
  event.preventDefault(); // stop the browser from navigating to the destination URL.
  const hrefUrl = event.target.getAttribute('href');
  const pageToLoad = hrefUrl.slice(1); // remove the leading slash
  document.getElementById('placeholder').innerHTML = load(pageToLoad);
  window.history.pushState({}, document.title, hrefUrl); // Update URL as well as browser history.
}

// Enable client-side routing for all links on the page
document.querySelectorAll('a').forEach(link => link.addEventListener('click', goToPage));

</script> 

Use meta robots tags with caution

The meta robots tag can be used to prevent Googlebot from indexing a page or following links. For example, adding the following meta tag to the head of your page prevents Googlebot from indexing it:

<!-- Googlebot won't index this page or follow links on this page -->
<meta name="robots" content="noindex, nofollow">

JavaScript may be used to add a meta robots tag to a website or modify its content. The sample code below illustrates how to use JavaScript to modify the meta robots tag to prohibit indexing of the current page if an API request fails to provide content.

fetch('/api/products/' + productId)
  .then(function (response) { return response.json(); })
  .then(function (apiResponse) {
    if (apiResponse.isError) {
      // get the robots meta tag
      var metaRobots = document.querySelector('meta[name="robots"]');
      // if there was no robots meta tag, add one
      if (!metaRobots) {
        metaRobots = document.createElement('meta');
        metaRobots.setAttribute('name', 'robots');
        document.head.appendChild(metaRobots);
      }
      // tell Googlebot to exclude this page from the index
      metaRobots.setAttribute('content', 'noindex');
      // display an error message to the user
      // (assumes the page contains an element for the message, e.g. <div id="error-message">)
      var errorMsg = document.getElementById('error-message');
      errorMsg.textContent = 'This product is no longer available';
      return;
    }
    // display product information
    // ...
  });
   

If Googlebot encounters noindex in the robots meta tag before it runs JavaScript, it skips rendering and JavaScript execution and does not index the page. Because Googlebot never runs your JavaScript in that case, there is no opportunity to change or remove the tag with a script. If there is a possibility that you do want the page indexed, don’t include a noindex tag in the original page code.

Use long-lived caching

Googlebot caches aggressively to reduce network requests and resource use. However, the Web Rendering Service (WRS) may ignore caching headers, so it can end up using outdated JavaScript or CSS resources. Content fingerprinting avoids this problem by making a fingerprint of the content part of the filename, such as main.2bb85551.js. Because the fingerprint depends on the file’s content, every change produces a new filename. For additional information, see the web.dev guide to long-lived caching techniques.
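For example, if you bundle with webpack (an assumption; other bundlers offer equivalent options), the [contenthash] placeholder produces fingerprinted filenames such as main.2bb85551.js:

// webpack.config.js: a minimal content fingerprinting sketch
const path = require('path');

module.exports = {
  entry: './src/index.js',
  output: {
    // [contenthash] changes whenever the file content changes,
    // so every release gets a new filename (for example, main.2bb85551.js).
    filename: '[name].[contenthash].js',
    path: path.resolve(__dirname, 'dist'),
  },
};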

Make use of structured data

When using structured data on your sites, you may build the necessary JSON-LD with JavaScript and inject it into the page. To avoid problems, make sure to test your implementation.
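Here is a minimal sketch of generating JSON-LD with JavaScript and injecting it into the page (the product data is a placeholder):

// Build the structured data object (placeholder values).
const structuredData = {
  '@context': 'https://schema.org/',
  '@type': 'Product',
  name: 'Example red widget',
  description: 'A placeholder product used for illustration.'
};

// Inject it into the page as a JSON-LD script tag.
const script = document.createElement('script');
script.type = 'application/ld+json';
script.textContent = JSON.stringify(structuredData);
document.head.appendChild(script);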

Use best practices for web components

Googlebot supports web components. When Googlebot renders a page, it flattens the shadow DOM and light DOM content, which means it can only see content that appears in the rendered HTML. To make sure Googlebot can still see your content after it is rendered, check the rendered HTML with the Mobile-Friendly Test or the URL Inspection Tool. The following example uses the shadow DOM together with a <slot> element so that both the light DOM and shadow DOM content end up in the rendered HTML:


<script>
  class MyComponent extends HTMLElement {
    constructor() {
      super();
      this.attachShadow({ mode: 'open' });
    }

    connectedCallback() {
      let p = document.createElement('p');
      p.innerHTML = 'Hello World, this is shadow DOM content. Here comes the light DOM: <slot></slot>';
      this.shadowRoot.appendChild(p);
    }
  }

  window.customElements.define('my-component', MyComponent);
</script>

<my-component>
  <p>This is light DOM content. It's projected into the shadow DOM.</p>
  <p>WRS renders this content as well as the shadow DOM content.</p>
</my-component>
           


Fix lazily loaded images and content

Images can be very costly in terms of bandwidth and performance, so lazy-loading them only when the user is about to see them is a good strategy. Follow our lazy-loading recommendations to make sure you implement lazy-loading in a search-friendly way.
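As one illustration, here is a minimal IntersectionObserver sketch that swaps in an image’s real URL from a data-src attribute once it approaches the viewport (the attribute name and markup are assumptions; the native loading="lazy" attribute is a simpler alternative):

// Lazy-load every image that carries a data-src attribute.
const lazyImages = document.querySelectorAll('img[data-src]');

const observer = new IntersectionObserver((entries, obs) => {
  entries.forEach(entry => {
    if (entry.isIntersecting) {
      const img = entry.target;
      img.src = img.dataset.src;    // load the real image
      img.removeAttribute('data-src');
      obs.unobserve(img);           // stop watching once the image is loaded
    }
  });
});

lazyImages.forEach(img => observer.observe(img));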
