SEO Guide: How do HTTP status codes, network, and DNS issues affect Google Search?

Google Search is affected by various HTTP status codes, network problems, and DNS errors. This guide covers the top 20 status codes that Googlebot encounters on the web, as well as the most common network and DNS errors.

Exotic status codes, such as 418 (I’m a teapot), aren’t covered. All of the issues described on this page generate a corresponding error or warning in Search Console’s Crawl Stats report.

HTTP status codes

HTTP status codes are generated by the server hosting the site when it responds to a request from a client, such as a browser or a crawler. Every HTTP status code has a different meaning, but the outcome of the request is often the same. For example, several status codes signal redirection, yet their result is effectively the same.

Search Console generates error messages for status codes in the 4xx–5xx range and for failed redirections (3xx). If the server responded with a 2xx status code, the content received in the response may be considered for indexing.

An HTTP 2xx (success) status code doesn't guarantee indexing. 
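
To see which status code your server actually returns for a given URL, you can use a command-line client such as curl; example.com below is a placeholder for your own URL:

curl -sS -o /dev/null -w "%{http_code}\n" https://example.com/
200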

2xx (success)

200 (success) 
201 (created)

Google considers the content for indexing. If the content suggests an error, for example an empty page or an error message, Search Console will show a soft 404 error.

202 (accepted) 

Googlebot passes the content on to the indexing pipeline. The indexing systems may index the content, but that isn't guaranteed.

Googlebot waits for the content for a limited time, then forwards whatever it received to the indexing pipeline. The timeout depends on the user agent; for example, the timeout for Googlebot Smartphone may differ from that of Googlebot Image.

204 (no content)

Googlebot signals the indexing pipeline that it received no content. The site's Index Coverage report in Search Console may show a soft 404 error.

3xx (redirects)

Googlebot follows up to ten redirect hops. If the crawler doesn't receive content within ten hops, Search Console shows a redirect error in the site's Index Coverage report. The number of hops Googlebot follows depends on the user agent; for example, Googlebot Smartphone may have a different value than Googlebot Image.

In the case of robots.txt, Google follows at least five redirect hops, as defined by RFC 1945, and then stops and treats the result as a 404 for the robots.txt file.
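
You can approximate the hop-limit behaviour from the command line by letting curl follow redirects with a capped count. This only illustrates the concept and doesn't reproduce Googlebot's exact logic; example.com is a placeholder:

curl -sSL --max-redirs 10 -o /dev/null -w "%{num_redirects} redirects, final URL: %{url_effective}\n" http://example.com/
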
301 (moved permanently)

Googlebot follows the redirect, and the indexing pipeline treats it as a strong signal that the redirect target should be canonical.

302 (found)
303 (see other)

Googlebot follows the redirect, and the indexing pipeline treats it as a weak signal that the redirect target should be canonical.

304 (not modified)

Googlebot signals the indexing pipeline that the content is the same as the last time it was crawled. The indexing pipeline may recalculate signals for the URL, but otherwise the status code has no effect on indexing.
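
To test whether your server supports conditional requests like this, send a request with an If-Modified-Since header and check the status line; the URL and date below are arbitrary examples, and a server that supports the mechanism may answer as shown:

curl -sI -H "If-Modified-Since: Fri, 01 Jan 2021 00:00:00 GMT" https://example.com/ | head -1
HTTP/1.1 304 Not Modified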

307 (temporary redirect)

Treated the same as 302 (found).

308 (moved permanently)

Treated the same as 301 (moved permanently).

While Google Search treats these status codes the same way, keep in mind that they’re semantically different. Use the status code that’s appropriate for the redirect so other clients (for example, e-readers, other search engines) may benefit from it.
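
To check which redirect status code a URL currently serves, fetch only its headers; the /old-page path below is a hypothetical example:

curl -sI https://example.com/old-page | head -1
HTTP/1.1 301 Moved Permanently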

4xx (client errors)

400 (bad request)
401 (unauthorized)
403 (forbidden)
404 (not found)
410 (gone)
411 (length required)

If the URL was already indexed, the indexing pipeline removes it from the index. 404 pages that Googlebot encounters for the first time aren't processed, and the crawl frequency of such URLs gradually decreases.

Don't use 401 and 403 status codes to limit the crawl rate: except for 429, 4xx status codes have no effect on crawl rate. Learn how to set a crawl rate limit.

429 (too many requests)

Googlebot treats the 429 status code as a signal that the server is overloaded; it's considered a server error.
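
Servers and CDNs that rate-limit clients often (though not always) send a Retry-After header with the 429 response. You can check for it like this; example.com is a placeholder:

curl -sI https://example.com/ | grep -iE "^(HTTP|retry-after)"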

5xx (server errors)

5xx and 429 server errors prompt Google's crawlers to temporarily slow down crawling. URLs that are already indexed are preserved in the index, but they're eventually dropped if the errors persist.

If the robots.txt file returns a server error status code for more than 30 days, Google will use the last cached copy of the file. If no cached copy is available, Google assumes that there are no crawl restrictions.
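
A quick way to confirm which status code your robots.txt file currently returns (again with example.com as a placeholder):

curl -sS -o /dev/null -w "%{http_code}\n" https://example.com/robots.txt
200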

Network and DNS errors

Network and DNS errors have quick, negative effects on a URL's presence in Google Search. Googlebot treats network timeouts, connection resets, and DNS errors similarly to 5xx server errors. When a network error occurs, crawling immediately starts to slow down, because a network error is a sign that the server may not be able to handle the serving load. URLs that are unreachable but already indexed will be removed from Google's index within days. Search Console may generate an error for each of these issues.

If you’re not hosting your website yourself, ask your hosting or CDN provider for help.

Debug network errors

Examine your firewall's configuration and logs. The blocking rule set may be too broad.

Look at the network traffic. Use tools like tcpdump and Wireshark to capture and analyse TCP packets, and look for anomalies that point to a specific network component or server module.
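
For example, a capture restricted to your site's HTTPS traffic might look like the following; the interface and output file names are placeholders, root privileges are usually required, and the resulting .pcap file can be opened in Wireshark:

tcpdump -i any -w site-traffic.pcap host example.com and port 443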

Contact your hosting company if you can’t discover anything odd.

Any server component that processes network traffic might be the source of the problem. Overloaded network interfaces, for example, may discard packets, resulting in timeouts (inability to establish a connection) and connection resets (RST packet sent because a port was mistakenly closed).
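
On a Linux server, interface-level packet drops show up in the interface statistics; eth0 below is a placeholder for your interface name, and non-zero "dropped" counters in the RX/TX sections suggest the interface is discarding packets:

ip -s link show eth0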

Debug DNS errors

Misconfiguration is the most prevalent cause of DNS problems. To troubleshoot DNS problems, try the following:

Check your DNS records. Make sure your A and CNAME records point to the correct IP addresses and hostnames, respectively. For example:

dig +nocmd example.com a +noall +answer
dig +nocmd www.example.com cname +noall +answer

Verify that all of your name servers point to your site's correct IP addresses. For example:

dig +nocmd example.com ns +noall +answer
example.com.    86400  IN  NS  a.iana-servers.net.
example.com.    86400  IN  NS  b.iana-servers.net.
dig +nocmd @a.iana-servers.net example.com +noall +answer
example.com.    86400  IN  A  93.184.216.34
dig +nocmd @b.iana-servers.net example.com +noall +answer...

If you've made changes to your DNS configuration in the last 72 hours, you may need to wait for the changes to propagate across the global DNS network. If you run your own DNS server, make sure it's healthy and not overloaded.
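
To check whether a change has propagated, compare the answer from one of your authoritative name servers with the answer from a public resolver; 8.8.8.8 (Google Public DNS) is used here only as an example:

dig +nocmd example.com a +noall +answer @a.iana-servers.net
dig +nocmd example.com a +noall +answer @8.8.8.8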
