What’s a Crawl Budget and How To Optimise It

Anna Kowalska

Even for the most experienced developer, it’s easy to dismiss crawl budget and its impact in favour of more ‘important’ activities. Sure, there are plenty of other SEO practices that demand more immediate attention, but that’s no reason to overlook this one.

It may sound like part of a marketing budget, but the term actually means something quite different: crawl budget refers to the number of times a search engine bot crawls your website each day.

We’re all familiar with common bots such as Googlebot and Bingbot, which ‘spider’ web pages and collect information about them. That data is then indexed and used to rank a given site. Frequent re-crawls are needed because new pages are constantly being added that also require indexing.

Let’s say your site is crawled by Google approximately 25 times per day; your approximate crawl budget would then be 750 crawls per month. To find yours, sign in to Google Search Console (or the equivalent for another search engine) and open ‘Crawl’ followed by ‘Crawl Stats’, which shows the number of pages crawled per day.
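The arithmetic above can be sketched in a couple of lines (the daily figure is just the illustrative number from the example, not a benchmark):

```python
# Estimate a monthly crawl budget from the average daily crawl count
# shown in Crawl Stats. 25/day is the hypothetical figure used above.
daily_crawls = 25
monthly_budget = daily_crawls * 30  # approximate a month as 30 days

print(monthly_budget)  # → 750
```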

The number of times a site is crawled can vary massively from one site to the next. It depends on several factors, some of which you can influence to optimise crawling on your site. Clearly this is advantageous, as it ensures your newest content and updates are indexed more quickly.

Optimising your crawl budget

Making your site as ‘crawlable’ as possible is the key to encouraging as many daily crawls as possible. For many sites this is simply a case of configuring the .htaccess and robots.txt files to ensure they aren’t blocking any critical pages.
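As a minimal sketch, a robots.txt that keeps bots away from low-value sections while leaving everything else open might look like this (the paths here are hypothetical, so adjust them for your own site):

```
# robots.txt — example paths are hypothetical
User-agent: *
Disallow: /cart/
Disallow: /internal-search/

Sitemap: https://www.example.com/sitemap.xml
```

Anything not matched by a Disallow rule remains free to crawl, so critical pages stay accessible by default.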

The opposite can be said for pages that you want to keep out of the index. In this case, rather than just adding a ‘Disallow’ rule to robots.txt, be sure to block indexing explicitly with a noindex meta tag in the <head> section of the page.
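In practice that means adding the robots meta tag to the page itself:

```html
<!-- Inside the <head> of the page you want kept out of the index -->
<meta name="robots" content="noindex">
```

Note that the bot must be able to crawl the page to see this tag, so don’t also block the same URL in robots.txt.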

Heavy use of rich media files can also confuse many search engine bots, Google being the notable exception. Although there have been great advances in the indexing of JavaScript, Flash and dynamic HTML, these files may still eat into crawl budget with less established search engines. With that in mind, it’s worth limiting rich media files on pages where ranking is essential.

Cleaning up your XML sitemap is another way to make the crawl bots’ job easier. Update it regularly to remove useless redirects and non-canonical or blocked pages. Website audit software is an effective, quick way to produce a clean sitemap that excludes any pages blocked from indexing.
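A cleaned-up sitemap lists only live, canonical, indexable URLs. A minimal example (the URLs and dates are hypothetical):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Only canonical pages that return a 200 status and aren't
       blocked from indexing belong here -->
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/crawl-budget/</loc>
    <lastmod>2024-01-10</lastmod>
  </url>
</urlset>
```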

In addition, avoid redirect chains where possible, as they are a common waste of crawl budget. If your website has a series of 301 or 302 redirects, crawlers may give up partway through the chain and never arrive at the destination page, meaning it won’t be indexed. Never include more than two redirects in a row.
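In an Apache .htaccess file, for example, a chain can be collapsed into a single hop by pointing every old URL straight at the final destination (the URLs below are hypothetical):

```
# Instead of /old-page -> /interim-page -> /new-page,
# send both old URLs directly to the final destination:
Redirect 301 /old-page https://www.example.com/new-page
Redirect 301 /interim-page https://www.example.com/new-page
```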

Finally, as with general SEO best practice, it’s essential to remove broken links, as they can also impact crawl budget. With more emphasis on user experience, sites with a large number of error pages won’t be crawled and re-checked as quickly as those with very few.

The importance of making a website index-friendly can’t be overstated. Thankfully, that’s not only helpful for crawl budget but an integral part of SEO in general. The synergy between the two brings about more frequent crawls, meaning faster updates when you add content. As such, these techniques for optimising crawl budget are a win-win. Don’t let the haters tell you otherwise!