Sorry, Sir, we’re not going to index all your pages

While Google didn’t use that exact phrase, that’s exactly how I imagined them saying it, especially in a British accent.

Bet you just tried it, didn’t you?

John Mueller was recently asked on Twitter why Google had only indexed 30,000 pages of a site and how long it would take to index the rest.

“They’ll possibly never be indexed. It’s easy to create millions of pages without useful, unique, compelling, or high-quality content. That’s not what we’d want our systems to spend a lot of time on.” – 🍌 John 🍌 (@JohnMu), March 5, 2020 (https://twitter.com/JohnMu/status/1235521073262792704)

The site has over 800,000 pages – so you’d expect it wouldn’t happen overnight.

But to make things a little clearer: before pages get indexed they need to be crawled, and the vast majority of sites out there are unlikely to have enough crawl budget for Google to crawl that many pages.
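To put crawl budget in perspective, here’s a rough back-of-the-envelope sketch. The 800,000-page figure is from the post; the daily crawl rates are hypothetical examples, not Google figures:

```python
# Rough estimate of how long a full crawl of a site would take at a
# steady daily crawl budget. Crawl rates below are made-up examples.

TOTAL_PAGES = 800_000

def days_to_crawl(total_pages: int, pages_per_day: int) -> float:
    """Days needed to crawl every page once at a steady crawl rate."""
    return total_pages / pages_per_day

for rate in (500, 5_000, 50_000):
    print(f"At {rate:,} pages/day: {days_to_crawl(TOTAL_PAGES, rate):,.0f} days")
```

Even at a generous 50,000 pages a day, that’s over two weeks just to crawl everything once – and most sites get nowhere near that rate.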

Sites like the BBC, Wikipedia or the Washington Post will likely have over 800,000 pages indexed, but those pages have been added over years and all add value – whether you rate the sites yourself or not.

Turns out this guy’s site was an aggregator. In Google’s eyes it probably doesn’t add a lot of value to users, so they’re unlikely to index it all – sad, but true.

How to get 800,000 pages indexed

If at any stage while going through the checklist below you can’t honestly answer yes, your pages are unlikely to be indexed.

Firstly, your pages need valuable, unique content that Google wants to show its users – do they have it?

Do these pages load quickly and are they easy for Google to crawl?

Are people naturally sharing these pages on social and other sites? (Google really wants to index popular pages)

Is the site an authority? (The more authority a site has, the more crawl budget it will get).

Were all these pages added in bulk, or over time? This is less important, but pages added over years – as on a news site – are more likely to be crawled and indexed than a site created yesterday that suddenly has 800,000 pages. I’d bet any content creator couldn’t come up with that much great content in such a short space of time!
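If you run server log analysis, you can check which pages Googlebot is actually crawling rather than guessing. A minimal sketch, assuming a standard combined-format access log and filtering on the user-agent string only (a rough filter – verifying real Googlebot traffic also needs a reverse DNS check):

```python
import re
from collections import Counter

# Count which URLs Googlebot requested, from combined-format access logs.
# Matching on the user-agent string alone is a rough filter; verified
# Googlebot checks also require reverse DNS lookups.
LOG_LINE = re.compile(r'"(?:GET|POST) (?P<path>\S+) HTTP/[^"]*".*"(?P<agent>[^"]*)"$')

def googlebot_hits(log_lines):
    hits = Counter()
    for line in log_lines:
        m = LOG_LINE.search(line)
        if m and "Googlebot" in m.group("agent"):
            hits[m.group("path")] += 1
    return hits

sample = [
    '1.2.3.4 - - [05/Mar/2020:10:00:00 +0000] "GET /page-1 HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '5.6.7.8 - - [05/Mar/2020:10:00:01 +0000] "GET /page-2 HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0)"',
]
print(googlebot_hits(sample))  # only /page-1 is counted
```

Comparing the URLs Googlebot hits against your full URL list quickly shows how much of the site is getting crawl attention at all.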

While these steps alone won’t guarantee that all 800,000 pages get indexed, if you can’t answer yes to every question above, some of those pages might never be indexed.

Even for sites like the BBC or Wikipedia, or other sites that Google really trusts, I wouldn’t expect every page to be in the index. 

Quality over quantity is what matters – every time. 
