We recently did some technical SEO analysis of the top 100 wedding blogs.


Hi! I’m Corey, the Server Log Kid, and today we are looking at wedding blogs.

Today we are going to be looking at the top wedding blogs and doing some technical analysis on them.

If you’ve got any questions, fire away and ask them if you can get them through; if not, we’ll answer them at the end. So yeah, let’s jump straight into it.

So, the top wedding blogs, from a technical SEO analysis. Just quickly, to explain why we’re doing this: we’re a technical server log auditing agency.

We don’t do anything apart from that; we don’t do link building, we don’t do content. We have a wide range of clients, from in-house clients to other agencies doing white label reporting.

So this report is an analysis of the technical side only, because that’s all we do. We’re just the technical specialists.

So, the data. We’ve taken Hitwise’s top 100 sites by traffic in the wedding blogs category.

The blogs can be based anywhere, but the traffic measured is UK traffic, so it’s the top 100 sites that get the most UK traffic. What have we analysed?

Like I said, we are a technical SEO agency, so all we’ve done is look at the on-page and technical elements of it.

We haven’t looked at links because that’s not our speciality, and we haven’t looked at the content.

Again, it’s not our strongest area. But crucially, we’ve only looked at what’s publicly available information.

None of these are our clients, so we don’t have access to Google Search Console, analytics, or the server logs; this is purely what we can publicly see.

So, let’s get into the analysis. So first of all, we looked at how many of them are set up correctly.

There are four ways you can fundamentally set up a website.

You can set up as HTTPS on the bare domain; HTTPS on a subdomain, like www.domain; HTTP on the bare domain; or HTTP on www.domain.

You should really pick one of these and stick to it; our preference is HTTPS on the bare domain. You can use the subdomain instead; there’s no difference.
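As a quick sketch of those four fundamental set-ups, here is a small helper that builds all four variants for a given bare domain; `example.com` is just a placeholder:

```python
def url_variants(domain):
    """Return the four fundamental URL set-ups for a bare domain.

    `domain` is assumed to be a naked host like "example.com"
    (no scheme, no www prefix).
    """
    return [
        f"https://{domain}/",      # HTTPS, bare domain (our preference)
        f"https://www.{domain}/",  # HTTPS, www subdomain
        f"http://{domain}/",       # HTTP, bare domain
        f"http://www.{domain}/",   # HTTP, www subdomain
    ]

print(url_variants("example.com"))
```

Whichever one you pick as the main property, the other three should 301 to it.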

It’s 2018; all sites should be using HTTPS, and there’s no real reason not to. Google claims you get a little extra benefit, a minor one, from being secure.

We’ve not really seen it, but we have seen that, with recent releases of Chrome, you get a “not secure” warning if your site isn’t using HTTPS. Crucially, Google treats all four of these as separate sites.

The idea is you pick one, and 301 the other versions to the main property.

The problem, if you go for HTTP and 301 HTTPS to it, which looking at the data some people have done, is that the browser also remembers the redirect.

So, let’s say you’re on HTTP and you currently redirect HTTPS to it. After a while, you decide you want to go secure because of all the benefits, some of which we’re about to discuss, so you undo that redirect and instead redirect HTTP to HTTPS.

That’s great; Google will pick up on it pretty soon, maybe the next crawl, the same day, or a couple of days later.

Any new users to your site won’t notice any difference; it’s only returning visitors. They will come in on HTTPS, and their browser, Chrome, Firefox, whatever they’re using, will remember that there used to be a redirect in place and send them to HTTP. Your server will then kick in, because you’ve obviously set up the new redirect correctly, and send them back to HTTPS; then the browser kicks in again, so you’re stuck in a redirect chain.
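That returning-visitor scenario can be sketched as a tiny simulation. The `redirects` dict here is hypothetical, standing in for both the browser’s cached 301 and the server’s new one; a URL appearing twice in the chain means you’re looping:

```python
def follow_redirects(start, redirects, max_hops=10):
    """Follow a URL -> target redirect mapping and return the chain visited.

    `redirects` is a hypothetical dict standing in for both the server's
    redirect rules and any 301s the browser has cached. If a URL shows up
    twice in the returned chain, the visitor is bouncing in a loop.
    """
    chain = [start]
    url = start
    for _ in range(max_hops):
        if url not in redirects:
            return chain  # reached a final destination
        url = redirects[url]
        if url in chain:
            chain.append(url)
            return chain  # loop detected
        chain.append(url)
    return chain

# Returning visitor: the browser cached https -> http from the old set-up,
# while the server now 301s http -> https.
cached_and_server = {
    "https://example.com/": "http://example.com/",   # browser's cached 301
    "http://example.com/": "https://example.com/",   # server's new 301
}
print(follow_redirects("https://example.com/", cached_and_server))
```

Running it shows the visitor bouncing HTTPS → HTTP → HTTPS, which is exactly the chain described above.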

If you are currently in this state where you redirect HTTPS to HTTP, get in touch. I don’t really want to cover it here, but there are ways around it; it just takes time. And before you say “no one’s daft enough”: ASOS, one of the UK’s biggest shopping sites, currently does this to its main domain, so http://asos.com is the main site and HTTPS 301s to it. It was the last time I checked, anyway.

So, let’s look at the stats. Out of the top 100 sites we analysed, 63 percent, or 63 of the sites, were set up correctly; 37 percent weren’t.

There are multiple reasons for this. Some were only doing 302 redirects, not 301s. Some had both HTTP and HTTPS returning 200 response codes, so Google could hit duplicate content. Some were returning 404 errors; I think only one or two actually returned a 404.

Some had one version with the others 301ing to it, but with a 302 somewhere in the chain. Many reasons, but 63 percent of the sites are set up correctly and 37 percent aren’t.

If you’re among that 37 percent, it’s pretty easy to check: just type all four variations into your browser and see what happens.
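If you want to write that check down, here is a rough rule of thumb as code: exactly one variant should return 200, and the other three should 301 straight to it. The dict of responses is hypothetical, the kind of thing you’d note down after trying each variant:

```python
def setup_ok(variants):
    """Check that exactly one variant returns 200 and the rest 301 to it.

    `variants` is a hypothetical dict of URL -> (status_code, location),
    i.e. what you'd observe typing each variant into a browser.
    """
    live = [u for u, (code, _) in variants.items() if code == 200]
    if len(live) != 1:
        return False  # zero or several live versions: set up incorrectly
    canonical = live[0]
    return all(
        code == 301 and target == canonical
        for u, (code, target) in variants.items()
        if u != canonical
    )

good = {
    "https://example.com/": (200, None),
    "https://www.example.com/": (301, "https://example.com/"),
    "http://example.com/": (301, "https://example.com/"),
    "http://www.example.com/": (301, "https://example.com/"),
}
print(setup_ok(good))  # this set-up passes
```

A 302 anywhere, or two versions both returning 200, fails the check, which matches the kinds of problems found in the 37 percent.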

So we’ve moved on to speed. We all know speed is super important.

Page speed is super important; Google’s been on about it for months. They said the July update was coming and that it was speed-related; whether it really was remains to be determined.

But it helps with SEO, PPC, UX and conversions. Any quick wins you can get with speed are worth taking.

If you are on HTTPS you can get the benefits of moving to HTTP2, and I’m not going to bore you here explaining how or why to do it; just watch the Server Log Kid, he’s done a few videos now on HTTP2 and the benefits. But there’s a site down here where you chuck in your URL and it tells you whether you’re on HTTP2.

When we’re speaking with clients, we say the benefit you’ll get is that your page load time will halve, a 50 percent reduction.

If I’m being completely honest, it’s usually between a 60 and 70 percent reduction, but we like to keep client expectations low. The safe bet is that your page load time will drop by half.

So if a page was taking six seconds to load, it’ll load in three, which is a huge difference, and users will notice it.

Of those sites, 56 percent were set up correctly for HTTP2, but 44 percent weren’t. A small part of that 44 percent is sites still on HTTP, which need to be on HTTPS for it to work, so those obviously have to be excluded.

Others have done all the hard work, ridden out all the ranking fluctuations by moving to HTTPS, but haven’t done the final bit. Unlike moving to HTTPS, upgrading to HTTP2 doesn’t cause any ranking fluctuations; it’s just server side, about a five to ten minute job, and you’ll start seeing the benefits straight away.

You might not see the rankings boost overnight but your users will see the boost right away and there’ll be no detrimental impact to your rankings.

So yeah, of the 44 percent, we’ll exclude the five to six percent that are still on HTTP; the rest, a good 30 to 35 percent, are on HTTPS but just haven’t set HTTP2 up, so that’s one quick win. Like I said, just go to the URL here, type in your domain, and it’ll tell you whether your server’s set up or not.
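If you’d rather check from the command line than a website, one way is to ask the server over TLS which protocols it will speak. This is a minimal sketch using Python’s standard-library `ssl` ALPN negotiation; it needs network access, and the domain in the example is only a placeholder:

```python
import socket
import ssl

def supports_http2(host, port=443, timeout=5.0):
    """Ask a server, via TLS ALPN negotiation, whether it offers HTTP2 ("h2")."""
    ctx = ssl.create_default_context()
    # Offer both h2 and HTTP/1.1; the server picks the best one it supports.
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol() == "h2"

if __name__ == "__main__":
    # Needs network access; swap in your own domain.
    print(supports_http2("www.example.com"))
```

Note this only tells you what the server negotiates at the TLS layer; it won’t run at all against a plain-HTTP site, which is consistent with needing HTTPS before HTTP2.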

Then in all cases it’s just a case of speaking to your dev or your hosting provider and saying,

“Hey, we’re on HTTPS, we wanna move to HTTP2, can you do it for us?”

One of the most important files on a website is the robots file. It obviously tells robots what they can or cannot do on your website.

So we analysed whether the file was there; not how it was set up or how detailed it was, just whether it was there. Everyone should have a robots file.

We thought this would come back 100 percent and we’d have a nice little graphic, 100 percent.

Nope. Six percent, or six sites, didn’t have a robots file. And this is pretty easy to test: just type in your domain, forward slash, robots dot t-x-t. It all needs to be lowercase, it’s case sensitive, and it needs to be spelled correctly, otherwise Google and other search engine bots won’t be able to find it.

But yeah, six sites don’t have a robots file.

It’s pretty easy to do, there are videos out there on the web.

I think we’ve done a video before, “How to create a robots.txt file on WordPress sites.” It’s pretty easy, and then it’s just a case of editing it to meet your needs. But yeah, every site should have a robots.txt file.
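To make the case-sensitivity point concrete, here is a tiny check plus a minimal example robots.txt. The sitemap URL inside it is just an illustration, not part of the format:

```python
def valid_robots_path(path):
    """The robots file must live at exactly /robots.txt, all lowercase."""
    return path == "/robots.txt"

# A minimal robots.txt that allows everything and points at the sitemap;
# the example.com sitemap URL is a placeholder.
MINIMAL_ROBOTS = """User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
"""

print(valid_robots_path("/robots.txt"))   # correct
print(valid_robots_path("/Robots.TXT"))   # wrong case, bots won't find it
```

Anything like `/Robots.TXT` or `/robot.txt` simply won’t be fetched as the robots file.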

Then we looked at the sitemap. So, there are three ways Google’s gonna find a page on your site.

One, crawl your entire site.

Two, follow external links to your website.

Three, read your sitemap and find your pages, because your sitemap tells them the new URLs, the date each URL was updated, the priority it should be crawled with, and the crawl frequency.

So sitemaps are pretty important. I can tell you from looking at enough server logs over the years that Google does not crawl every single page on every single site straightaway. It can take them a while before they find new pages.

And, let’s be honest, if you just put a new blog article out there, the chances of you getting links to it straight away, unless you’re super brilliant at link building, are pretty slim.

So, a sitemap is crucial for helping Google to find new pages on your site. 18 percent of sites didn’t have one.

The number could be slightly out because, unlike the robots file, you can call your sitemap URL anything you want.

We guessed the standard locations first, sitemap.xml et cetera.

We then looked in robots.txt file, cause that’s another great place to locate the sitemap.
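That second step, reading sitemap locations out of robots.txt, is easy to script. This sketch pulls out `Sitemap:` lines; the field name is matched case-insensitively, and the sample file is made up:

```python
def sitemaps_from_robots(robots_txt):
    """Pull Sitemap: URLs out of a robots.txt body.

    The field name is matched case-insensitively ("Sitemap", "sitemap", ...),
    and whitespace around the URL is stripped.
    """
    found = []
    for line in robots_txt.splitlines():
        # Split at the first colon only, so the "https:" in the URL survives.
        field, _, value = line.partition(":")
        if field.strip().lower() == "sitemap":
            found.append(value.strip())
    return found

sample = """User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
sitemap: https://example.com/posts-sitemap.xml
"""
print(sitemaps_from_robots(sample))
```

If this comes back empty and the standard guesses fail, the site may still have declared a sitemap privately in Search Console, but you can’t see that from outside.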

For these 18 sites we couldn’t find one. They could potentially have one under something weird and unique and have just told Google in Search Console where it is; that could be the case.

But more than likely they don’t have one. So, say 15 to 18 sites don’t have a sitemap, which is a crucial file.

It helps Google find new pages on your site. Then we moved on to crawling their websites. We used a tool called Screaming Frog.

There are many other tools out there; Screaming Frog’s a desktop-based one, and we don’t really want to spend much time talking about what it is.

We’ve used it numerous times. There’s a free version, and the paid version is £149; I know that because I renewed last week, and it lasts for a year.

But yeah, we crawled the top 500 pages of the top 25 sites.

The reason there’s an asterisk is that some of the sites didn’t actually allow Screaming Frog to crawl them.

And yes, I could have got around this, but I thought I’d honour it, so if site six wouldn’t allow us, we just moved down the list and kept going.

These are the top 25 sites that would let us crawl openly with Screaming Frog. Crucially, we only crawled the top 500 pages, and some of these sites are three, four, five, even ten thousand pages deep, so we did miss a lot.

The first thing we looked at was 301 and 302 errors. We all know these are super important, and you really shouldn’t have any such issues on your website.

It’s one part that you control, and you can easily update it. Three percent of the pages crawled had a 301 error, and 21 sites had at least one. Eleven sites had at least one 302 error, which was two percent of all pages crawled.

Three and two percent don’t sound like a lot, but remember we only crawled the top 500 pages of these sites; the more pages a site has, the more likely there are errors in all the blog articles and blog posts further down. If we’d crawled the entire sites, we probably would have seen bigger numbers.

And those numbers should have been zero. You control these web properties, and they’re pretty easy to update.
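When you do crawl your own site, the first pass is just bucketing URLs by status code. A minimal sketch, where `pages` is a hypothetical stand-in for the URL and status-code columns you’d export from a crawler:

```python
from collections import Counter

def crawl_summary(pages):
    """Bucket a crawl export by HTTP status code.

    `pages` is a hypothetical list of (url, status_code) tuples, like the
    two columns you'd pull out of a Screaming Frog export.
    """
    counts = Counter(code for _, code in pages)
    return {
        "total": len(pages),
        "ok_200": counts[200],
        "redirect_301": counts[301],
        "redirect_302": counts[302],
        "not_found_404": counts[404],
    }

pages = [
    ("https://example.com/", 200),
    ("https://example.com/old-post", 301),
    ("https://example.com/tmp-move", 302),
    ("https://example.com/gone", 404),
]
print(crawl_summary(pages))
```

Everything outside the `ok_200` bucket is a candidate for fixing, and ideally the redirect and 404 buckets sit at zero.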

So yeah, you should crawl your own website, download the results, find the errors, and fix them. Next, we looked at 404s.

And again, only two sites had at least one 404 error. But crucially, we only crawled the top 500 pages.

These tend to sit in older blog articles, where they link off to something created three, four, five years ago, so I wasn’t surprised to see a low number.

I would expect there to be more, but they’ll be further down the site in older blog articles; not necessarily lower quality, just older pages we didn’t crawl.

So there will be more, but two sites still had at least one 404 in their top pages, which surprised me. One of the downsides of Screaming Frog is that it’s not very visual.

It’s just an Excel dump, basically, once it’s crawled the site. We created a free report around March or April which you can use if you want to make it a little prettier. The URL’s at the bottom here.

Basically, you crawl the site, chuck the export into these little sheets, and they link into Data Studio. What else? So we’ve only crawled the top pages here.

Like I said, if you’ve got any questions then ask away, we can see that there are some people watching this.

The video will be available afterwards, so if there are any questions we haven’t covered, or that occur to you when you’re watching the replay, just fire them across to us by email. What else should you be looking for? Large images: if you’re using Screaming Frog you can filter by image size and find the largest images on your site, a pretty easy fix. Just chuck them into something like Compress JPEG or Compress PNG, compress them, then re-upload them.

Google has a couple of page speed tools you should be using to analyse your site and find what other bottlenecks there are. Obviously HTTP2 will reduce load time by roughly 50 percent.

Compressing your images is another big one, but there are probably other wins in there too.

Remove unused plugins; it’s an easy win. One, they slow down your site, and two, they’re a security risk: the more plugins you have, the more chance a hacker has of getting into your site. Same with unused themes.

Only keep the one you’re using and remove all the rest.

A few takeaways from today’s quick webinar: pick a preferred set-up, preferably HTTPS, either with or without the www subdomain, and 301 all the others to it. You do need to set them all up in Google Webmaster Tools, or Google Search Console as it’s now called.

Check whether your site is on HTTP2, yes or no. If yes, great, you’ve done the work; if no, speak to your developer or host.

Make sure your robots file is live, firstly, and once you’ve got it live, start tweaking it and optimising it for your needs.

Then check your sitemap is working: is it there, yes or no? If no, speak to your developer again. If you’re using something like WordPress, there are a few plugins you can install to create a sitemap on the fly for you. I think Yoast does it as well, but I could be wrong.

Then crawl your website using a tool like Screaming Frog. There are others; we use a tool called DeepCrawl, but that’s cloud-based, so you pay per URL crawled, unlike Screaming Frog.

Then crawl your site and fix errors. Anybody got any questions?

There don’t seem to be any coming in, but I’ll check on Facebook in a minute. I’ll do that now. If not, you can email me afterwards at [email protected] and I’ll more than happily answer any questions. Let me just go across to Facebook and check; it’s a new tool I’m using.

