If you have a development team that handles your servers and website, you will probably need to ask them for the data. That is much simpler than getting it yourself, but you need to know what to ask for. Here are the essential items to request.
So you want to do some server log analysis, but you need to ask your developer for the data.
The basic fields you need to make sure they include are:
Host:
Date:
Page/File:
Bandwidth:
Response Codes:
Referrer:
User Agent:
This is the bare minimum needed for any type of log analysis. Here is an example of what the data looks like:
Host 220.181.108.183 (Beijing,Beijing,China)
Date 31/07/2015 13:32
Page/File /what-to-check-before-buying-a-domain-name/
Bandwidth 22460
Response Code 200 – OK
Referrers No Referrer
User Agent Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
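If the developers send a raw log instead of a tidy export, the same fields can usually be pulled out with a short script. The sketch below assumes the server writes the Apache "combined" log format, which may not match your setup; the sample line is illustrative only, assembled from the example values above, and the name parse_line is made up for this example.

```python
import re
from datetime import datetime

# Pattern for the Apache "combined" log format (an assumption; your server
# may be configured with a different format).
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<page>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of the basic fields, or None if the line does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["date"] = datetime.strptime(rec["date"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    # Apache logs "-" when no body bytes were sent.
    rec["bytes"] = 0 if rec["bytes"] == "-" else int(rec["bytes"])
    return rec

# Illustrative line built from the example values above (not a real log entry).
sample = (
    '220.181.108.183 - - [31/Jul/2015:13:32:00 +0000] '
    '"GET /what-to-check-before-buying-a-domain-name/ HTTP/1.1" 200 22460 '
    '"-" "Mozilla/5.0 (compatible; Baiduspider/2.0; '
    '+http://www.baidu.com/search/spider.html)"'
)
record = parse_line(sample)
print(record["host"], record["page"], record["status"], record["bytes"])
```

A referrer of "-" in the raw log is what reporting tools typically display as "No Referrer".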
This is only the basic information you should ask for. Also ask your development team what else they actually record, and see whether any of it could be useful.
One of the benefits of asking the developer to get you the data is that you can request a specific file format (e.g. Excel), saving you the trouble of converting it yourself.
Also, if large Excel files scare you, just ask them for the specific data you need, for example Googlebot activity that triggered a non-200 response code.
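If you end up with the full export anyway, that kind of narrowing is easy to do yourself. Here is a minimal sketch, assuming a CSV export whose columns are named "Response Code" and "User Agent"; those column names, the function name googlebot_errors, and the demo rows are all hypothetical, so adjust them to whatever your developers actually send.

```python
import csv
import io

def googlebot_errors(rows, agent_col="User Agent", status_col="Response Code"):
    """Yield rows where the user agent mentions Googlebot and the status is not 200."""
    for row in rows:
        is_googlebot = "googlebot" in row[agent_col].lower()
        # Take the first three characters so both "404" and "404 - Not Found" work.
        status = row[status_col].strip()[:3]
        if is_googlebot and status != "200":
            yield row

# Tiny made-up export to show usage; a real file would be opened with
# open("export.csv", newline="") instead of io.StringIO.
demo = io.StringIO(
    "Page,Response Code,User Agent\n"
    "/a/,200,Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)\n"
    "/b/,404,Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)\n"
    "/c/,404,Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)\n"
)
hits = list(googlebot_errors(csv.DictReader(demo)))
print([row["Page"] for row in hits])
```

Matching on the user agent string alone is only a first pass, since any client can claim to be Googlebot.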
If you analyze the file in Excel, I'd suggest asking for it in either .log (standard Apache) or .csv format. High-traffic sites can easily exceed a million rows, which is about the most Excel can show (a worksheet is limited to 1,048,576 rows).
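When an export is too long for a single worksheet, one workaround is to cut it into pieces that each fit. The following is a rough sketch rather than a finished tool: the helper name chunk_rows is made up, it keeps one chunk in memory at a time, and the demo uses a tiny limit of 3 rows so the behaviour is visible.

```python
import csv
import io

# Rows per worksheet in modern Excel (.xlsx), header row included.
EXCEL_MAX_ROWS = 1_048_576

def chunk_rows(reader, max_rows=EXCEL_MAX_ROWS):
    """Yield lists of rows, each starting with the header and at most max_rows long."""
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    chunk = [header]
    for row in reader:
        if len(chunk) == max_rows:
            yield chunk
            chunk = [header]  # repeat the header at the top of every piece
        chunk.append(row)
    if len(chunk) > 1:  # skip a trailing header-only piece
        yield chunk

# Made-up three-row export, split with a deliberately small limit.
demo = io.StringIO("page,status\n/a/,200\n/b/,404\n/c/,200\n")
parts = list(chunk_rows(csv.reader(demo), max_rows=3))
print([len(p) for p in parts])
```

Each piece can then be written out with csv.writer to its own file and opened in Excel separately.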
If you do not have a proper log analyzer, use grep to split the file into smaller chunks, e.g. all 404s. The Sublime Text editor is highly recommended: it can open log/csv files over 1GB with relative ease.
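Where grep is not available, the same kind of split can be approximated with a few lines of Python. This is a sketch under the assumption that the log uses the Apache combined format, where the status code follows the quoted request line; the function name split_by_status and the file names are hypothetical, and the demo writes a throwaway two-line log so it can run anywhere.

```python
import re
import tempfile
from pathlib import Path

# In the combined log format the status code follows the quoted request line.
STATUS = re.compile(r'" (\d{3}) ')

def split_by_status(src, dest, wanted="404"):
    """Copy lines whose status code equals `wanted` from src to dest, one line at a time."""
    kept = 0
    with open(src, encoding="utf-8", errors="replace") as fin, \
         open(dest, "w", encoding="utf-8") as fout:
        for line in fin:  # streams the file, so large logs do not need to fit in memory
            m = STATUS.search(line)
            if m and m.group(1) == wanted:
                fout.write(line)
                kept += 1
    return kept

# Demo with a throwaway file; the log lines are made up for illustration.
workdir = Path(tempfile.mkdtemp())
log = workdir / "access.log"
log.write_text(
    '192.0.2.1 - - [31/Jul/2015:13:32:00 +0000] "GET /ok/ HTTP/1.1" 200 512 "-" "bot"\n'
    '192.0.2.1 - - [31/Jul/2015:13:33:00 +0000] "GET /missing/ HTTP/1.1" 404 128 "-" "bot"\n',
    encoding="utf-8",
)
count = split_by_status(log, workdir / "404s.log")
print(count)
```

The resulting 404s.log is small enough to open in a text editor or import into Excel directly.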