Kiwiblip!

I was just running through this site’s numbers for the last few weeks, the first real month of winter *. Then, as usual, I went off to have a look at how the other political sites are going.

And wow!

Either Kiwiblog had a very large social media blip starting on the 19th, going into the 20th, and still running today in the early hours of the 21st, or something is really wrong with Sitemeter, or there is an irritating bug somewhere bloating the numbers. I’d like to know one way or another, as I’ve had to deal with some strange spikes like this in the past.

Looking at the rapid increase in the ratio of page views to “visits”, I’d expect it is a bug.

Normally when you get a real link flood from people coming in from Reddit or Digg or Facebook or Twitter or any of the international services, you’d expect the pages per visit to drop. People come in to have a look at the page that has been linked to and they don’t bother much with the rest of the site.

Kiwiblog in Sitemeter consistently runs within a long-term range of around 1.3-2.0 pages per visit. In this part of the year and election cycle, the ratio goes up to 1.8 for weekend traffic, when there are fewer casual readers, and down to 1.5. During election month it will get over 2.0 on weekdays. During January it will approach 1.1 because there isn’t that much interest in holiday snaps even amongst the sewer-rats. Getting a 2.6 ratio and a million page views in a day (about 50 times the usual) tends to make me suspect that this is a glitch. Lots of casual reading of lots of pages? Not likely.
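As a rough sketch of that sanity check, the ratio can be computed per day and compared against the usual band. Only the 1.3-2.0 band and the 2.6 spike come from the figures above; the daily totals and the function names are made up for illustration.

```python
# Usual long-term band for pages per visit, as quoted in the post.
USUAL_LOW, USUAL_HIGH = 1.3, 2.0


def pages_per_visit(page_views: int, visits: int) -> float:
    """Return page views divided by visits (0.0 when there were no visits)."""
    return page_views / visits if visits else 0.0


def looks_suspicious(page_views: int, visits: int) -> bool:
    """Flag a day whose ratio falls outside the usual band."""
    ratio = pages_per_visit(page_views, visits)
    return not (USUAL_LOW <= ratio <= USUAL_HIGH)


# Hypothetical daily totals: (label, page views, visits).
days = [
    ("ordinary day", 20_000, 12_500),      # ratio 1.6
    ("spike day", 1_000_000, 384_615),     # ratio about 2.6
]

for label, views, visits in days:
    ratio = pages_per_visit(views, visits)
    verdict = "suspicious" if looks_suspicious(views, visits) else "normal"
    print(f"{label}: {ratio:.2f} pages per visit ({verdict})")
```

A fixed band is the crudest possible test. It would also flag the quiet January days, so a real check would want a seasonal baseline rather than one range for the whole year.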

Besides, I can’t see any post or comment in the last week that would cause a major influx (No doubt David Farrar will tell us if there was).

But I’ve seen something before that was a bug and looks like this.

Back in 2011 I changed this site to use the asynchronous Facebook javascript rather than the synchronous one.

But there was some kind of bug on the Facebook site with their caching. Rather than Facebook storing the link, excerpt and graphic for a ‘liked’ link on our site like it was meant to, Facebook would request the whole of the same page every time that someone scrolled past the link on Facebook. As people tend to open and scroll down their Facebook page a lot, there were a lot of requests on the site.

What was worse was that Facebook appeared to be grabbing the whole page and executing the javascript at the end of the page that told StatCounter and SiteMeter that a human had read the page, and it was doing it from the end user’s browser.

It took me a little while and a chunk of after-work analysis to figure out why I was getting five times as many page views per post on some posts. The IPs were all different and in our usual IP ranges (i.e. mostly NZ) so it didn’t look like a spambot attack. At least not unless someone had made botnet zombies out of an awful lot of kiwi computers (and I was worrying about that scenario for a few hours).

After a few days I realised where the excess pageviews were coming from, after I filtered and sorted the millions of lines of website log looking for the common patterns. There is a lot of chaff in a website log file because it logs ‘hits’ to every image file, CSS file, javascript file, async jquery, and a lot of internal stuff. A lot of the time the server doesn’t even do much work with those. It just returns the machine version of “you already have that file, use that” for client side caching. The actual dynamic pages of a commented website are a smallish fraction of the lines in the web log that you have to filter for.
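A minimal Python sketch of that kind of filtering follows. It assumes the server writes the common Apache/nginx "combined" log format, and the suffix list, sample lines, addresses and paths are all hypothetical. It illustrates the idea rather than the actual analysis used at the time.

```python
import re

# Combined log format: IP, identity, user, [time], "request", status, size,
# "referrer", "user agent". Assumed here; other servers may log differently.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical list of static asset suffixes to treat as chaff.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico")


def is_page_view(line: str) -> bool:
    """True for a GET of a dynamic page that was actually served (status 200)."""
    m = LOG_LINE.match(line)
    if m is None:
        return False
    if m["method"] != "GET" or m["status"] != "200":
        return False  # drops 304 "you already have that file" responses too
    path = m["path"].split("?", 1)[0].lower()
    return not path.endswith(STATIC_SUFFIXES)


# Made-up sample lines using documentation-only IP addresses.
SAMPLE_LINES = [
    '203.0.113.5 - - [01/Jan/2000:00:00:00 +0000] "GET /some-post/ HTTP/1.1" '
    '200 51234 "http://www.facebook.com/" "Mozilla/5.0"',
    '203.0.113.5 - - [01/Jan/2000:00:00:01 +0000] "GET /style.css HTTP/1.1" '
    '304 0 "http://example.org/some-post/" "Mozilla/5.0"',
    '198.51.100.7 - - [01/Jan/2000:00:00:02 +0000] "GET /logo.png HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0"',
]

page_views = [line for line in SAMPLE_LINES if is_page_view(line)]
print(f"{len(page_views)} page view(s) out of {len(SAMPLE_LINES)} log lines")
```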

Eventually, I figured out that the common feature for the excess page views was Facebook. I confirmed it by looking just at my own machine talking to Facebook and seeing the effect on the site. So I turned off the asynchronous Facebook javascript and our numbers went back down to normal.
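One way to hunt for a common feature like that is to tally the referrer hosts on the page requests. The sketch below assumes the excess requests carried a Facebook referrer, which is an assumption for illustration and not something the logs are known to have shown. The referrer list and function names are made up.

```python
from collections import Counter
from urllib.parse import urlsplit


def referrer_host(referrer: str) -> str:
    """Return the referrer's host name, or '(none)' for an empty or '-' referrer."""
    host = urlsplit(referrer).hostname
    return host if host else "(none)"


def top_referrers(referrers, n=3):
    """Count requests per referrer host and return the n most common."""
    return Counter(referrer_host(r) for r in referrers).most_common(n)


# Hypothetical referrer fields pulled from page-view log lines.
referrers = [
    "http://www.facebook.com/",
    "http://www.facebook.com/home.php",
    "-",
    "http://www.google.co.nz/search?q=example",
    "http://www.facebook.com/",
]

for host, count in top_referrers(referrers):
    print(f"{count:>5}  {host}")
```

The same tally works on any other field in the log, such as the user agent, if the referrer turns out not to be the distinguishing feature.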

A few months later, I tested it and it repeated that same pattern. The third time a few months later it worked (with no changes in my code) the way that I expected it to.

My guess is that DPF has had something similar happen to his site. It’d be interesting to see what shows up in Google Analytics (I had some significant variations in page views between Sitemeter, StatCounter, and Analytics), and how much effect it had on the server. With my interesting Facebook bug, the reason I noticed it was because I was getting warnings that the CPU on the server was getting close to maxing out, and readers were complaining that the site was running really slow.

But it really does show you how easy it is to get interesting statistics in the web world. Some people deliberately cultivate this kind of bug, or even induce it.


 

 * The winter cycle… We usually peak up to April or May, and then have a significant drop in page views over June and July. Then things start to pick up at the end of August. You’d think that winter was when people had long evenings to while away on blogs. But that isn’t what happens.


Updated: I noticed an error in my post. The Facebook issues were in 2011, not 2012. See the post about how close we were getting to Kiwiblog’s page views. That post probably made DPF realise that his site wasn’t collecting page views from the post pages.
