Nov 24 2008

Google Analytics – oh, the controversy

Published by at 12:53 am under analytics,resources

This past week was Google Analytics week for some reason. It was in the news and all over the blogs. Controversy *and* good stuff. I, of course, am going to focus on the controversy for this post. Nothing new here, but worth mentioning.

Article Number One. First, read The Disturbing Truth about Google Analytics on iMediaConnection. The post charges that the way Google Analytics collects much of its data is inaccurate, and that the metrics calculated from it (e.g., time on site) are therefore inaccurate too. The comments were generally negative. Brian Clifton, author of Advanced Web Metrics with Google Analytics, even said the post was incorrect. Very interesting post and comments overall.
Dainow asserts that the way GA measures visits is wrong because it includes bounces in this metric. I’m confused. Isn’t a visit a visit regardless of whether it’s also a bounce? According to the Web Analytics Association’s standards it is. I’d want to count a visit whether it’s 1 page or 14 pages. The bounces are just your single page visits.

[Post Update]
Thanks to Brian Clifton for clarifying what Dainow was asserting. From Brian’s comment below:

He [Dainow] is stating that, for metrics such as time on page and time on site, GA includes single page visits. The unwritten standard in the field of web analytics is to exclude single page (bounced) visits and the last page of a visit for such calculations …

… So most web analytics tools omit the last page visited for these types of calculations. And if the visit is only one page then the entire session is omitted. A single page visit is still tracked as a visitor, just excluded from time metrics.

So he was *not* stating that bounces should not be counted as a visit. Whew … I’m glad for that!
[End Post Update]
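
To make the convention Brian describes concrete, here is a minimal sketch in Python. It is not how GA or any other vendor actually computes the metric; the function names are made up for illustration, and it assumes you already have per-page durations in seconds, with an abandoned last page showing up as the 30-minute session timeout Brian mentions.

```python
from typing import List

# Inactivity timeout mentioned in Brian's comment (30 minutes), in seconds.
SESSION_TIMEOUT_SECONDS = 30 * 60


def time_on_site(page_durations: List[int], include_last_page: bool = False) -> int:
    """Sum the seconds spent on each page of a single visit.

    By default the last page is dropped, so a single-page (bounced)
    visit contributes 0 seconds.
    """
    if include_last_page:
        return sum(page_durations)
    return sum(page_durations[:-1])


def average_time_on_site(visits: List[List[int]]) -> float:
    """Average time on site, leaving bounced visits out of the average."""
    multi_page = [v for v in visits if len(v) > 1]
    if not multi_page:
        return 0.0
    return sum(time_on_site(v) for v in multi_page) / len(multi_page)


# Two one-minute pages, then a third page left open until the timeout.
visit = [60, 60, SESSION_TIMEOUT_SECONDS]
print(time_on_site(visit, include_last_page=True))  # 1920
print(time_on_site(visit))                          # 120
```

The bounced visit still counts as a visit. It is only kept out of the time calculation, which is the distinction the update above is making.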

I’ve given my thoughts on bounce rate before (depending upon the type of site, counting bounce rate by time might make more sense), but not counting single page visits (bounces) in with total visits doesn’t make sense to me.

Article Number Two. Another interesting article to read is Google Analytics — Yes, it is a security risk on TheRegister. With the Obama website hack as a backdrop, the basis of this article is that Google Analytics is a huge security risk for any website. I’m not sure I agree with that. Yes, any time you have third-party JavaScript on your site, you’re at the mercy of that company to refrain from doing evil with it, but that’s true with any business relationship. When you pay an invoice to any company, when you go to the mall and hand over your credit card, or when you buy that book through Amazon – you’re always at the mercy of that company.

A couple of points to consider – on secure pages, use the secure code. Don’t put the code on your admin pages (for those of you using a blog or some kind of content management system).
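
As a rough sketch of those two points, here is a hypothetical server-side helper in Python that decides which tracking script URL a page template should emit, if any. The function name and the admin path prefixes are my own assumptions for illustration. The two script hosts are the ones used by the standard ga.js snippet as I understand it, so check them against the snippet in your own account.

```python
from typing import Optional, Sequence
from urllib.parse import urlsplit

# Hypothetical admin locations; adjust to whatever your blog or CMS uses.
ADMIN_PREFIXES = ("/wp-admin", "/admin")

# Script hosts as used by the standard ga.js snippet (verify against your own).
SECURE_SCRIPT = "https://ssl.google-analytics.com/ga.js"
PLAIN_SCRIPT = "http://www.google-analytics.com/ga.js"


def tracking_script_url(
    page_url: str, admin_prefixes: Sequence[str] = ADMIN_PREFIXES
) -> Optional[str]:
    """Return the script URL to include on this page, or None for admin pages."""
    parts = urlsplit(page_url)
    for prefix in admin_prefixes:
        # Match the prefix itself or anything beneath it, but not
        # unrelated paths that merely start with the same letters.
        if parts.path == prefix or parts.path.startswith(prefix + "/"):
            return None
    if parts.scheme == "https":
        return SECURE_SCRIPT
    return PLAIN_SCRIPT


print(tracking_script_url("https://example.com/checkout"))
print(tracking_script_url("http://example.com/wp-admin/index.php"))
```

The standard snippet makes the same secure-versus-plain choice in the browser. Doing it in the template is just one way to also keep the code off admin pages.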

9 Responses to “Google Analytics – oh, the controversy”

  1. Brian Clifton on 24 Nov 2008 at 4:00 pm

    Hello Shelby – yes, the article by Brandt Dainow is factually incorrect. I wish he had reached out to those with a stronger connection to Google Analytics before taking such a strong (inaccurate) stance. Unfortunately, these days writing those types of headlines is a surefire way of generating traffic, which was probably the intention…

    To clarify what he is claiming, he is stating that, for metrics such as time on page and time on site, GA includes single page visits. The unwritten standard in the field of web analytics is to exclude single page (bounced) visits and the last page of a visit for such calculations.

    The reason is that time spent on the last page is so inaccurate to measure. For example, say I visit two pages on your web site each lasting one minute, then I get distracted on the third page and leave my browser open. GA will close my session from a tracking point of view after 30 mins of inactivity. That is also an unwritten industry standard and can be adjusted if required.

    Calculating the time on site for my visit will be horribly skewed if page 3 is included:
    60+60+1800 = 1920 seconds

    The reality is better calculated by omitting the last very inaccurate measurement:
    60+60 = 120 seconds

    So most web analytics tools omit the last page visited for these types of calculations. And if the visit is only one page then the entire session is omitted. A single page visit is still tracked as a visitor, just excluded from time metrics.

    Clearly that still is not 100% accurate, but it is a much better situation than including it. Please do have a look at the accuracy whitepaper I wrote on this subject as I would be interested in your comments: http://www.advanced-web-metrics.com/accuracy-whitepaper

    Hope that makes sense.

    Best regards, Brian

  2. Shelby Thayer on 24 Nov 2008 at 5:16 pm

    Brian – Thanks so much for your comment. It absolutely makes sense. I appreciate you clarifying what Brandt was stating. That actually makes me feel better. Even though his statement was factually wrong (which most of the readers who commented on his post stated, thank goodness!), I’m glad he wasn’t recommending that bounces shouldn’t count as visits. [I will update my post to add your comments.]

    I am aware of the time-on-site discrepancy and am looking to use the method that Avinash mentions in his time-on-site post (page unload event). I know this method is not GA-specific, but I’m wondering if you touch upon it in your book. I just bought it and am excited to start using it.

    I actually just downloaded your whitepaper over the weekend (when I read Brandt’s post). I haven’t read through the whole thing yet, but I will definitely give my comments on your blog.

    Thanks again. I appreciate your time.

  3. Brian Clifton on 25 Nov 2008 at 3:34 am

    Shelby – that sounds like an interesting A/B test – time metrics without the last page included v those with the last page included. Are you able to compare two different sections of your site at the same time (to limit seasonal fluctuations)?

    I would certainly be interested in the outcome of that. Nothing like that covered in the book unfortunately :(

    Best regards, Brian

  4. [...] Google Analytics – Oh, the controversy By Shelby Thayer This past week was Google Analytics week for some reason. It was in the news and all over the blogs. Controversy *and* good stuff. I, of course, am going to focus on the controversy for this post. Nothing new here, but worth mentioning. … [...]

  5. Brandt Dainow on 10 Dec 2008 at 7:36 pm

    While Brian states I am factually incorrect, I have an email direct from Google themselves confirming my statements, and explaining why they did this. It is easy for people to say how Google COULD do it, but if you want to accuse me of inaccuracy, you need to provide proof. I have proof that I am correct from Google itself. I could also quote the original email conversation I had with Google Analytics support when they did it, but I think the more recent email says it all. The following is from Elisabeth Diana, Global Communications & Public Affairs, Google, Inc.:

    “I wanted to follow up with you over email: regarding your piece in iMedia, you bring up some points that we think we can provide further insight into. Additionally, one of our latest features could help you address your concerns around time-on-site metrics.

    In terms of our methodology, we really tried to listen to what the majority of our users were saying, because they are the ones who live and breathe the tool. The GA team had heard different opinions about average time-on-site and visit calculations from our users, and while there were some different schools of thought on the issue, most GA users who provided feedback wanted bounces included in the visits and time on site metrics.

    If users prefer the time-on-site and overall visit calculations you outline in your article, we now have a way to address their preferences. We wanted to point you to our latest feature, Advanced Segmentation. Creating a segment that includes page depth greater than 1 (or greater than or equal to 2: eg. Time on Site & Visits & Conversions excluding Bounces) will yield site-wide averages and visit numbers that are in alignment with your ideas.

    I hope this has helped to give you further insight into how we gather methodology feedback from our users, and if you have any questions regarding segment creation (or any other questions, for that matter), please don’t hesitate to use us as a resource for information moving forward.

    Thank you,
    Elisabeth

    Elisabeth Diana | Global Communications & Public Affairs | Google, Inc.”

    So – if I’m wrong, why does the data match my assessment? And why did Google confirm my statements? If you still think I’m wrong, Brian, prove it.

  6. Shelby Thayer on 14 Dec 2008 at 10:26 pm

    Brandt – thank you so much for the comment. I guess at least now with advanced segmentation it’s easy to work around.

    My big issue with *all* analytics vendors is that they don’t disclose issues like this up front. No matter if the analytics tool includes or excludes bounces in time on site calculations, the calculation isn’t correct anyway (because it drops the last page). That just makes it *more* inaccurate.

    Maybe I’m a little naive here, but I don’t understand why they don’t disclose this issue and provide a workaround (like onunload in the js) up front.

    I know we all understand that these numbers aren’t precise and we need to be looking at trends, but if I run a blog and Ts is one of my KPIs, that could be a huge issue.

  7. Brian Clifton on 13 Jan 2009 at 8:39 pm

    replying to Brandt and then switching off from this discussion (sorry but it’s started to get a little silly…)

    As others have pointed out, the Brandt methodology looks good. However, your headline “The disturbing inaccuracy behind Google Analytics” is not true and sensationalist at best.

    The simple fact is that GA calculates time on site as described by Avinash here: http://www.kaushik.net/avinash/2008/01/standard-metrics-revisited-time-on-page-and-time-on-site.html

    That was correct at the time he wrote it (Jan 2008), was correct as far as I was concerned before that, and remains correct today. I cannot say if it changed or not in 200x but is that relevant? Vendors update their products all the time…

    How do I know this?

    Well, I would describe myself as fairly technical having begun my Internet career as a web developer (perl, php, javascript).
    My company was the first UK partner for Urchin (the product/company Google acquired in 2005 to become GA) and we remain a leading Google Analytics Authorised Consultant.
    I was Head of Web Analytics for Google EMEA (2005-8) and I wrote the book Advanced Web Metrics with Google Analytics.

    So I am saying I have the connections, know-how and experience to back up Avinash’s post, who of course is now part of the GA team.

    As an aside, this is another article written by Brandt Dainow in 2007. It’s very odd when you compare it to the content of his latest one…
    http://www.imediaconnection.com//content//15823.asp

  8. Brandt Dainow on 14 Jan 2009 at 9:53 am

    Well, this is getting pointless. Google say explicitly I am correct, and Brian says I’m wrong and he knows this because he used to work for Google (though not in a technical capacity).

    The answer’s simple – if you’re correct, Brian, show me the code.

  9. Brian Clifton on 14 Jan 2009 at 10:40 am

    Brandt – I don’t draw the same conclusion as you from Elisabeth’s email (Google’s PR dept.). There is no mention of how time-on-site is calculated. It’s a polite email saying thank you for your input and we have listened to feedback.

    I have no interest in proving the point any further. However, for further peer review, you may want to add your thoughts to the original post on this by Avinash.