Tech Note: LGF Pages Statistics Explained

Nine short months ago, Johnson put up a thread with this exact same title (littlegreenfootballs.com/article/37262).  In the interim, however, some things have changed, so The Boiler Room feels it’s appropriate to resurrect it and get our readers and the LGF lurkers back up to speed.

To start, we’re also going to borrow the visual aid from the thread:

1. The member “rating” counter has since moved to the bottom left of each page, and is just as meaningless as a popularity counter as the day these were implemented. Case in point, my Pam Geller page had a +13 rating…before Johnson rudely tossed it into the memory hole.

2. The counter that CJ and the lizards referred to as a “retweet” or “tweet” counter has been fact-checked and revealed to…not actually count tweets (rather, “clicks”).  In fact, the counter is not connected to Twitter at all. Compared to the tweet counters on virtually every other blog and website (including the one here at DoD), the number displayed here is going to be grossly inflated.

3. The “clicks” counter has been suspected to be FUBAR for quite some time (in fact, even Killgore could see it).  It was recently removed altogether by CJ under the lame guise of being “outdated” and “not meaningful”.  We think it was out of embarrassment (many of the counters displayed “0” clicks, even weeks after being published).

4. The “views” counter was fact-checked by The Boiler Room and also proven to be fraudulent, both in its unique technique, the “front page effect”, and in the way the custom-built counter reacted to visitors’ clicks. In combination, these factors result in what we estimate as an inflation of at least 4x over what a normal view counter would tabulate on a blog like ours.  However, the design is so convoluted and the methodology so disingenuous that we must admit that, even after tracking threads, observing traffic patterns and watching these counters, the exact inflation factor is hard to pin down.  In any case, like the “tweets”, the values are worthless in comparison to every other blog.

Looking at it again, the only statistic in that screencap that isn’t a complete joke is the timestamp (although even that is off by a few hours on the pre-’07 main page threads, which we understand to be a side-effect of the big SQL conversion).

And the other “counter” that isn’t visible there is the one that counts the Page’s comments, and that one appears to work (although the grand total counter for the LGF archives is still technically off by 32,531).

I guess the bottom line here is that CJ appears to be pretty awful at counting…anything.  But, hey, maybe the tech support at bit.ly will be just what the doctor ordered:


@lizardoid Ups #LGF Front Page Thread Count to 20

Rather interesting title, isn’t it? I mean, who cares, right? After all, the number of articles a blogger chooses to display on their front page is one of those trivial, purely cosmetic decisions. For every blogger, that is, except the one who runs the green football blog.

To elaborate, I think all I need to do is paste in a series of screencaps and graphics, and I think it will be obvious why this move is particularly significant, and why we can feel free to point and mock one more time.

Per Johnson, from last year (note that CJ mentions 10 front page articles):

Here’s what happened when we watched the “views” counter for a thread over time, after the front page articles were upped to 12 earlier this year:

And now, from just the other day:

Of course CJ didn’t explain it this way, but you don’t have to get into advanced mathematics to realize that, because of this “app”, the “views” for any given thread will increase in step with the number of articles on the front page (all other things equal): the more slots there are, the longer each thread sits there collecting a “view” from every front page hit.
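To put rough numbers on that, here is a minimal sketch in Python. It assumes only what CJ described (every front page hit adds one “view” to each thread on the front page) plus a steady posting rate; the function name and the traffic figures are hypothetical, picked to keep the arithmetic simple, and are not LGF measurements.

```python
def fpe_views_per_thread(front_page_hits_per_day: float,
                         threads_per_day: float,
                         front_page_slots: int) -> float:
    """Front-page "views" one thread collects before it is bumped off.

    Assumes a steady posting rate, so a thread stays on the front page
    for front_page_slots / threads_per_day days.
    """
    days_on_front_page = front_page_slots / threads_per_day
    return front_page_hits_per_day * days_on_front_page


# Hypothetical traffic: 10,000 front page hits and 5 new threads per day.
for slots in (10, 12, 20):
    print(slots, fpe_views_per_thread(10_000, 5, slots))
```

In this simplified model, going from 10 front page threads to 20 doubles the front-page share of every thread's counter without a single extra visitor.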

It’s kind of a simple, quiet and clever way to artificially bump those counters, especially since the explanation is buried in the comment section of some thread from last year, and for those who know about it, it’s difficult to explain in a single tweet.  In fact, if it wasn’t for The Boiler Room Crew, everyone would believe @lizardoid and his 217,000 claim as representative of -and comparable to- the way the “views” statistics are calculated on every other blog in the ‘sphere.  But as you can see from Johnson’s own explanation, it isn’t, and they aren’t.  It’s been a cheat since he implemented it, and by going to 20, he just turned up the dial (a lot).


Analysis: The LGF “Front PageView Effect”

Last week, we exposed an “error” in the way the custom-built LGF page view counter reacted to visitors’ clicks, and touched on a few other causes of page view inflation.  Since then, CJ appears to have corrected the IE problem that I demonstrated in our video (although we’re not convinced that this was the only “bug”), but what remains as the largest culprit behind inflated thread page view numbers is the one in plain sight: the “front page effect” (fpe).

First, we say “plain sight” because CJ did explain exactly how it works and admitted that it would significantly increase the page view number that is displayed at the top of each thread.  So, while this explanation was buried in the comment section of an unrelated thread, we can’t claim that this trick was snuck in without telling anyone about it.  For the sake of thoroughness, here is CJ’s comment one more time:

Instead, The Boiler Room was naturally curious if there was a way to quantify this effect, and therefore get an idea on the level of bias it adds when comparing page view numbers to all the other websites which don’t employ this technique (and/or when it is used to trash tweet).  Additionally, this kind of data might come in handy if another blogger was thinking about doing something similar. What we found is that this isn’t that hard to do with some sampling and a little statistical analysis. 

For our analysis, here’s what we have to work with:

  • CJ has set 12 threads to display on LGF’s front page (at the time the fpe was announced, it was set at 10).
  • Each front page thread gets a “view” count when the front page is hit.
  • The view counters are observable.
  • Each thread is timestamped to the minute.

We’ve also got some smart and resourceful people here in The Boiler Room, and we can set things up so that the data can be gleaned from automated samples and fed into a database to be charted and graphed.  In short, we can track the reported view increases for any LGF thread from publication until it drops off the front page (and beyond).

What CJ may or may not have realized is that, with just those few things, we can actually get a pretty good idea of levels and patterns in LGF’s front page traffic by simply tracking what happens to these page view counters over time.   Apply a little math and logic, and we can separate the approximate fpe number from the “real” views by applying 2 rules (and these are key, so they deserve bolding):

1. The fpe # can never be greater than the lowest view increase amongst the 12 front page threads over the sample period (except in cases where a new thread is published in between samples and yields the lowest number).  In other words, the increase from the “deadest” thread on the front page contains the highest % of fpe views.

2. The greater the sampling frequency, the more accurate our estimate of the fpe becomes, and the % of fpe views in the increase approaches 100.

For 1, we can’t assume that the lowest view increase # is 100%  front page views, rather that it still may include a few other views that come from click-throughs, referrers, searches, etc., but we know that it will be the closest to the true fpe #.  But based on observation, and knowing generally what happens to views as a thread ages and moves down the front page of a blog, along with the fact that we have 12 threads to sample for the “deadest” and do so frequently, we can say that it’s going to be a very close estimate. 

For 2, we realize that we must balance the effect that our own samples have on the data, as every time we take one we register a front page view ourselves, so we wanted to limit our influence to only 1-2% if possible.  This balance was found by taking samples a few times an hour.
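The trade-off in rule 2 comes down to simple arithmetic, sketched here in Python. The function names and the traffic figure in the usage note are hypothetical; the only thing taken from the analysis is that each sample registers as one front page hit.

```python
def sampler_share(samples_per_hour: float, other_hits_per_hour: float) -> float:
    """Fraction of all front page hits that come from the sampler itself."""
    return samples_per_hour / (samples_per_hour + other_hits_per_hour)


def max_samples_per_hour(other_hits_per_hour: float, max_share: float) -> float:
    """Highest sampling rate that keeps the sampler's share at or below max_share.

    Solves s / (s + F) = m for s, giving s = m * F / (1 - m).
    """
    return max_share * other_hits_per_hour / (1.0 - max_share)
```

For example, if the front page took a hypothetical 196 hits an hour from everyone else, `max_samples_per_hour(196, 0.02)` comes out at about 4 samples an hour, which is the "few times an hour" range.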

So, there you have the methodology.  Take snapshots of the view increase of a thread, and each time subtract the increase of the “deadest” thread on the page, and what you’re left with is the increase that couldn’t have come from fpe (therefore, “real” views).  Make sense?
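For anyone who wants to try this at home, the subtraction can be sketched in a few lines of Python. This is an illustration of the method as described, not the actual Boiler Room tooling: the `Snapshot` type, the `split_views` name and the sample numbers in the usage note are all hypothetical, and each snapshot is assumed to hold only the threads that were on the front page when it was taken.

```python
from typing import Dict, List

# One snapshot maps a thread id to the number shown on its "views" counter.
Snapshot = Dict[str, int]


def split_views(snapshots: List[Snapshot], thread_id: str) -> Dict[str, int]:
    """Split a thread's counter increase into an fpe estimate and a "real" floor."""
    fpe_total = 0
    real_total = 0
    for prev, curr in zip(snapshots, snapshots[1:]):
        # Only threads present in both snapshots have a measurable increase,
        # which also skips threads published between samples (rule 1's exception).
        common = set(prev) & set(curr)
        if thread_id not in common:
            continue
        increases = {t: curr[t] - prev[t] for t in common}
        # Rule 1: the "deadest" thread's increase is the ceiling on fpe views.
        fpe = min(increases.values())
        fpe_total += fpe
        real_total += increases[thread_id] - fpe
    return {"fpe": fpe_total, "real": real_total}
```

With made-up counters such as `[{"a": 100, "b": 500, "c": 900}, {"a": 160, "b": 530, "c": 925}, {"a": 200, "b": 560, "c": 950, "d": 10}]`, thread `"a"` gained 100 "views" in total, of which the sketch attributes 50 to the fpe and 50 to "real" views. Because the "deadest" thread may itself have picked up a few genuine clicks, the "real" figure is a floor rather than an exact count.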

But, before we reveal the graph and the data, we should ask ourselves: Knowing about this fpe effect, what would we expect the page view counter increases for any given thread on a relatively popular community-style blog like LGF to look like, from the time it’s first published to where it later moves down (and eventually off) the front page? 

A: We’d expect it to increase very rapidly when first published, because in addition to the fpe, you have the lizards and lurkers who will click through to the comments, and the outside referrers (from twitter, other blogs etc.), and refreshing while the thread is “active”.  Then, as the thread ages and moves down the front page, we’d expect the increases to level off slightly, as the extra views from this thread activity die off and you’re left with mostly fpe views increasing the counter steadily (with “waves”, as time of day will affect front page view rate) until it reaches the bottom of that front page.  Finally, we’d expect the increases to virtually flat-line the minute it is bumped off the front page and becomes thread #13, as it will no longer get fpe increases.

And what would we expect a non-fpe counter to look like for the same thread? 

A: We should also see a steep increase at first (although not as steep, and not in the same quantities, obviously), and see that taper off as it becomes older and moves down the front page.  After the thread got to be a day old or about 4 spots down on the front page, the thread would essentially be dead for most commenting activity, but we should still see some increases from delayed lurker click throughs, lizards coming back to read comments they missed, searches, etc., and perhaps even a “bump” if/when it sees late hits from other sites.  It’s obviously going to vary a bit by the nature of the thread (for example, we wouldn’t expect an “open thread” to get late traffic from outside referrers, where others may get a lot more; so again, 12 to sample from helps), but for the most part, “real” page view increases should reduce themselves to a creeping pace with periodic bumps by the time the thread is a day old.

Well, we tracked and charted one, so what did we find?

Using a random thread that shall remain anonymous*, from the moment it was hatched to beyond the front page (the #s indicate the changes in its position on the page):

The red line represents page views recorded from the counter.  Now, remember that, per rule #1, the blue and green lines are estimates; the exact values are much more difficult to pin down.  Again, this blue line represents the lowest “real” views could possibly be, and the true line is undoubtedly a little higher for this particular thread (if another thread were sampled, we may see a blue line that is significantly closer to the green). But, since we believe that our methodology is sound, we can say that we’re darn close (to the point where you wouldn’t see much difference in the graph).

A lot of this is fairly intuitive, since the effect stipulates that these dead threads will keep accumulating “views” as long as they’re on the front page.  No one should be surprised to see the view counters on these threads show higher and higher numbers as you scroll down to #12, simply because those threads have been there longer.  So the effect is fairly clear to anyone who stops by LGF and takes a quick glance at all the view counters.

In conclusion, the point of this exercise was not to prove beyond a doubt that thread 37xxx really got only x number of “real” views (as most blogs count them), but to demonstrate the magnitude of the fpe inflation, and show that the technique renders the individual view counters meaningless.  Specifically, the “Front PageView Effect” puts so much weight on the counters that you can’t discern if one thread has a higher count than another because it was particularly insightful/important, or because, thanks to thread scheduling, it just happened to sit on the front page longer.  That’s why normal blogs have a separate counter for “front page” views, and probably the biggest reason why a claim like this

is rather ridiculous, and deserves to be smacked down.

*the thread # is anonymous for IP security reasons

(Hat tip: The Boiler Room)


CJ’s Moment To Own Up

Two days ago, we published the Boiler Room fact-check on the “errors” in LGF’s PageView counter system. Since then, we’ve been linked to by a few other blogs, but so far the Grand Lizardoid has been eerily silent on the issue. How come? 

Well, our friends on the inside discovered that Charles did in fact address it, but like the totally honest and straightforward dude we’ve come to know and mock, did so from the (presumed) safety of an unannounced [private] thread:

“Faked”? Yes, somehow, I was able to get that counter to jump up 13 spots in less than 40 sec. on an ad-infested LGF, by loading it from another browser, while talking and scrolling and clicking on the visible link at the same time.  I have toes, after all.  But which lizard should try it per CJ’s request?   Well, our old friend Gus 802 stepped up to the plate:

Gus even tweeted it!  And by the looks of it, he was angry:

Am I busted? Could a “used car salesman” have possibly pulled off such an elaborate hoax? 

 Well, the next day, CJ quietly went back to the private thread, and left this:

Aw shucks!  Soo…no fabrications.  No fraud.  No bullshit. …No faked video.  I guess they’ll have to wait a while longer to catch us makin’ something up  (like that line about I.E. usage, ’cause I’m curious what “almost nothing” means).

Anyway…there.  We owned up on CJ’s behalf.  Lizards, you’re welcome.

Oh, and lizards, one more thing; when we start pointing out that there may be other, um, “bugs”, try to be cool about it, OK?  You don’t want to make these smackdowns too easy for us, right?

Update:  I have acquired the screencap of the comment that CJ was responding to, from Claire:

Update: Kudos to Claire, BTW, ’cause the timestamp revealed that it wasn’t until the next day that someone decided to fact check and honestly report that CJ was mistaken.

Also, I’ve decided to include my 2nd video again, because I feel it better demonstrates how silly the notion is that I “faked” anything in the first place, and because it wasn’t added until later and I don’t know if everyone caught it: