Perspectives on the Comscore Data

In the past few days, a number of voices have emerged challenging the value of the recent Comscore data on social network use. danah’s interpretation is the one to read, and has sparked a good amount of discussion in the blogosphere.

The concerns about Comscore are twofold. First, danah and a number of other bloggers are concerned that Comscore’s methodologies are flawed. Comscore relies on users to install an agent that monitors computer use. The methodology assumes that users will sign into their proper accounts – the accounts are tied to demographic information. Obviously, this methodology has its limitations, but a lot of the noise in the data is taken care of when the sample size is as large as Comscore’s.

Bloggers seem to disagree on this point. WWWscope states that the samples are “not representative of internet usage.” Joe Duck, on the other hand, seems to feel that the methodology is sound, but he agrees with danah that the data just feels wrong. Irina, commenting to danah’s post, argues that Comscore’s random dial methodology is incorrect.

Personally, I believe that the data is sound but flawed. It is inherently imprecise, but as I examined previously, the data is more or less correct when analyzed properly. And when analyzed properly, all we are seeing is that lots of young people use Xanga, lots of college students use Facebook, lots of post-collegiates use Friendster, and everybody uses MySpace – which is more or less correct. The data is relative, and it isn’t speaking to absolute numbers, which is the key problem with the analysis. Finally, to Irina’s point about random dial methodology, this was an issue that came up in the 2004 elections, and was eventually proven to be fairly insignificant.
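
To make the relative-versus-absolute point concrete, here is a rough sketch of a composition index: an age group's share of a site's visitors divided by that group's share of the whole online population. This is my own illustration, not Comscore's method. The function name is made up, the 33.5% figure is the Facebook statistic discussed further down, and the roughly 40% baseline is the number comScore gives in the comments below.

```python
def composition_index(site_share: float, population_share: float) -> float:
    """Return 100 * (group's share of site visitors) / (group's share of all Internet users).

    Hypothetical helper for illustration; not part of any Comscore tooling.
    """
    if population_share <= 0:
        raise ValueError("population_share must be positive")
    return 100.0 * site_share / population_share


# Shares taken from this page: 33.5% of Facebook visitors are 35-54,
# against roughly 40% of U.S. Internet users in the same age band.
facebook_35_54 = composition_index(0.335, 0.40)

# An index below 100 means the group is under-represented on the site
# relative to the online population as a whole.
print(f"Facebook 35-54 index: {facebook_35_54:.0f}")
```

Read this way, a large raw share for 35-54 year olds can still mean that the group is under-represented on a site, because it is such a large slice of the Internet population to begin with.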

The second problem is in the analysis of the Comscore data. Bloggers Mashable and John De Mayo have found some issues in the interpretation of the data. Cnet is also following up on the issue. As far as I can tell, most influential bloggers have not taken a second look at the data, especially after GigaOm’s Liz Gannes “fact checked” the data with a partisan source. While Comscore’s data is imperfect, the real problem is in this incorrect analysis of the data. Now that the Associated Press has reported this analysis, and it has been echoed on places like Slashdot, it will be hard to convince people that the data wasn’t right to begin with – the meme has already spread.

It will be interesting to see if the A-list begins to change its tune on the data. As we’ve seen over and over[1], the A-list bloggers seem hopelessly out of touch with what actually goes on in youth-oriented social network sites. The fact they are willing to accept a statistic that 33.5% of Facebook’s users are 35-54 without so much as batting an eye is troubling. I wonder if the quality of analysis we get from the A-list is as bad across all sectors as it is in the social networking space, a space they really should excel at covering.

[1] A particularly telling example – GigaOm changed the title of this post from “Facebook makes itself useful” to “Facebook backlash” to revise their coverage.

5 comments

  1. Fred, I agree that this is more an issue about interpretation and analysis than limitations with comScore’s methodology. (all methods have limitations..) You, Danah, Joe, and others have been doing a nice job the past few days of breaking down the press release. And the debates are beneficial for future research and news reporting.

    I exchanged some of the following w/ Danah on email last night, and my observations are..

    - comScore may have slipped a bit by interchanging “users” and “visitors” in the press release. But this was nothing sinister or that misleading. I thought it was pretty clear they meant the latter and not the former, all the way through.

    We who follow these websites care about such differences and want this clarity. But I think the A-list reporting really got this off-base as they went much further in their own interpretations.

    - The press release was about demographics, not behavioral trends.

    This was a 30,000 ft from above bird’s eye view, so to speak – not an analysis of in-depth behavioral trends like intensity of usage or networking patterns — just visitation.

    - There are reasonable explanations to describe demographic changes in visits.

    Danah and others mentioned some of these, such as parents checking on kids. (but not registering or setting up accounts..) Also, the surge in popularity for MySpace Blogs and MySpace Videos, with accompanying use of external feeds, probably explain some of the changes as well.

    - I think our inclination (not without a little skepticism) should be to consider the reporting by third party tracking organizations, like comScore and Nielsen, rather than the website organizations themselves.

    there go some future research projects..

    Why is this? My experience with some of these social networking website companies, especially the younger startups, is that they don’t have good protocols and metrics (if any at all) for evaluating their users. This is eerily similar to my former work doing educational research analyzing achievement of K-12 schools and school districts – data collection and self-evaluations present big big problems.

    So I’m still believing comScore on “visits” rather than what these companies would say to us… Most of these social networking organizations have very crude ways of looking at their users. (most at no fault of their own, just lack of resources) comScore at least has had a sampling method that has been tested and re-tested over the years.

    It’s also likely to be in the social networking companies’ own best interests to maintain the conventional perception of the kinds of visitors or users on their sites.. My hunch is that if a lot of MySpace teens found out that they represent only 15-20% of who are on MySpace, that would be a turn off, and they might want to go to a majority teen site. I don’t think MySpace would be thrilled about such a development.. especially if a higher proportion of the teen cohort is registered compared to the older groups.

    Sorry that this is so long..

    This is a good debate.

    Thanks to you and others for shedding light on the importance of precise definitions, analysis, and thorough reporting.

    - Paul

  2. When this was brought up in last night’s talk on social networking, I was hoping you’d jump in to add some context to those numbers. I guess it’s not appropriate for the moderator to interrupt the panel…

  3. Paul – Fantastic analysis – I agree completely. It’s a great check to see someone with your experience checking in on this data. Thank you.

    Of note is that Michael Rubin of Comscore has posted the following on boyd’s blog. It is very illustrative:

    We (i.e. comScore) would like to clarify some of the issues and answer the questions being raised in this conversation.

    First and foremost, we don’t rely on the age that individuals submit when they register for a MySpace account. Our demographic data are based on the ages of the individuals in a household that we record when they join the comScore panel. That means we do not need to use the age the individual provides when they register at MySpace. Any inaccuracies inherent in that are not reflected in our data.

    Regarding the issue of “users” vs. “visitors”:

    * We use the terms interchangeably and do not mean to imply that a “user” of the site is necessarily a “registered user”.

    * As you rightly point out, our press release was talking about unique visitors. We anticipated there might be some confusion, so we made sure the headline clearly indicated visitors (“More than Half of MySpace Visitors are Now Age 35 or Older, as the Site’s Demographic Composition Continues to Shift”).

    * The data we highlighted in the release does not speak to engagement or intensity of usage — just visitation.

    Let’s put this whole story in context.

    More than anything, an aging visitor base speaks to the fact that MySpace has filtered into the mainstream. While older visitors may be less likely to be registered users, it’s still worth noting that they are being directed to the site one way or another:

    * In some cases, they are linked to people’s blogs at MySpace (especially from search results).

    * In other cases, they are being linked to videos. Our Video Metrix data shows MySpace is #1 in videos streamed in the U.S.

    * Or perhaps they are just curious to see what the buzz is all about and what their kids or grandkids or the media are talking about.

    We hope this clarifies some of the issues being raised here. If you have any further questions, please don’t hesitate to email us at press@comscore.com.

  4. Hi – thanks for the link.

    To be clear on my position: I agree that the actual numbers of visits Comscore gives may be roughly correct (or at least as “correct” as anything else).

    However, the demographic stats they provide are wildly inaccurate. Consider this: Joe Huskins says: “Our demographic data are based on the ages of the individuals in a household that we record when they join the comScore panel.” Since, according to their privacy policy, they don’t accept sign-up by users under 18, that means all they are measuring for 11-17 year olds is – at best – visits to MySpace on shared computers where the parent signed up for the MarketWatch toolbar. I simply don’t see how that group can possibly be anywhere near a proper sample of MySpace visitors.

  5. Michael Rubin, comScore

    Fred —

    Here is a copy of the comment we posted to clarify our answers and address more of the concerns Danah outlined. Thanks for reposting our first comment and for allowing us the opportunity to re-post here as well.

    Here is a link back to the whole thread on Danah’s blog:
    http://www.zephoria.org/thoughts/archives/2006/10/10/comscore_misint.html

    …Michael

    —————-

    Danah –

    Thank you for allowing us the opportunity of addressing your primary concerns. I’ve left your questions and comments as you wrote them, and highlighted our comments with hash marks (>>>).

    1) How do you guarantee who is logged into a particular computer when they visit any of these social sites?

    >>> Here is an outline of one of the proprietary methodologies we use to identify who is using a computer at any point in time.

    * We obtain explicit permission from our panelists to observe their Internet behavior, and we have the ability to see what they do from the beginning of a session until the conclusion of that session.

    In the vast majority of these sessions, there are behavioral indicators (e.g. a username, an email address, etc.) that allow us to passively identify the user and match that user with the specific individuals in the household according to the information provided by the panelist when they first joined the comScore panel.

    * For those sessions in which there is no indicator, that data is not included when compiling our age-based data.

    * Since we have more than 2 million Internet users in our panel who each engage in dozens of online sessions per month, we have a large sample to report accurately on demographics — even after excluding those sessions from the data set that are not individually identifiable.

    * As an ultimate validation of whether or not individuals are being identified correctly, we tabulated our data for single-person households only. The single-person household data revealed an almost identical visitor penetration and page usage pattern for MySpace.com among older age segments as we saw within the multiple-member households, confirming that any potentially inaccurate identification of individuals within multiple-member households (even if it did exist, which we believe it does not) is not a factor driving the older MySpace.com profile. <<<

    2) Why are 40% of all of your visitors in the 35-54 age range?

    >>> More methodology:

    * Every month, we run an enumeration survey based on a random sample of the U.S. population to determine the demographic characteristics of the Internet population. As is done in virtually all sample-based studies conducted in the market research industry, these enumeration data are used to calibrate the comScore panel to ensure it is demographically accurate.

    * Our enumeration surveys show that nearly 40% of the U.S. Internet-using population is in fact between the ages of 35-54. <<<

    3) Why are you emphasizing MySpace? Your data shows that ALL of the key teen sites are primarily visited by 35-54 when that makes absolutely no sense.

    >>> Part of the reason why we issued the press release is because the data are so fascinating. It certainly runs counter to the previous conventional wisdom that it’s primarily teenagers who visit MySpace. In this case, we have evidence to illustrate that visitors to social networking sites are actually older than one might expect. We emphasized MySpace because it is the biggest and most newsworthy site.

    * Please note that our recent release focused on visitation, and did not comment on usage intensity. In fact, our data does show higher engagement numbers (i.e. higher number of page views) among the younger demos. <<<

    4) You work in marketing. When tech companies talk about “users” they are talking about people who actively participate on the sites. When it comes to any social site with a login, this means people who actually login. Why do you conflate these terms if you anticipated confusion? It’s a press release.

    >>> There was no intent to conflate the terms “users” and “visitors”, but now recognizing that there could be confusion we acknowledge that this is a valid critique and something we will be more careful with in the future. Please look at the media coverage this story has generated and for which comScore offered commentary. We have been crystal clear on the distinction between “visitors” and “registered users”. <<<

    Basically, i don’t believe that the visitor base on ANY of these sites is aging and i have major concerns about your methodology. As a participant in the tech industry, i want to know how much i can trust you. Not a single one of these companies believes that this data is true. Not a single one. I do not believe that parents/ grandparents are logging in universally to Xanga, MySpace, Facebook, Friendster. I am not surprised that they are more likely to login to MySpace but you’re showing ridiculously high numbers across the board. It sounds to me as though you’re spinning a story that isn’t real.

    >>> Danah, comScore’s methodology has remained consistent, so we believe the age shifts we’re seeing are accurate. Even so, we’ve made sure we did our homework in validating the data before issuing it.

    I also want to point out comScore is not the only research company reporting that the MySpace visitor base is aging. In fact, eMarketer published an article today quoting NetRatings data as showing that 46% of the MySpace.com users are now over age 35, compared to 38% last year. NetRatings has a totally separate panel and a different data collection methodology from comScore. Bottom line, two independent databases are reporting that the MySpace visitor base is aging.

    We understand the nature of your skepticism, and have answered your questions in detail in an effort to bring more transparency to the situation. We welcome this kind of healthy discussion, and hope it has helped alleviate your concerns about the accuracy of our data. <<<
