Posts Tagged: statistics


31
Jan 07

You're not my Friend: A new look at Privacy on Facebook

In some of my earliest research, I analyzed privacy utilization on Facebook. My, how things have changed. As Facebook has grown into a social force on campus, reality has caught up with its users. Facebook has led the way by introducing very advanced privacy functions, which appear to be highly utilized by students. I wanted to revisit some of my earlier work, to answer the simple question – “What percentage of undergraduate students are employing profile-level privacy on Facebook?”

Answering this question required testing privacy from the perspective of the four dominant account types – Undergrad, Grad Student, Faculty and Staff. From each perspective, I used Facebook’s random browse technology to sample privacy utilization for 500 undergraduate accounts (total n=2000, in 4 experiments of 500). I wanted to answer a simple question – as an account-holder with no particular ties to the undergrad, what percent of undergrads utilize privacy to protect their account? The numbers that came back were quite interesting, and are presented in the following graph.

The changing face of privacy in Facebook

This graph shows that 19% of undergraduates protect their account from other undergraduates, 24% of them protect their profiles from grad students, 27% of them protect their profiles from faculty, and 29% protect their profile from staff. The margin of error on this sample is +/- 4%, so all of these have overlapping confidence intervals with the exception of other undergrads and staff.
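As a rough check on the interval claim, here is a minimal sketch in Python. It assumes the +/- 4% figure is the usual worst-case 95% margin for a simple random sample of 500; the post does not say how the margin was derived, so the formula and the function names are illustrative assumptions.

```python
import math
from itertools import combinations

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% interval for a sampled proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst-case (p = 0.5) margin for one experiment of 500 profiles.
moe_500 = margin_of_error(500)

# Reported privacy rates in whole percentage points, with the post's +/- 4.
rates = {"undergrad": 19, "grad": 24, "faculty": 27, "staff": 29}
MOE_POINTS = 4

def overlaps(a, b, moe=MOE_POINTS):
    """True when [a - moe, a + moe] and [b - moe, b + moe] share any value."""
    return abs(a - b) <= 2 * moe

# Pairs of viewer types whose intervals do not overlap at all.
non_overlapping = [
    (x, y) for x, y in combinations(rates, 2)
    if not overlaps(rates[x], rates[y])
]
print(round(moe_500 * 100, 1), non_overlapping)
```

Under these assumptions the worst-case margin comes out a little above 4 points, and the only pair of intervals that does not overlap is undergrad versus staff. The undergrad and faculty intervals just touch at 23%.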

I think the important thing to take away from these stats is that the importance of privacy is settling in with students. We’re no longer in a paradigm where privacy isn’t important to students; they are mindful of their privacy and are acting to protect their interests.

A couple notes on this data. First, it is a gross measure, only accounting for whole-profile privacy. Whole-profile privacy means you cannot browse any part of the profile. Essentially, these users have closed their profile to outsiders completely. It is likely that within the subset of found profiles, many of the students are employing privacy in one way or another (whether it be limiting their profiles, censoring their news feeds, etc.). Also, I am a little skeptical of Facebook’s random browse function, so treat this data as a point among many, rather than a definitive point. Finally, this data was collected at UNC, so appropriate disclaimers about generalizability, etc.


7
Jan 07

Pew on Social Networks: 7 out of 10 teens have non-public profiles

This afternoon, the Pew Internet and American Life Project released data from their recent survey of teen Social Networks use. Needless to say, the data is very interesting, and offers data affirming some of the themes we’re seeing in recent SNS research. danah has blogged her reaction to the data, and I’m sure this report will echo substantially through our corner of the blogosphere.

In my opinion, the key datapoint presented in this study is employment of private (friends-only or otherwise restricted) profiles by teens in social network sites. In the study, it is reported that 77% of teens have a profile available online, but 59% of teens restrict these profiles to their friends. This means that only 3 out of 10 teens have a profile that is “open” to be viewed online[1], affirming a recent report out of UW-Eau Claire that teens are effectively employing privacy strategies online.
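The "3 out of 10" figure follows from the two percentages given in footnote [1]. A small Python sketch of that arithmetic, using only the footnote's numbers (the variable names are mine):

```python
# Figures from footnote [1].
profile_visible = 0.77    # share of teens reporting a viewable profile
visible_to_anyone = 0.40  # share of those profiles open to all persons

# Share of all teens with a truly open profile.
open_share = profile_visible * visible_to_anyone

# Everyone else: restricted profiles or no viewable profile at all.
not_open_share = 1 - open_share

print(round(open_share, 3), round(not_open_share, 3))
```

Note that the complement lumps together teens with restricted profiles and teens with no viewable profile, so "7 out of 10" is a reading of the remainder rather than a separately reported Pew statistic.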

Beyond this interesting privacy statistic, a number of other trends emerge. In my social network predictions last week, I was called out a few times for saying that most people can only effectively maintain one or two profiles. The data from Pew clearly validates this (and no, I hadn’t seen Pew’s data when I wrote my predictions). The Pew report states:

Fully 85% of teens who have created an online profile say the profile they use or update most often is on MySpace, while 7% update a profile on Facebook. Another 1% tend to a primary profile on Xanga. Smaller numbers told us they have profiles at places like Yahoo, Piczo, Gaiaonline and Tagged.com.

Also interesting is how young people use social networks. As reported in much of the qualitative and survey-based SNS research, the grooming of friendships is a key motivator for SNS use. According to the report, the sites are heavily used for low-intensity friendship connections – they present an easy and efficient way to keep in touch with friends new and old, from far and near. Additionally, teens are motivated to update their sites frequently, with the study reporting that “a social network profile is more engaging if it changes frequently.”

The Pew report also reflects how modes of communication and access are changing with social network adoption. Friend-to-friend contact is now occurring via SNS public and private messages (it has been said many times before, but young people don’t email). Additionally, students change their behavior based on access context – i.e. school users vs. those who can access SNS at home.

The report includes some valuable summary statistics, including the following I found interesting:

  • 55% of online teens have created a personal profile online, and 55% have used social networking sites like MySpace or Facebook.
  • 37% of 12 and 13 year olds have SNS profiles.
  • Seven out of ten (70%) online girls 15-17 have created a profile, compared with 57% of older boys.
  • Almost half of social network-using teens visit the sites either once a day (26%) or several times a day (22%).

Among many things, this report reinforces the ubiquity of social networks. If I had to come up with a 10,000 foot overview of this report, it would be “young people are using social networks all the time, for everything.” At the same time, there are a number of interesting between-gender effects in the statistics. As it becomes clear that females and males use SNS differently, what does this mean for researchers and those developing commercial applications? All in all, it is a very interesting report; kudos to Amanda Lenhart and Mary Madden for their good work. You can pick it up free at the Pew site.

[1] 77% of teens report having a publicly viewable profile, 40% of which say it is viewable by all persons. This leaves 30.8% of the total with true publicly viewable profiles.


31
Oct 06

Friendship in Youth Social Networking

I came across a Harris Interactive poll entitled Friendship in the Age of Social Networking Websites that contained an interesting statistic – the average number of friends teens keep on respective buddy lists.

The poll found that teens have an average of 52 friends on their IM buddy list and 38 friends entered in their cell phone – but they have 75 friends in SNS. The poll also found that 75% of teens use SNS. This is a useful point of comparison for researchers interested in the nature of friendship on SNS. Are there transferable ratios between various communication devices that hold steady for young people? Can this shine light onto how many “real” friends teens have in social networking services?

In Ling and Yttri’s chapter “Control, Emancipation and Status: The Mobile Telephone in the Teen’s Parental and Peer Group Control Relationships”, they showed that the average Norwegian teen (groups 13-15 and 16-19) had 90 friends in their register. On a weekly basis, they called a little under 10% of those individuals. On a monthly basis, that ratio rose to about 25%. Of course, using a phone call as a measure of friendship is suspect – we likely have plenty of friends that we don’t call often, and this doesn’t make them less of a friend. However, the statistic is useful as it shows how many of those friends teens are maintaining active relationships with.

The other statistic from the Harris poll that interested me was a question about what types of friends teens and young people have. The poll found that for children 8-12, only 8% had friends whom they only knew online (web friends), but that number grew to 36% for teens 13-18. The poll also found that 37% of children 8-12 actively maintain online/offline friendships.


11
Oct 06

Perspectives on the Comscore Data

In the past few days, a number of voices have emerged challenging the value of the recent Comscore data on social networks use. danah’s interpretation is the one to read, and has sparked a good amount of discussion in the blogosphere.

The concerns about Comscore are twofold. First, danah and a number of other bloggers are concerned that Comscore’s methodologies are flawed. Comscore relies on users to install an agent that monitors computer use. The methodology assumes that users will sign into their proper accounts – the accounts are tied to demographic information. Obviously, this methodology has its limitations, but a lot of the noise in the data is taken care of when the sample size is as large as Comscore’s.

Bloggers seem to disagree on this point. WWWscope states that the samples are “not representative of internet usage.” Joe Duck, on the other hand, seems to feel that the methodology is sound, but he agrees with danah that the data just feels wrong. Irina, commenting to danah’s post, argues that Comscore’s random dial methodology is incorrect.

Personally, I believe that the data is sound but flawed. It is inherently imprecise, but as I examined previously, the data is more or less correct when analyzed properly. And when analyzed properly, all we are seeing is that lots of young people use Xanga, lots of college students use Facebook, lots of post-collegiates use Friendster, and everybody uses Myspace – which is more or less correct. The data is relative, and it isn’t speaking to absolute numbers, which is the key problem with the analysis. Finally, to Irina’s point about random dial methodology, this was an issue that came up in the 2004 elections, and was eventually proven to be fairly insignificant.

The second problem is in the analysis of the Comscore data. Bloggers Mashable and John De Mayo have found some issues in the interpretation of the data. Cnet is also following up on the issue. As far as I can tell, most influential bloggers have not taken a second look at the data, especially after GigaOm’s Liz Gannes “fact checked” the data with a partisan source. While Comscore’s data is imperfect, the real problem is in this incorrect analysis of the data. Now that the Associated Press has reported this analysis, and it has been echoed on places like Slashdot, it will be hard to convince people that the data wasn’t right to begin with – the meme has already spread.

It will be interesting to see if the A-list begins to change its tune on the data. As we’ve seen over and over[1], the A-list bloggers seem hopelessly out of touch with what actually goes on in youth-oriented social network sites. The fact they are willing to accept a statistic that 33.5% of Facebook’s users are 35-54 without so much as batting an eye is troubling. I wonder if the quality of analysis we get from the A-list is as bad across all sectors as it is in the social networking space, a space they really should excel at covering.

[1] A particularly telling example – GigaOm changed the title of this post from “Facebook makes itself useful” to “Facebook backlash” to revise their coverage.


6
Oct 06

What Comscore’s Traffic Numbers Really Mean

The recent Comscore analysis of social network websites’ audience is being widely and incorrectly reported across the blogosphere and news media. If you haven’t seen the report, here’s a quick glimpse at the takeaway statistic.

As you can clearly see, the statistic being reported is unique visitors, a common web analytics statistic indicating that an IP address has made at least one visit to a website. Unfortunately, the statistic being reported in the press is users, which is distinctly different from unique visitors.

Let’s consider the following case. A parent knows that their child has a Myspace page. That parent visits Myspace.com, attempting to learn about the service. In Comscore’s index, they would be validly counted as a unique visitor. However, in conflating unique visitor with user, the press and blogosphere are inaccurately assessing the age range of the site’s userbase.

If we can, let’s step back and think about this objectively. Do we reasonably believe that 33% of Facebook’s userbase is between the ages of 35 and 54? In browsing graduate student use of the Facebook, we can clearly see that even the grad student age range skews very young. Applying this logic to Xanga, a youth social network – do we reasonably believe that 34% of Xanga’s userbase is between 35 and 54? Of course not.

Comscore’s methodology is sound here, but the breathless conflation of user and visitor (guilty: Blogosphere, Media and Myspace’s press spokesperson [1]) is terribly misleading. So how can we explain this phenomenon? We can elaborate, and use the other age ranges as a check. Let’s first state a concept – that a majority of the 35-54 users are actually parents of young people, who are making visits to social websites to learn about the sites their children use. We’ll then state a hypothesis that these users are not actually joining these social networking sites in any significant trend. To support this hypothesis, we will examine the relationship between visit trends and age-appropriateness for each age range. If we determine that a positive relationship between visit trends and age-appropriate social network sites exists for our age ranges, we will declare that a relationship exists. We will then see if any similar relationships exist for our 35-54 age range.

First, let’s look at our other age ranges. A positive relationship will be defined as significant deviation from the mean for our age-appropriate SNS inside the age range. For our youngest users, Xanga is the appropriate SNS. For our college-age users, Facebook is the appropriate SNS. For our 25-34s, Myspace and Facebook are our appropriate SNS [2]. For each of our age ranges, we can see that visit trends deviate significantly from the mean for our context-appropriate SNS. This allows us to conclude that a significant positive relationship exists between “using” an SNS and deviation from the mean. Therefore, if no deviations from the mean exist in our 35-54 range, we will declare that no relationship exists – meaning that those age 35-54 do not actually “use” social networks “more” – i.e., the notion is meaningless.

As we look at our 35-54 age range, we see very little deviation from the mean across our four social networking services. This lack of variance indicates that parents demonstrate non-selectivity in social networking sites. Apparently, this age range simply “visits” all sites, and is non-preferential. Since social networking sites clearly target age demographics, doesn’t this seem a little odd? Of course it does – because our visitors actually aren’t becoming members of the site, they are simply visiting them for more information. The over-35 “members” of the sites are simply coming out in the wash.
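The deviation-from-mean check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 5-point threshold is an assumed stand-in for "significant", Myspace is left out per footnote [2], and every share below is a hypothetical placeholder except the 33.5% (Facebook) and 34% (Xanga) figures for 35-54 visitors that are quoted in these posts.

```python
from statistics import mean

def deviations_from_mean(shares):
    """Each site's share of visitors minus the mean share across sites."""
    avg = mean(shares.values())
    return {site: share - avg for site, share in shares.items()}

def shows_preference(shares, threshold=5.0):
    """True if any site deviates from the mean by at least `threshold` points.

    The threshold is an assumption; the post does not define "significant".
    """
    return any(abs(d) >= threshold
               for d in deviations_from_mean(shares).values())

# HYPOTHETICAL placeholder shares (percent of each site's unique visitors).
college_age = {"Facebook": 40.0, "Friendster": 20.0, "Xanga": 15.0}

# Facebook and Xanga values are the ones quoted in the posts;
# the Friendster value is a placeholder, not Comscore data.
age_35_54 = {"Facebook": 33.5, "Friendster": 34.0, "Xanga": 34.0}

print(shows_preference(college_age), shows_preference(age_35_54))
```

With these inputs the college-age group shows a clear preference while the 35-54 group does not, which is the pattern the argument rests on. Real conclusions would need the actual Comscore table and a defensible significance test.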

The point I’m getting at is this: we see a clear positive relationship between unique visitor variation from the mean for context-appropriate SNS. More young people visit Xanga. More college students visit Facebook. More twenty- and thirty-somethings visit Facebook. However, no 35-54 year olds demonstrate any preference (no significant positive relationship) toward any social networking site. These adults simply visit these sites, for many good reasons, but they aren’t joining them in a significant trend. This is a rejection of a hypothesis that “more” adults are joining social networks. If anything, we can say that adults 35-54 aren’t joining sites like Facebook, Friendster and Xanga.

While I think it is obviously the case that social networking is penetrating a wider demographic, the Comscore statistics can only tell us so much. They simply tell us who is visiting a site – for any reason – and they clearly don’t show a relationship between adult visits and actual adult use of SNS. The media has this quite wrong, unfortunately.

[1] In danah boyd’s talk, she explains the non-reliability of self-reported ages in Myspace. Young folks frequently self-identify as being the maximum age possible, a non-symmetrical relationship.
[2] For the purpose of this comparison, Myspace will be considered an outlier due to its size.


4
Oct 06

New SNS Privacy Statistics

The National Cyber Security Alliance has released a fairly interesting new study on social network privacy behaviors. The NCSA is a non-profit funded by government groups such as DHS and the FTC, and this report was released to kick off National Cyber Security Awareness Month.

The survey examines how much time we spend on social networks and how much information we share on social networks (and how we limit access/privacy). Additionally, it looks at our awareness of the possibility of our SNS data becoming public, and how we deal with things like unsolicited contacts.

The unsolicited contact stats are quite interesting, especially when broken down by age. The survey clearly shows that the younger groups are less likely to respond to unsolicited contacts. If this survey had a 13-18 age group, I’m almost certain the percentage of non-response to unsolicited contacts would be even higher than the 18-34. The notion that youths commonly respond to unsolicited contacts in social networks is clearly unfounded.

There are some other findings in the study that will be useful for SNS researchers, particularly with regard to use and disclosure behaviors. Additional statistics on parental monitoring will be useful as well. The NCSA oriented this study towards cyber-crime in SNS, which ultimately leaves me wishing they had explored some of the more complex issues in SNS. The NCSA is also guilty of hyping the fear/moral panic issue in their press release, declaring that “51 percent of parents aware of their children social networking do not restrict their children’s profiles so only friends can view, leaving their child’s profiles unrestricted to potential predators.”

Link to the survey (ppt) here. I’ve uploaded a pdf version if you’d like that instead.


7
Jul 06

Adopting the Facebook: A Comparative Analysis

Over the course of the past year, the Facebook has changed extensively. In addition to their basic social network offerings, they’ve introduced photo sharing and mobile – and they’ve expanded their network extensively to include things like businesses and high schools. Facebook’s strategic moves, coupled with an avalanche of press coverage (some good, some not-so-good), have truly elevated the service to a rare position; along with Myspace, the Facebook is becoming a household name. As we reach the mid-summer mark, I thought it might be interesting to explore how all these changes might affect the adoption of Facebook by its key demographic – inbound college freshmen.

As the Facebook has opened up its doors to high school students and businesses, you may wonder why college freshmen are still important to the Facebook; that is a relevant question. Although I obviously don’t have hard evidence (only FB has that), I do believe the inbound college population represents the first-class Facebook user. For these students, Facebook is situationally relevant – a topic I’ve explored extensively. And the adoption numbers I’ve seen among college freshmen populations do nothing to dissuade me of my belief. While Facebook high school and business are important, the college freshmen truly represent the feeder program of the Facebook.

In previous research (link to all my FB research) I found that the college freshmen at UNC-Chapel Hill adopted Facebook at a 94% rate; in doing that study, I found that the majority of freshman adoption was during the summer. The Facebook is truly a killer app for incoming freshmen – as they prepare to start a new life in a new place, surrounded by a new social network, the Facebook presents a highly interactive way to explore this new space. For those of us who sent snail-mail letters to our freshman year roommates, Facebook is everything we could have dreamed of and then some – not only can students know everything about their new roommates, but they can learn everything about their suite, their floor, and their dorm. This is information students need to know, and it helps them get situated in their new social networks.

As my previous study included data collected in summer 2005, I thought it might be useful to compare that data against data collected this summer. This study will compare snapshots of freshman use of Facebook on the UNC-Chapel Hill campus on June 27, 2005 and June 27, 2006. I wanted to answer a few basic questions – how is this new class adopting the Facebook compared to last year’s? Are there any new trends emerging? Seeing as the Facebook quite literally couldn’t get any more popular than it was last year, have students left the Facebook to seek refuge at other services? As these freshmen are, in my opinion, the true predictors of Facebook’s success – what does their adoption tell us about the health of the service in general? I think my results may surprise you.

First, a word about my methodology. The criteria for being a “freshman” in my survey are either 1) self-reporting a freshman status or 2) self-reporting graduation from high school the previous year. This is a slight expansion of my previous methodology (previously I only accepted people who self-stated their freshman status), but I wanted to get these numbers as close-to-correct as possible. Of course, Facebook users can make up information about themselves; there is no notion of validation in this survey, so please understand that as you digest this data. There is little incentive to lie in the Facebook, however; as Millen and Patterson found, virtual communities that are closely tied to real world communities create disincentives for deception. This data was collected automatically, though it could have been collected by hand; of course, no personally identifiable information is being re-reported in this study. Other researchers, such as Gross, Acquisti and Heinz and Jones and Soltren, have completed similar work. In both data sets, I am dealing with whole-network data; there is no sampling involved unless I state so. As things have changed in the Facebook over the course of the past year, I’ve had to make tweaks to the methodology, so I’ll try and make them as clear as possible.

Anyway, on to the data! I set about this research with a central question of how the incoming Freshmen class of 2006 was adopting the Facebook as opposed to the class of 2005. The first, and most important number is certainly adoption. On June 27 of last year, 1655 inbound freshmen had already created accounts at UNC. This year, 1715 had created new accounts as of June 27.

Facebook Snapshot: Freshmen Profiles as of June 27

Please note, Excel started this graphic’s axis at 1620, which exaggerates the difference. Although I can say for certain there are at least 1715 Freshmen who have created Facebook accounts in the class of 2006, the number is actually higher. In my class of 2006 extraction methodology, I could not include people who I couldn’t see; using a random sample of 500 freshmen generated from the Facebook’s browse function, I was able to extract a privacy rate of 6%, +/- 4. Therefore, there could be as many as 10% more freshmen than my study has found; however, since I have absolute data, I know the number is not less than 1715. Last year’s total of 1655 was irrespective of privacy – so that number is +/- 0. As the Facebook has changed over the past year, I’ve had to adapt my methods. The good news is the rest of this data will be a pretty straight comparison; I will just be comparing in-population statistics and not generalizing outside of them.
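To make the privacy adjustment concrete, here is a minimal Python sketch. It assumes the 1715 visible accounts are the non-private share of the true total, so that total = visible / (1 - privacy rate). That model is my assumption, not necessarily the calculation behind the "10% more" figure, which may simply be the visible count scaled up by the upper bound.

```python
def estimated_total(visible_count, privacy_rate):
    """Estimate all accounts, given the visible count and the hidden share.

    Assumes visible_count = (1 - privacy_rate) * total.
    """
    return visible_count / (1 - privacy_rate)

VISIBLE_2006 = 1715  # freshmen profiles counted on June 27, 2006

# Privacy rate of 6%, +/- 4 points, as reported from the sample of 500.
low = estimated_total(VISIBLE_2006, 0.02)
mid = estimated_total(VISIBLE_2006, 0.06)
high = estimated_total(VISIBLE_2006, 0.10)

print(round(low), round(mid), round(high))
```

Under this model the class of 2006 total falls somewhere between roughly 1750 and 1906 accounts, with about 1824 as the central estimate, against a floor of 1715.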

As the Facebook is a social networking service, the friend-making behaviors of students in the service are very telling; if students are making friend connections in the service, the service is healthy. Indeed, the friend statistics of my two populations were very interesting.

Freshman Friend Network Size Patterns in the Facebook

There are two statistics here – in-network and out-of-network. The in-network statistics represent friends at UNC; the out-of-network statistics represent friends at other institutions, businesses, high schools, geolocations – areas where the Facebook has expanded in the past year. As we can see, friend-making behavior in-network has not changed significantly, but out-of-network friend-making has increased quite substantially. The average freshman Facebook user at UNC now has 125 out-of-network friends. This is a very telling statistic which I will explore further – but let’s first explore the out-of-network phenomenon a little more.

The following chart represents the unique networks UNC freshman Facebook users participate in. In 2005, the only networks available to students were other colleges. In 2006, students can participate in high school, college, work and geolocated networks. As we can see, this expansion has been embraced by the freshmen.

Facebook's Expanding Reach

In the next chart, you’ll see that this year’s freshman class participates in an average of 15 more networks than last year’s. Combine these numbers with the fact that students are making more connections in these external networks, and you can see that the Facebook’s expansion has profoundly affected friend-making behaviors in the service.

Facebook's Expanding Reach, Part 2

I believe this new friend-making behavior is a key indicator of Facebook’s potential – and it means a few things. First, it may mean that the inbound freshmen who are joining UNC’s network already have accounts. Whereas last year’s class of Freshmen were starting anew on the service, this year’s class already has their high school Facebooks. When you compare the diversity of networks, and the dramatic increase in out-of-network friend connections, you see that students are already using their Facebooks before they get to campus. To test this hypothesis, I looked at Wall activity of the freshmen.

Average Wall Messages Per Freshman

The freshman Facebook user of 2005, who had just joined the service, had an average of 8.5 wall messages; this year, the average number of wall messages of a freshman Facebook user increased more than fivefold to 45.5 messages. Certainly, wall use behavior could have changed over the past year, but coupled with the data I’ve presented, I strongly believe this shows this year’s class of freshmen Facebook users are already experienced Facebook users.
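The year-over-year comparisons quoted so far reduce to two small calculations, sketched here in Python using only figures from this post (the helper names are mine):

```python
def fold_change(old, new):
    """How many times larger the new value is than the old one."""
    return new / old

def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100

# Average wall messages per freshman, 2005 vs. 2006.
wall_fold = fold_change(8.5, 45.5)

# Freshmen accounts at UNC as of June 27, 2005 vs. 2006.
account_growth = percent_change(1655, 1715)

print(round(wall_fold, 2), round(account_growth, 1))
```

The contrast is the point: visible account counts grew by only a few percent, while average wall activity grew more than fivefold.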

I believe this is very important as an indicator of Facebook use among its core feeder base. At face value, the in-network adoption numbers from last year to this year are quite similar – which shows that Facebook is again on course to 94% participation by freshmen in the service. Couple this with the fact that students are investing time into creating rich social experiences in other networks, and you’ve got a service that students are truly embracing. The discussion of adoption amongst freshmen is no longer “When will they join the Facebook” – it seems as if they are already there. Rather than students creating new accounts when they come to UNC as Freshmen, they are simply adding another network into their already-established profiles. As the networks the Facebook allows students to create are situationally relevant (answering their social information needs), it does seem that their expansion strategy has been quite successful.

For the purpose of my study, these numbers were incredibly enlightening. The Facebook has deeply penetrated its core market, and it does seem to be successful in attracting its most valuable audience – students who will take the Facebook with them through their life’s networks. Indeed, the Facebook is leaving the cloistered halls of academia, and doing so successfully.

As with all of these studies, I always like to look at a few data points that interest me. One of the most interesting things about the Facebook is the fact you can share your political orientation. It’s very interesting to me…I just feel that your politics may say more about you than anything else on your profile. So what are the politics of UNC’s inbound freshman Facebook users?

Political Orientation as Reported by Freshmen in The Facebook

Looks pretty much like last year, but there is a key difference. Last year, 71% of inbound Freshmen reported a political preference, whereas this year only 59% did. I take from this that students get the fact their politics say a lot about them – and they may be not mentioning their politics so it doesn’t close any doors to potential friendships. It’s interesting self-censorship.

I also felt it would be interesting to see what students are linking to as their external pages; in a statistic that really speaks to how college students feel about Myspace, only 5.6% of freshmen in the Facebook linked to a Myspace profile. Take from that what you will, but it truly fuels my belief that Myspace is looked down upon by Facebook users (this is not to say they don’t have Myspace accounts, but that they truly prefer their Facebook accounts).

Indeed, I’ve thrown a lot of information at you here. Thanks for reading this far. This research has been vastly enlightening to me, as I hope it has been for you. I fully believe we’re looking at another year of Facebook dominance of its core market; at the same time, I believe this data strengthens my hypothesis that Facebook works due to situational relevance. The Facebook answers the social information needs of students, and as a result, they love it. As parents, students and college administrators struggle to comprehend the Facebook phenomenon, I hope this data proves insightful. I have conducted this research to shed light on this area, a light that I believe is beneficial to all. I do believe the Facebook (or at least the Facebook’s situationally relevant model) is here to stay; understanding how our students adopt and use these social networking services is extremely important. While we may not be able to change how our students use the Facebook, the first step to guiding their behavior is understanding it. I do hope this has proved useful.

Fred Stutzman is a Ph.D. student at UNC-Chapel Hill’s School of Information and Library Science, and the co-founder of ClaimID.com. This report may be downloaded as a white paper.