If you recall, a few years ago Facebook forced all users to select a gender if they wanted to continue using the site. This move generated a little controversy – some individuals didn’t feel comfortable with sharing the information, or fitting into a gender classification. Facebook responded:
However, we’ve gotten feedback from translators and users in other countries that translations wind up being too confusing when people have not specified a sex on their profiles. People who haven’t selected what sex they are frequently get defaulted to the wrong sex entirely in Mini-Feed stories. For this reason, we’ve decided to request that all Facebook users fill out this information on their profile.
Just today, I discovered (via the R Bloggers news feed) a video on the use of R in corporations like Google and Facebook. The representative of the Facebook data team talked about some exploratory data analysis they did in 2007. The finding? “If a user comes on more than once and is willing to give Facebook a very basic piece of information – their gender – that seems to be the strongest predictor of whether they will stay on the site.”
I’m not looking to stir up any controversy. Rather, I think it is an interesting example of analytics-based development, of research informing design. Of course, the challenge of translating research into practice is immense. Are there critical differences between individuals that share gender and those that don’t? Did a forced gender-selection process invalidate the predictive model? Was the controversy over gender selection worth the predicted benefit? Perhaps Facebook’s 500 million users owe more to gender selection than we can imagine.
Anyway, the video has some age on it, but I did enjoy hearing about Facebook’s use of R (the other analytic examples provided are cited in the “Maintained Relationships on Facebook” report, plus there are a few ICWSM papers, I believe). You can find the full video here (doesn’t look like embed is supported).
Update: Please see the response from Cameron Marlow, Facebook Data Team lead, in the comments. Cameron provides great context for this finding.

Thanks for the reference Fred.
Perhaps you and your fellow wordpress.org math bloggers would also be interested in my free and open-source math publishing plug-in. It is an easy way to add equations to a math conversation, in both blog posts and comments:
http://www.embeddedcomponents.com/blogs/wordpress/wpmathpub/
ha! More of Facebook claiming that they “have to” do things to make the site work, but really they just want more data. I’ve been on Facebook almost since it was first expanded to my then-college, and I have never answered that question.
I don’t know what it does in languages that are more strongly-typed for gender, but in English it just uses ‘their’.
You can find a list of the papers we’ve published on the data team page:
http://www.facebook.com/data?v=app_4949752878
Also, on Itamar’s statement, I think the right way to interpret it is that some people are willing to disclose more information than others, perhaps signaling their interest or investment in the service. If we had allowed age or name to be optional, I am confident that the same effect would have been found.
Glad to see the video is online though!
Thanks for giving credit to R-bloggers.com
Cameron – thanks for the link. There are a few papers I haven’t seen there.
I assume that sharing gender isn’t just an effects code for “populating the service minimally.” If people who don’t share gender largely have blank profiles and don’t come back, the finding isn’t that interesting. But if there were similar disclosure patterns between the treatment groups and gender really did have a big effect, there’s something interesting there. To me – at least – sharing gender doesn’t seem like a big step (compared to some of the other commonly shared elements in FB). We fill in gender buttons in forms all the time. That is why it sticks out to me – but I’m going on heuristics.