The new year is off to a great start with a flurry of press coverage for both Freedom and Anti-Social. The coverage started with Pico Iyer’s wonderful New York Times piece, The Joy of Quiet. Iyer’s reflection on finding quiet in the modern world touched a nerve – in the new year there seems to be a coalescing sense of weariness around “connecting and sharing with people in our lives.”
Yesterday’s launch of the Google “+” suite of products was a pleasant surprise. Google’s “social network” project has long been rumored, and Google’s approach to social — a suite of independent tools — was forward-thinking. It is abundantly clear that Google has great minds working on this project; I enjoyed seeing Googlers I follow start tweeting about their parts of “+”.
The October 11 New Yorker features a review of current thinking on procrastination from James Surowiecki, and I’m pleased to note a brief nod to Freedom. The article is based on a new collection of essays on procrastination, edited by Chrisoula Andreou and Mark D. White. It is refreshing to read an article on procrastination that doesn’t get lost in causal claims about technology or how different everything is nowadays.
Last weekend’s Financial Times magazine also mentioned Freedom. Author Katie Roiphe describes an experiment in spending a week offline. Roiphe writes:
A man I meet at a party tells me about a software program called “freedom”. It asks you how long you would like to be offline (i.e. free) and you tell it, and then it disables your computer so you can’t get on to the internet for that time – or, in its words: “Freedom locks you away from the internet.” If you should suddenly need to go on the internet, you can restart your computer and disable the program, but it offers that extra bit of resistance; it is the superego, the self-control that you don’t quite have, or in its own slightly Orwellian terms, “Freedom enforces freedom”.
I’ll refer you to the article to see how the week offline goes.
One of the fascinating things about Craigslist is its informal post-sale sanctioning system. That is, if you don’t take down your post after you’ve sold the item, you get an increasingly annoying stream of emails from people asking questions about it. This continues, of course, until you actually remove the post. It is a great example of virtual community gardening.
Because of this sanctioning system, we can make a reasonable inference that items that have been taken off Craigslist have been sold. The items with short lifespans on Craigslist are desirable – they are a good value, priced properly – and those with long lifespans are either unwanted or improperly priced. I’ve recently been in the market for a used car (cough, a minivan), so I’ve been collecting information about the cars offered on Craigslist and their lifespans on the service. By looking at prices and lifespans (and a few other variables), can we automatically identify cars that offer the greatest value?
What follows are some charts from a simple survival analysis of the last 30 days of Honda Odyssey sales on Craigslist in Raleigh/Durham. The de-duped dataset includes 55 cars (out of about 130 posts). Before you read much into the data, note that many of the variables I explored (mileage, model year, etc.) weren’t significant predictors of “hazard” (that is, sale). If you were able to get this data on a larger scale, it does seem likely you’d be able to identify patterns of value. That said, there is a lot of randomness in a car’s quality once it has been driven, so the value of such a model-based approach would only be in prioritizing potentially under-priced cars.
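To make the idea concrete, here is a minimal sketch of the kind of survival estimate involved: a pure-Python Kaplan-Meier estimator over listing lifespans. The listing data is entirely made up for illustration (the actual analysis above used 55 de-duped cars), and posts still live at the end of the observation window are treated as censored rather than sold.

```python
# Kaplan-Meier survival estimate for Craigslist listing lifespans.
# A listing's "duration" is the number of days its post stayed up;
# event=True means the post was removed (presumed sold), while
# event=False means the post was still live when observation ended
# (censored), so we only know it survived at least that long.

from collections import Counter

def kaplan_meier(durations, events):
    """Return [(t, S(t))] stepwise survival estimates.

    durations: days each listing was observed on the site
    events:    True if removed (sale), False if censored
    """
    n = len(durations)
    # Removals (events) at each time, and total exits from the risk set
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)
    at_risk = n
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Standard KM product-limit update: S *= (1 - d/n_at_risk)
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]  # censored and sold posts both leave the risk set
    return curve

# Hypothetical lifespans (days) for ten minivan posts;
# False = post was still up when I stopped watching (censored).
days = [2, 3, 3, 5, 8, 12, 14, 20, 25, 30]
sold = [True, True, False, True, True, True, False, True, False, False]

for t, s in kaplan_meier(days, sold):
    print(f"day {t:2d}: S(t) = {s:.3f}")
```

A fast-dropping survival curve would mean posts disappear quickly (desirable, well-priced cars); a model like Cox proportional hazards could then test whether price, mileage, or model year predicts that hazard of sale.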
n.b.: You could also do this sort of analysis on want-ads. Want-ads have a great sanctioning system, as it is pointless to pay for an ad after you’ve sold your car.
p.s.: Perhaps what is charming about Craigslist is that there isn’t any meaningful historical data. This likely generates more variability in price, leading to the perception that you can find great deals (which you can!).
If you recall, a few years ago Facebook forced all users to select a gender if they wanted to continue using the site. This move generated a little controversy – some individuals didn’t feel comfortable with sharing the information, or fitting into a gender classification. Facebook responded:
However, we’ve gotten feedback from translators and users in other countries that translations wind up being too confusing when people have not specified a sex on their profiles. People who haven’t selected what sex they are frequently get defaulted to the wrong sex entirely in Mini-Feed stories. For this reason, we’ve decided to request that all Facebook users fill out this information on their profile.
Just today, I discovered (via the R Bloggers news feed) a video on the use of R in corporations like Google and Facebook. The representative of the Facebook data team talked about some exploratory data analysis they did in 2007. The finding? “If a user comes on more than once and is willing to give Facebook a very basic piece of information – their gender – that seems to be the strongest predictor of whether they will stay on the site.”
I’m not looking to stir up any controversy. Rather, I think it is an interesting example of analytics-based development, of research informing design. Of course, the challenge of translating research into practice is immense. Are there critical differences between individuals who share their gender and those who don’t? Did a forced gender-selection process invalidate the predictive model? Was the controversy over gender selection worth the predicted benefit? Perhaps Facebook’s 500 million users owe more to gender selection than we can imagine.
Anyway, the video has some age on it, but I did enjoy hearing about Facebook’s use of R (the other analytic examples provided are cited in the “Maintained Relationships on Facebook” report, plus there are a few ICWSM papers, I believe). You can find the full video here (it doesn’t look like embedding is supported).
Update: Please see the response from Cameron Marlow, Facebook Data Team lead, in the comments. Cameron provides great context for this finding.