03
Dec 10

Dissertation Defense, December 8

After months of extensive research, I have proved that blogging and writing a dissertation have an inverse relationship.

I’m happy to invite you to my dissertation defense, December 8 at 10AM, in Manning 014. This is a small room so seating might be limited. Full information is available on the SILS website.

I hope to post my dissertation in a few weeks. I will share it here when it is ready for public viewing.


04
Oct 10

New Yorker on Procrastination (and Freedom)

The October 11 New Yorker features a review of current thinking on procrastination from James Surowiecki, and I’m pleased to note a brief nod to Freedom. The article is based on a new collection of essays on procrastination, edited by Chrisoula Andreou and Mark D. White. It is refreshing to read an article on procrastination that doesn’t get lost in causal claims about technology or how different everything is nowadays.

Last weekend’s Financial Times magazine also contained mention of Freedom.  The author Katie Roiphe describes an experiment spending a week offline.  Roiphe writes:

A man I meet at a party tells me about a software program called “freedom”. It asks you how long you would like to be offline (i.e. free) and you tell it, and then it disables your computer so you can’t get on to the internet for that time – or, in its words: “Freedom locks you away from the internet.” If you should suddenly need to go on the internet, you can restart your computer and disable the program, but it offers that extra bit of resistance; it is the superego, the self-control that you don’t quite have, or in its own slightly Orwellian terms, “Freedom enforces freedom”.

I’ll refer you to the article to see how the week offline goes.


14
Sep 10

Upcoming Talk – UC Irvine

On Friday, September 24, I’ll be presenting the following Informatics Seminar at UC Irvine’s Department of Informatics:

Title:
Socio-Technical Support Networks During Life Transition

Abstract:
Modern life is characterized by transition.  Completing education, moving between jobs and residential relocation are examples of the transitions that challenge us, enable personal growth, and facilitate the construction of our life stories.  Successful adaptation to transition is a function of social-informational processes.  During a transition, individuals are challenged to make sense of their transitional environment, while developing socially supportive resources that aid in transition.  Large-scale adoption of social media, and resultant tightly-coupled mediated sociality has the potential to facilitate life transition; through social media, individuals are able to answer situationally relevant information needs, while drawing on extended support networks.  Using observational data collected during one such transition – the transition to college – this project explores social network site information practices during life transition.  In particular, I explore the dynamics of network configuration during the early stages of transition, identifying factors relevant to the assemblage and growth of socio-technical support networks.  I then explore the outcomes of social network site use during transition, identifying information behaviors associated with adaptation to transition.

I believe the talk will be at 3PM.  I’m looking forward to visiting the Department of Informatics, as well as meeting with faculty and students.  I’ll update the post with location information as we get closer to the date.


23
Aug 10

Next Steps

I’m pleased to report that I have accepted an offer to join Carnegie Mellon University’s Heinz College as a post-doctoral fellow.  At Carnegie Mellon, I will be working with Alessandro Acquisti.  I have been following Alessandro’s excellent work on privacy and technology for many years, so I am thrilled to join his team and have him as a mentor.

Alessandro’s team has extensive experience studying privacy in online social networks.  Alessandro and Ralph Gross wrote one of the earliest (and most cited) Facebook privacy papers: Imagined Communities: Awareness, Information Sharing, and Privacy on the Facebook. Last summer, the team published a truly head-turning study, showing that information gleaned from social network profiles could be used to predict social security numbers.  Most recently, Alessandro’s work was featured in Jeffrey Rosen’s New York Times Magazine article The Web Means the End of Forgetting.

I look forward to building on my current areas of research – privacy, identity and support in social networks – while being exposed to new opportunities and new challenges at CMU.  Speaking of challenges, the next challenge is a dissertation defense (later this fall) and then a move to Pittsburgh.  It has been a while since I’ve been to Pittsburgh, so I’m open to advice!


16
Aug 10

Pricing a used Honda Odyssey

One of the fascinating things about Craigslist is its informal post-sale sanctioning system. That is, if you don’t take down your post after you’ve sold the item, you get an increasingly annoying stream of emails from people asking questions about it. This continues, of course, until you actually remove the post. It is a great example of virtual community gardening.

Because of this sanctioning system, we can make a reasonable inference that items that have been taken off of Craigslist have been sold.  The items that have short lifespans on Craigslist are desirable – they are a good value, priced properly – and those with long lifespans are either unwanted or improperly priced.  I’ve recently been in the market for a used car (cough, a minivan), so I’ve been collecting information about the cars offered on Craigslist and their lifespans on the service. By looking at prices and lifespans (and a few other variables), can we automatically identify cars that offer the greatest value?

What follows are some charts from a simple survival analysis of the last 30 days of Honda Odyssey sales on Craigslist in Raleigh/Durham. The de-duped dataset includes 55 cars (out of about 130 posts). Before you read much into the data, note that many of the variables I explored (mileage, model year, etc.) weren’t significant predictors of “hazard” (that is, sale). If you were able to get this data on a larger scale, it does seem likely you’d be able to identify patterns of value. That said, there is a lot of randomness in a car’s quality once it has been driven, so the value of such a model-based approach would only be in prioritizing potentially under-priced cars.
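For readers who want to try this kind of analysis, here is a minimal Kaplan-Meier sketch in Python. It is not the code behind the charts, and the lifespans below are invented for illustration rather than taken from the Raleigh/Durham data. The helper name `kaplan_meier` is hypothetical. A post that is still up when observation ends is treated as censored (not yet sold).

```python
from collections import Counter


def kaplan_meier(durations, sold):
    """Return [(day, share_still_listed)] at each day with a sale.

    durations: days each listing stayed up.
    sold: True if the post was removed (inferred sale), False if it was
          still up when observation ended (censored).
    """
    sales = Counter(d for d, s in zip(durations, sold) if s)
    leaving = Counter(durations)  # listings that sold or were censored that day
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for day in sorted(leaving):
        if sales[day]:
            # Multiply by the share of at-risk listings that did NOT sell.
            survival *= 1 - sales[day] / at_risk
            curve.append((day, survival))
        at_risk -= leaving[day]
    return curve


# Hypothetical data: lifespans in days and whether the post came down.
days = [2, 3, 3, 5, 8, 8, 12, 30]
sold = [True, True, True, False, True, True, True, False]

curve = kaplan_meier(days, sold)
for day, share in curve:
    print(f"day {day:2d}: {share:.3f} of listings still up")
```

Comparing curves like this across price bands or model years is one rough way to see which listings move fastest.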

n.b.: You could also do this sort of analysis on want-ads. Want-ads have a great sanctioning system, as it is pointless to pay for an ad after you’ve sold your car.

p.s.: Perhaps what is charming about Craigslist is that there isn’t any meaningful historical data. This likely generates more variability in price, leading to the perception that you can find great deals (which you can!).


04
Aug 10

Why Gender is Important in Facebook

If you recall, a few years ago Facebook forced all users to select a gender if they wanted to continue using the site.  This move generated a little controversy – some individuals didn’t feel comfortable with sharing the information, or fitting into a gender classification.  Facebook responded:

However, we’ve gotten feedback from translators and users in other countries that translations wind up being too confusing when people have not specified a sex on their profiles. People who haven’t selected what sex they are frequently get defaulted to the wrong sex entirely in Mini-Feed stories. For this reason, we’ve decided to request that all Facebook users fill out this information on their profile.

Just today, I discovered (via the R Bloggers news feed) a video on the use of R in corporations like Google and Facebook. The representative of the Facebook data team talked about some exploratory data analysis they did in 2007. The finding? “If a user comes on more than once and is willing to give Facebook a very basic piece of information – their gender – that seems to be the strongest predictor of whether they will stay on the site.”

I’m not looking to stir up any controversy.  Rather, I think it is an interesting example of analytics-based development, of research informing design.  Of course, the challenge of translating research into practice is immense.  Are there critical differences between individuals that share gender and those that don’t?  Did a forced gender-selection process invalidate the predictive model?  Was the controversy over gender selection worth the predicted benefit?  Perhaps Facebook’s 500 million users owe more to gender selection than we can imagine.

Anyway, the video has some age on it, but I did enjoy hearing about Facebook’s use of R (the other analytic examples provided are cited in the “Maintained Relationships on Facebook” report, plus there are a few ICWSM papers, I believe).  You can find the full video here (doesn’t look like embed is supported).

Update: Please see the response from Cameron Marlow, Facebook Data Team lead, in the comments. Cameron provides great context for this finding.


21
Jul 10

iTunes vs. Amazon as Survey Incentive

When surveying college-age students, Amazon and iTunes e-gift cards are frequently offered as incentive for participation [1]. While I’ve frequently heard that students prefer iTunes, the administrative burden of sending iTunes gift cards is high. The iTunes store limits each account to $100 in gift card purchases per month, so if your compensation needs go over $100, you have to schlep to the store, buy gift cards, and put them in the mail. Amazon, on the other hand, offers an effortless interface for sending gift cards and does not appear to have an unreasonable monetary restriction. So if you choose the ease of Amazon over the shiny iTunes brand, do you lose anything?

Recently, I ran a survey of first-year students at UNC that tested preferences toward compensation. The survey offered a dual-tier lottery compensation: participants were entered to win an iPod touch or their choice of three gift certificates (see [2] for more on dual-tier incentives). The three gift card choices were iTunes, Amazon, or a popular on-campus cafe, in the amount of ten dollars. Response to the survey was good, by email-solicitation standards, at 31% (n~1200). Males were slightly underrepresented, as is commonly the case.

So, what gift cards did my students prefer?  Clearly, the students preferred gift cards to iTunes (n=442) and Amazon (n=442) over the local cafe (n=131).  And we don’t really need any significance tests to see that the difference between iTunes and Amazon is a wash (p=.8406).

When conducting surveys, we’re not always interested in a large homogeneous population. Sometimes we’re interested in sub-populations, such as certain genders, ages, or ethnicities. Breaking the preferences out by gender, visual inspection indicates that female students prefer iTunes over Amazon, while male students prefer Amazon over iTunes. Since neither population comes close to preferring the local cafe, I will focus on the difference between iTunes and Amazon for the rest of the analysis (i.e. drop the people who prefer the Local Cafe).

Of the students that selected Amazon or iTunes, we see that 53% of female students prefer iTunes, 47% Amazon. Of males, 58% prefer Amazon, 42% iTunes. The Chi-square test indicates a relationship between gender and preference (p=.001), and within-gender Chi-square goodness of fit tests indicate that while the female student preference difference is not significant (p=.0922), the male preference towards Amazon is significant (p=.0064).
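The two kinds of tests mentioned here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis I ran: the counts are hypothetical, the function names are made up for this example, and the independence test is the plain Pearson statistic without a continuity correction. With one degree of freedom, the upper-tail p-value can be computed from the standard library's `math.erfc`.

```python
import math


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def goodness_of_fit_even_split(a, b):
    """Test whether two counts differ from an even 50/50 split."""
    expected = (a + b) / 2
    chi2 = (a - expected) ** 2 / expected + (b - expected) ** 2 / expected
    return chi2, p_value_df1(chi2)


def independence_2x2(table):
    """Pearson chi-square test of independence on a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2, p_value_df1(chi2)


# Hypothetical counts. Rows: female, male. Columns: iTunes, Amazon.
counts = [[60, 40],
          [40, 60]]

chi2, p = independence_2x2(counts)
print(f"gender x preference: chi2={chi2:.2f}, p={p:.4f}")

for label, (itunes, amazon) in zip(["female", "male"], counts):
    chi2_g, p_g = goodness_of_fit_even_split(itunes, amazon)
    print(f"{label} within-gender: chi2={chi2_g:.2f}, p={p_g:.4f}")
```

The first test asks whether preference depends on gender at all; the within-gender tests ask whether each group, taken alone, departs from an even split.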

To test some higher order interactions, I employed a logistic regression model to test the effects of gender and a few other covariates. First, since much of my sample is from NC, I tested to see if NC residency might contribute towards a preference. In this model, gender remained significant, but NC residence was not significant (p=.828). Next, I looked to see if GPA might be a factor in preference. Gender remained significant, and GPA’s p-value was low (p=.081) but did not reach significance (the direction was higher GPAs towards Amazon).

In the last two models, I looked at ethnicity and age. In the ethnicity model, gender is significant, and only one ethnicity is significant. Compared to other ethnicities, students who self-report as Asian demonstrate a preference towards Amazon (OR=.158, p<.001). With age, gender again remained significant, but 19 year old students (compared to 18 year old students) seem to prefer iTunes (OR=1.49, p=.004). Notably, a gender by age interaction was not significant.
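If odds ratios are unfamiliar, this small Python sketch shows how one shifts a probability. It assumes the model's outcome is choosing iTunes (so an OR below 1 leans towards Amazon), and the 50% baseline is a hypothetical value picked for illustration, not an estimate from my data. The function name `apply_odds_ratio` is made up for this example.

```python
import math


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Shift a baseline probability by an odds ratio and return the new probability."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


baseline = 0.5  # hypothetical share of the reference group choosing iTunes

for label, odds_ratio in [("OR=1.49", 1.49), ("OR=.158", 0.158)]:
    coefficient = math.log(odds_ratio)  # the corresponding logistic regression coefficient
    shifted = apply_odds_ratio(baseline, odds_ratio)
    print(f"{label}: coefficient={coefficient:+.3f}, P(iTunes)={shifted:.3f}")
```

The same odds ratio moves the probability by different amounts depending on the baseline, which is why an OR is not a simple percentage change.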

To briefly review, it seems that among my population, the anecdotal preference towards iTunes is just that: anecdotal.  This is good news for me, because it is much more complicated to process iTunes gift cards than Amazon gift cards.  Some final notes: This is not really a proper experiment – such an experiment would use completely randomized solicitation.  Also, the presence of the third category (Local Cafe) is potentially troubling if being a fan of a Local Cafe also correlates to, say, being an iTunes fan or an Amazon fan.  Caveat emptor, blog post, not peer reviewed, etc.

1. I don’t have a citation for this, but I do monitor a number of email lists that frequently offer research solicitations. YMMV.

2. See Li, Kaiwen (2006).  Student Preference for Survey Incentive.  UC Davis Student Affairs Research & Information Tech Report.

Finally, I promise that Amazon has not compensated me in any way, say, by sending me a bunch of gift certificates or a Nikon 12-24mm DX lens or anything like that.