February, 2008


7
Feb 08

Facebook API Data Sharing

Via Slashdot, news that the Facebook Platform is falling under increased scrutiny for questionable privacy practices. The issue at hand is developer access to profile information as shared via the API. I’ll see if I can provide a high-level overview.

When you add a Facebook application, you allow the application developers access to your profile. Your profile information is queryable via the Facebook Platform API. This means that the data in your profile is passed to application developers via structured methods. An example of such a method is users.getInfo. If you’ve added an application, the developer can make a users.getInfo call with your Facebook ID. In response to that call, Facebook sends the developer the information from your profile – your name, networks, favorite books and movies, etc. Other calls such as photos.get and friends.get make your photos or friends lists queryable by application developers.
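To make that flow concrete, here is a minimal Python sketch. It is a toy in-memory model, not Facebook’s actual API: the `ToyPlatform` class, its method names, the app ID and the sample profile are all invented for illustration. Only the basic idea comes from the description above: once you’ve added an application, it can query your profile by user ID.

```python
class ToyPlatform:
    """Toy stand-in for a social platform API (hypothetical, not Facebook's real interface)."""

    def __init__(self):
        self.profiles = {}    # user_id -> profile dict
        self.installed = {}   # app_id -> set of user_ids who added the app

    def add_user(self, user_id, profile):
        self.profiles[user_id] = dict(profile)

    def add_application(self, user_id, app_id):
        # Adding an app is what grants its developer access to the profile.
        self.installed.setdefault(app_id, set()).add(user_id)

    def users_get_info(self, app_id, user_id, fields):
        # Only users who have added the app are queryable by that app.
        if user_id not in self.installed.get(app_id, set()):
            raise PermissionError("user has not added this application")
        profile = self.profiles[user_id]
        return {f: profile[f] for f in fields if f in profile}


platform = ToyPlatform()
platform.add_user(1001, {"name": "Alice Example", "books": ["Dune"], "movies": ["Brazil"]})
platform.add_application(1001, app_id="quiz_app")

# The developer of "quiz_app" asks for two profile fields by user ID.
info = platform.users_get_info("quiz_app", 1001, ["name", "books"])
print(info)
```

Note that the only gate in this sketch is whether the user added the app; nothing limits which fields the app asks for.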

Just so we’re clear, Facebook sends your information only to third parties that you’ve approved (you read the terms of service, right?). It is as if the third party were able to view and save your profile, photos or friends lists. To prevent problems, Facebook regulates third-party behavior through its developer terms of service. The terms of service state that only certain types of your profile data are storable; if the developer possesses (i.e. downloads) data that is not explicitly storable, they agree to delete this information within 24 hours. That is, the company must, under the terms of service agreement, expunge the data that is not storable within a day of collecting it.

Notably, the storable data is very limited. You may store a user ID, or a photo ID, but you may not store a name, favorite book or picture. The only mechanism that regulates this is the terms of service agreement; if a company decides to store the data longer than 24 hours, there are no technical or DRM-type mechanisms that will enforce data destruction. The privacy equation relies only on good faith between Facebook and the third party.
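For a sense of what honoring the 24-hour rule might look like on the developer’s side, here is a rough Python sketch. The field names, the `STORABLE_FIELDS` set and the `expunge_expired` helper are assumptions made up for this example, not anything published by Facebook. The important part is that a cleanup like this runs only if the developer chooses to write and schedule it.

```python
from datetime import datetime, timedelta

# Assumption for illustration: only ID-like fields count as storable.
STORABLE_FIELDS = {"user_id", "photo_id"}
MAX_AGE = timedelta(hours=24)


def expunge_expired(records, now):
    """Strip non-storable fields from any record fetched more than 24 hours ago."""
    cleaned = []
    for record in records:
        if now - record["fetched_at"] > MAX_AGE:
            # Keep only storable fields plus the bookkeeping timestamp.
            record = {k: v for k, v in record.items()
                      if k in STORABLE_FIELDS or k == "fetched_at"}
        cleaned.append(record)
    return cleaned


now = datetime(2008, 2, 7, 12, 0)
records = [
    {"user_id": 1001, "name": "Alice Example", "fetched_at": now - timedelta(hours=30)},
    {"user_id": 1002, "name": "Bob Example", "fetched_at": now - timedelta(hours=2)},
]
cleaned = expunge_expired(records, now)
print(cleaned)
```

In this run the 30-hour-old record loses its name while the 2-hour-old record keeps it. A careful developer would schedule the job often enough that nothing non-storable ever reaches the 24-hour mark.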

Facebook has relied on this storage agreement since the beginning of the API. We’re hearing of it today because a recent study found that Facebook applications don’t need as much information as they’re being given. There are clearly larger questions, especially when one considers the scale of Facebook applications. The largest applications have over 2 million daily users. They almost certainly have install bases in the tens of millions. This means that theoretically, tens of millions of profiles could have been downloaded and stored, in violation of the terms of service.

What are the incentives for storing profile information? As a researcher, I can think of hundreds of reasons. Using a small set of 100,000 profiles from across the US (a small application), one could build a valuable marketing database. Even if personally identifiable data were removed from the set, I’d still be able to get great value from it using probabilistic techniques.

The reality? Likely, most of the applications you’ve added haven’t stored your profile data in violation of the terms of service. Certainly, an app storing your data couldn’t do anything above-board with it (Facebook would quickly and successfully sue). But in reality? With backup tapes, less-than-ethical application developers, or even those who just fail to read the terms of service – yes, it’s likely that some data is stored somewhere. Just as your profile is probably in a browser cache somewhere, it’s likely an app or two has stored your info. Will it be used against you? Will you become part of a black-market database? Who knows.

Now that people are taking a look at the privacy assumptions of the Facebook platform, perhaps it’s time to start a dialogue around how to solve the problems of SNS APIs. OAuth is one heckuva step forward. However, with the power application developers exert in the Facebook ecosystem, I won’t hold my breath that the all-you-can-eat data stream is going to be turned off any time soon.


7
Feb 08

Major steps forward for OpenID

There’s big news from the OpenID Foundation today: Google, IBM, Microsoft, VeriSign, and Yahoo! have joined the foundation’s board. This is obviously a major step forward for OpenID, but it’s also good for the entire open identity movement; the major players are seeing the value in consumer choice and control. At ClaimID, we’ve been advancing these themes since 2005, so it’s especially rewarding to see this news. From the OpenID Foundation announcement:

By bringing on these companies and their resources, the OpenID Foundation will now be able to better serve the needs of the entire OpenID community. In 2008, we can expect to see a larger focus on making OpenID even more accessible to a mainstream audience, the development of a World-wide trademark usage policy (much like the Jabber Foundation and Mozilla have done), and a larger international focus on working with the OpenID communities in Asia and Europe. Awesome!

Congratulations go out to OpenID Foundation chairman Scott Kveton, and all others involved in the foundation who’ve worked on this initiative. Scott’s blogged the coverage of the announcement if you’d like some more insight. Again, congrats to the OpenID Foundation for this huge achievement – today is a very big day for OpenID and open identity work.

Cross-posted to the ClaimID blog.


6
Feb 08

The Future of Social Software

Last weekend, I spent a few days at O’Reilly HQ for the Social Graph Foo Camp. This was a very interesting experience; I was challenged as both a researcher and practitioner. What I saw made me very hopeful – people agreeing on methods and protocols, solving real problems. Realistically speaking, a camp like SGFoo (or IIW) pushes this work ahead 6 months in the span of just a few days. It’s hard to overstate the power of connections, conversations, late nights and lots of coffee and Red Bull.

As it happens, before I went to SGFoo I’d been reading a bunch of stuff on qualitative research methods. Methods books, cases, studies… my brain was very keyed-in to a type of observation that is almost annoyingly analytical. It was hard to shake this perspective as I participated in discussions this weekend. It’s certainly informed some of the thought I’ll share today.

Watching the discussions last weekend was a little like watching the future of social software unfold in realtime. Granted, market leaders will continue to be the vanguard of the movement, but the pathways and patterns these companies will use were the crux of the discussion at SGF. There were a number of advocates for the human perspective and user studies, but the real emphasis was on fast development, prototyping, and seeing what works in the wild. This particular approach has been the hallmark of Web 2.0 development strategies, and I doubt we’re going back any time soon.

Yesterday, danah boyd wrote an interesting piece entitled “just because we can, doesn’t mean we should.” In it, boyd challenges the assumptions of privacy and audience that go into the design of social software, arguing that the desire to live publicly is a notion of privilege, available to a select few. It’s hard to disagree. The ideologies that inform Beacon or the initial News Feed are hardly mass-market, and there are countless other exemplars out there.

As a relative outsider to the Valley scene, I found myself being challenged by the assumptions of these new technologies. For a simple example, consider a portable friends list. The idea of a portable friends list is when you sign on to a new service, you can upload or authorize your friends list, and find all of your friends who use that service. Theoretically, this vast, barren new space becomes a rich, social space with the click of a button.
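At its simplest, the matching step behind a portable friends list is a set intersection. Here is a minimal Python sketch, assuming friends are identified by email address; that identifier, the `find_friends_on_service` name and the sample data are assumptions for illustration, and real proposals differ in how they identify and authorize people.

```python
def find_friends_on_service(imported_friends, service_members):
    """Return imported contacts who already have accounts on the new service."""
    return sorted(set(imported_friends) & set(service_members))


# Friends list carried over from another service (hypothetical data).
imported = ["alice@example.com", "bob@example.com", "carol@example.com"]
# Existing members of the new service (hypothetical data).
members = ["bob@example.com", "dave@example.com", "carol@example.com"]

matches = find_friends_on_service(imported, members)
print(matches)  # ['bob@example.com', 'carol@example.com']
```

The code is trivial; the design question is what the service does with `matches` on day one.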

Stepping back for a second, let’s consider the assumptions of this technology. As we’ve seen with Facebook, our networks grow to be very large, a collection of “friends” of varying tie strengths and varying contexts (work, school, family, etc). Furthermore, the process of joining a new social community is one of boundary negotiation and sense-making. That is, you’ve got to learn to crawl before you walk; norms and acceptable behaviors are negotiated over time. When someone signs on to Twitter, friends everyone, and then dumps all their RSS feeds into Twitter, you cringe. They haven’t figured out the norms. Now imagine that, every time you sign on to a new service, you’re forced to learn the norms in realtime, in front of an audience of hundreds of your friends.

The problem is that these assumptions actually aren’t problems in Silicon Valley. If your day job is to design social software, it’s likely you’ve internalized the rules of community; you’re a native. Even if you didn’t know Twitter, you’d figure that dumping your RSS streams into Twitter would be bad form, unless you saw everyone else doing it. The social software power user can easily move between sites; she is also incentivized to discover and master new communities.

With regard to friend networks in the Valley, there’s incredible density in work-friend networks, and likely even family networks. In the Valley, you want to be friends with coworkers, competitors, famous-types; your network is a proxy of your stature. Finding everyone you know on a site is a means to a primarily economic, secondarily social end. Of course, this is hardly a Valley-only phenomenon, but the difference is these assumptions are being written into software for all of us.

This post shouldn’t be taken as an attack on technology or the work anyone is doing; it is good work and it will go forward. Rather, this post should challenge the implementer to look critically upon the assumptions that go into the technology being implemented. Rather than making your average user add a friend list on day one (to increase your userbase), make the addition of users a game in which the user selects the context-appropriate friends and learns the norms of the system. Think about Facebook before and after they introduced privacy controls to the News Feed; such a simple change in assumptions can vastly affect perceptions and experience.

The work showcased at SGF represents the future of mediated social interaction, even if only in the rules, pragmas and assumptions. One thing is clear: This stuff ain’t going away, and it ain’t just for Valley-types anymore. I would argue that research, testing and social thought complement Web 2.0 development models, and perhaps they offer us a way forward as this stuff goes mainstream. These are exciting times.