Posts Tagged: identity


6
Apr 09

NY Mag asks “Does Facebook Own You?”

New York Magazine leads with an interesting piece on data ownership and online social networks by Vanessa Grigoriadis.  I’ve got a quote in there, which builds on some writing I did last month.

This is part of who I am now—somebody who knows that her nursery-school tormentor wasn’t a bully without a heart. It will get logged into my profile, and that profile will become part of the “social graph,” which is a map of every known human relationship in the universe. Filling it in is Facebook’s big vision, a typically modest one for Silicon Valley. It’s too complex for a computer scientist to build. Just as our free calls to GOOG-411 helped Google build its voice-recognition technology, we are creating the graph for Facebook, and I’m not sure that we can take ourselves out once we’ve put ourselves on there. We have changed the nature of the graph by our very presence, which facilitates connections between our disparate groups of friends, who now know each other. “If you leave Facebook, you can remove data objects, like photographs, but it’s a complete impossibility that you can control all of your data,” says Fred Stutzman, a teaching fellow studying social networks at the University of North Carolina at Chapel Hill. “Facebook can’t promise it, and no one can promise it. You can’t remove yourself from the site because the site has, essentially, been shaped by you.”

Check the full article.


2
Jul 08

Data Portability

From the Los Angeles Times, a particularly chilling story about social websites and third-party data:

Jane Yang, a 30-year-old marketing coordinator, was curious the other day to see what would turn up if she searched for herself on Reunion.com, a Los Angeles-based social networking site.

Sure enough, there was her name, which didn’t bother the Oregon resident all that much. Nor was she particularly troubled that her husband’s name was included under her “Friends & Family.”

What did startle Yang was seeing the name of her 4-year-old son.

What happened?

Jeff Tinsley, Reunion.com’s chief executive, said the company recently purchased records on millions of people from a data broker. But he said the broker, which he declined to identify, was instructed not to include anyone under 18.

“We have no idea how this happened,” Tinsley said.

Buried in the terms of service and privacy policies of many sites are such third-party data collection agreements. For example, Facebook’s privacy policy states “Facebook may also collect information about you from other sources, such as newspapers, blogs, instant messaging services…” Rapleaf, an upstart third-party data vendor, promises to “find information about people on the social web, on behalf of businesses and consumers.”

Information leakages, such as the one discussed in the LAT article, provide insight into the scope of third-party data collection operations. Amassing data from public and private sources, these databases correlate identities based on facets such as names, birthdates and location. Unlike credit or background databases, there appears to be no special regulation of these archives. Perhaps that will change, the more we’re confronted with our information.
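To make the correlation step concrete, here is a minimal sketch of matching records on the three facets named above. Everything in it is an assumption for illustration: the `Record` fields, the source labels, and the sample people are made up, and real brokers presumably use fuzzier matching than an exact key.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str      # hypothetical label for where the record came from
    name: str
    birthdate: str   # ISO date string, e.g. "1978-03-14"
    location: str


def match_key(record: Record) -> tuple[str, str, str]:
    """Normalize the three facets so trivially different spellings collide."""
    name = " ".join(record.name.lower().split())
    location = record.location.strip().lower()
    return (name, record.birthdate, location)


def correlate(records: list[Record]) -> list[list[Record]]:
    """Group records by key; keep only identities seen in two or more sources."""
    groups: dict[tuple[str, str, str], list[Record]] = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [g for g in groups.values() if len({r.source for r in g}) > 1]


# Made-up sample data, not drawn from any real database.
SAMPLE = [
    Record("broker_file", "Pat  Example", "1978-03-14", "Portland, OR"),
    Record("public_profile", "pat example", "1978-03-14", "portland, or "),
    Record("public_profile", "Sam Sample", "1981-07-02", "Durham, NC"),
]

linked = correlate(SAMPLE)
```

With the sample above, `linked` holds a single group joining the broker record to the public profile. No one source had to reveal the link; it falls out of the shared facets.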


8
Feb 08

The subjective computer has found us

For the past few days, I’ve been thinking about the information products and byproducts of social computing. Products may be thought of as things we create with intent: our Facebook profile, our home page. Byproducts, by contrast, are the things we create with limited intent: our attention data, the traces we leave in server logs, the software products that appropriate our agency.

From a volume standpoint, the data byproducts we produce significantly outweigh our pure data products. Maybe we’ve got 15 profiles on social networks, but Google’s got gigs of our email, search logs, and click streams. Following Irwin Altman’s notion of privacy as boundaries, it’s easy to see how we delineate between these two data sets, even though they’re identical at the binary level: one we see, and one we don’t.

At SGFoo, I participated in a number of discussions around data byproducts and the social graph. Leveraging your explicit connections (a data product) and attention or network data (byproducts), service providers could expose all sorts of novel information to you. I tend to agree; the jumble of connections and intentions and algorithms can likely tell me all sorts of new and interesting things.

In a post danah boyd wrote a few days ago, she cautioned against where such objectively computational approaches lead us, arguing that the negative effects of such systems may outweigh the perceived gain. I tend to agree; the leaders of the social computing space possess an alarming antipathy towards privacy, especially when weighed against the benefits of derived, latent knowledge. Of course, this is the ideology of Google or competitors; in the graph, we’re all just documents with linkages, our behaviors subject to MapReduce. The privacy advocate stands in the way of progress, the natural state of industry.

Drawing back to the initial distinction I posed, the product and byproduct, I wonder if there isn’t a self-regulation implicit in the system. Perhaps norms and other cultural processes will make taboo the “reveal” implicit in surfacing computed data byproducts. It’s creepy when a computer tries to figure you out; it’s creepier when a computer tries to figure you and your friends out, and perhaps the creepiness of all of this makes leveraging such knowledge in social processes taboo. We may be able to compute it, but we may not actually want the information because the objective boundary is crossed.

In 1996 Sherry Turkle proposed that we were looking for the subjective computer, one that became a place of identity reflection and expansion. At the time, it was alarming to think of a computer to which we bared our souls. Of course, 1996 was a different time for computers: we weren’t hyperconnected, massive data stores like Google were nascent, the notion of sharing one’s real identity online was anything but pervasive. These conditions established a sense of mastery over what one was sharing; the computer could become your second self because, well, you didn’t have to worry about a creepy Facebook app sharing your deep political opinions with your friends without your knowledge.

Do we still seek the subjective computer? I’d argue that, in 2008, the subjective computer seeks us. Since Turkle wrote Life on the Screen, we’ve placed much emphasis on using objective measures to uncover subjective knowledge. Rather than the computer being the device you pour your heart out to, it has become an intelligent proxy. At the same time, there no longer exists the monolith computer; the computer is simply the networked device, routing you to the best places for disclosure and community.

In 2008, we find ourselves in a unique situation where the things we say, and the things we don’t say become central parts of our computer disclosure. It’s no longer simply about our blog post, it’s about who we’ve looked at or talked to. Our machines have frameworks for computing both the intentful and ephemeral things we disclose, our data products and byproducts.

Where does this leave us? When we reached out to the subjective computer, it was a powerful tool that one could master and appropriate for specific purposes. Social interaction, identity play – these were affordances of the device. Now computers master us, leveraging our data to fit us into modeled interactions, exercising tremendous power through selective disclosure, and offering us freedom through a participation process that is essentially repressive.

As I alluded to earlier, it is unlikely that we’ll ever become comfortable with the spaces of complete disclosure. There’s always going to be a difference between our shared and mined data, and there will always be social rules standing in the way of leveraging data a person or system has collected about another. This is not to say that the boundaries won’t be tested, or that they aren’t already stretched to frightening levels. Beacon didn’t work because we were uncomfortable with the removal of boundaries, and I’d argue that we’re going to continue to feel this way in similar situations.

It is now time to push back against the devices and networks that seek to master us. It is time to return to places where we exert control, where our data isn’t an asset, and where our mastery over the device sets us free. Horribly naive? Perhaps, but I also might be right. The arms race of analytics may fail simply because we’re not comfortable with the “reveal”. The true loss here, however, is the sense of freedom we once had when the subjective computer was our agent. As we now live in fear of the computer, we’ve lost the ability to seek freedom in it; I think one day we’ll want that back.


7
Feb 08

Facebook API Data Sharing

Via Slashdot, news that the Facebook Platform is falling under increased scrutiny for questionable privacy practices. The issue at hand is developer access to profile information as shared via the API. I’ll see if I can provide a high-level overview.

When you add a Facebook application, you allow the application developers access to your profile. Your profile information is queryable via the Facebook platform API. This means that the data in your profile is passed to application developers via structured methods. An example of such a method is Users.getinfo. If you’ve added an application, the developer can make a Users.getinfo call with your Facebook ID. In response to that call Facebook sends the developer the information from your profile – your name, networks, favorite books and movies, etc. Other calls such as photos.get and friends.get make your photos or friends lists queryable by application developers.

Just so we’re clear, Facebook sends your information only to third parties that you’ve approved (you read the terms of service, right?). It is as if the third party were able to view and save your profile, photos or friends lists. To prevent problems, Facebook regulates third-party behavior through its developer terms of service. The terms of service state that only certain types of your profile data are storable; if the developer possesses (i.e. downloads) data that is not explicitly storable, they agree to delete this information within 24 hours. That is, the company must, under the terms of service agreement, expunge the data that is not storable within a day of collecting it.

Notably, the storable data is very limited. You may store a user ID, or a photo ID, but you may not store a name, favorite book or picture. The only mechanism that regulates this is the terms of service agreement; if a company decides to store the data longer than 24 hours, there are no technical or DRM-type mechanisms that will enforce data destruction. The privacy equation relies only on good faith between Facebook and the third party.
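To see how little the rule asks for technically, here is a minimal sketch of what a compliant developer’s purge step might look like. It is not Facebook’s code or API: the field names, the `purge_expired` helper and the sample record are assumptions made up for illustration, and the only facts taken from the terms as described above are the storable IDs and the 24-hour window.

```python
from datetime import datetime, timedelta

# Assumption: user IDs and photo IDs are the storable items named above;
# these field names are illustrative, not taken from Facebook's documentation.
STORABLE_FIELDS = {"user_id", "photo_id"}
RETENTION = timedelta(hours=24)


def purge_expired(record: dict, fetched_at: datetime, now: datetime) -> dict:
    """Return the record as a compliant developer would hold it at `now`."""
    if now - fetched_at < RETENTION:
        return dict(record)  # inside the window, everything may still be held
    # After 24 hours only the explicitly storable fields survive.
    return {k: v for k, v in record.items() if k in STORABLE_FIELDS}


# Made-up profile data for a hypothetical user.
fetched = datetime(2008, 2, 7, 12, 0)
profile = {"user_id": 123, "name": "Pat Example", "favorite_book": "Some Novel"}

after_one_hour = purge_expired(profile, fetched, fetched + timedelta(hours=1))
after_one_day = purge_expired(profile, fetched, fetched + timedelta(hours=24))
```

The point of the sketch is what it lacks: nothing outside the developer’s own code forces `purge_expired` to run, which is exactly the good-faith gap described above.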

Facebook has relied on this storage agreement since the beginning of the API. We’re hearing of it today because of a recent study that found that Facebook applications don’t need as much information as they’re being given. There are clearly larger questions, especially when one considers the scale of Facebook applications. The largest applications have over 2 million daily users. They almost certainly have install bases in the tens of millions. This means that theoretically, tens of millions of profiles could have been downloaded and stored, in violation of the terms of service.

What are the incentives for storing profile information? As a researcher, I can think of hundreds of reasons. Using a small set of 100,000 profiles from across the US (a small application), one could build a valuable marketing database. Even if personally identifiable data was removed from the set, I’d still be able to get great value from the set using probabilistic techniques.

The reality? Likely, most of the applications you’ve added haven’t stored your profile data in violation of the terms of service. Certainly, an app storing your data couldn’t do anything above-board with it (Facebook would quickly and successfully sue). But in reality? With backup tapes, less-than-ethical application developers, or even those who just fail to read the terms of service – yes, it’s likely that some data is stored somewhere. Just as your profile is probably in a browser cache somewhere, it’s likely an app or two has stored your info. Will it be used against you? Will you become part of a black-market database? Who knows.

Now that people are taking a look at the privacy assumptions of the Facebook platform, perhaps it’s time to start a dialogue around how to solve the problems of SNS APIs. OAuth is one heckuva step forward. However, with the power application developers exert in the Facebook ecosystem, I won’t hold my breath that the all-you-can-eat data stream is going to be turned off any time soon.


3
Jan 08

News Round-up

Happy New Year to all – I’ve had a nice break and it’s good to be back to work/writing/etc. For my first post of the new year I thought I might share a few stories that have come my way.

Finally, I’ve talked to a few people today about the impersonation of Bilawal Bhutto Zardari. Reporters have long been turning to social networks for news stories, often for supporting or illustrative information presented as fact, simply because a profile looks real. This is a journalistic gray area, and I enjoyed the opinion of Barbara Friedman and Meredith Golden in Sunday’s N&O.

Is a social network profile fact because it is a nexus of activity? If it appears real, is it real? This is a core problem with online identity and our more offline notions of fixity. Is a profile about me even if I didn’t create it (but say, I’m mourned there?) And what are the editorial standards in reporting content from SNS, where identity is ambiguous at best? I’m very surprised that the Bhutto story made it by an editor; to me, that is a journalistic failure and should not be explained away as a “ruse.”


5
Dec 07

Facebook’s New World

This afternoon, Facebook issued a mea culpa and reversed its position on Beacon, making it an opt-in system and adding a global opt out. While this reversal does not address user tracking on third-party sites, it is a positive step for privacy and likely front-page news tomorrow.

The Beacon controversy has been particularly interesting to me because, at its heart, it wasn’t about the users. By comparison, let’s consider the previous major controversy, Newsfeed. Newsfeed was implemented without a soft rollout, without much notice, and most importantly, without privacy considerations. Facebook argued that their position was sound because nothing quantitatively changed with regards to privacy; they failed to realize that privacy is both qualitative and quantitative. Newsfeed made Facebook, and Facebook friendship, feel very different to us, and users reacted en masse. As I’ve documented in the case study, blog coverage contributed to the uproar, but the large feedback vector was a group called “Students Against Facebook News Feed.” At the group’s maximum size, it had over 750,000 users, somewhere around 8% of Facebook’s entire userbase.

This user revolt was quickly addressed by Facebook, leading to a more-or-less agreeable conclusion within a few days. At the time, Facebook was not open-to-all, and while it was covered in the press, it certainly wasn’t the SNS on the tip of everyone’s tongue. In the Newsfeed fiasco, Facebook’s constituency was the agent of change; the event largely registered with Facebook users and watchers, but not the general public.

Fast forward a year and two months, and Facebook again finds itself rolling out a feature with questionable privacy assumptions, Beacon. Beacon is a little different from Newsfeed, though. While Newsfeed was in your face, forcing you to confront privacy issues, Beacon is subtle – to the point that many Facebook users don’t even know it exists. Why? Well, first, you have to use a Beacon-enabled site to encounter Beacon, and second, it’s quite hard to notice Beacon ads as anything special or different from all the application spam in your Newsfeed. I’d argue that a majority of Facebook users don’t know about Beacon, just as most car owners don’t know what their car’s ECM does. It’s not a value judgment, just a reality of technical systems.

In creating Beacon, a product that would clearly fly beneath the radar of a majority of users, Facebook assumed that it could use its bully pulpit to address serious privacy changes. “Use it or leave”, etc. We see these assumptions enacted in the opt-out nature of the system, with no global exclusions. And if the users weren’t even going to really understand the changes, they couldn’t revolt, right? To a certain extent, Facebook’s users didn’t revolt. There wasn’t a zero-day event like “Students Against Facebook Newsfeed.” There wasn’t viral opposition from users, mass defections, or any other major user-generated protest that appeared on my radar.

Where Facebook tripped up was forgetting that they’re no longer just accountable to their users. Over the past fourteen months, Facebook has morphed from a college students’ website to (in the eyes of the media) a competitor to Google or Microsoft. And while I think that even internally at Facebook they don’t buy that hype, the press and the social/technical blogosphere have made the company item one on their watchlist. In self-fulfilling their 15B prophecy and promoting their ability to change the media, Facebook invited the criticism and scrutiny that comes with such a lofty place. The media spectacle that has been Facebook’s last few months works both ways, it seems.

In arguing that the user wasn’t the agent of change, my main piece of evidence is the MoveOn campaign. Reaction to MoveOn’s opportunistic petition drive was paltry; at the time of writing, only 70,000 Facebook users (.14% of FB) have joined MoveOn’s group. What MoveOn lacked in response, however, they more than made up for with media savvy. Through my work with techPresident (and seeing reposts around the web), I was able to see many of the messages that MoveOn sent to its vast contact network of reporters and media influencers. Each message was full of information, comparable coverage, easy-to-soundbite narratives (e.g. Facebook ruined Christmas) and opportunities to interview complainants, etc. MoveOn pushed the issue hard, pounced on new developments, and kept this story alive and in the media.

Couple of caveats. First and foremost, this isn’t all MoveOn’s doing. This was a legitimate story, and many covered it as such. Second, once a story like this gets going, it picks up a life of its own. I will say, however, were it not for the MoveOn campaign, we wouldn’t be where we are right now. Their media strategy simply ran an end-around Facebook’s proposed plan of action in dealing with angry users or A-list bloggers. They were blindsided by all of the media coverage.

As Facebook apologizes for Beacon, one can sense their disappointment in not being able to push Beacon the way they intended. Their product was crafted to take advantage of unsuspecting users, and to that extent they pulled their strategy off pretty well. Perhaps this was their major error; rather than dealing with angry users, they were forced to deal with the media. Through their own machinations, Facebook no longer exists in a world where it can bully users without consequence; as they attempt to keep up appearances of a major company, they will be forced to adopt a front of responsibility. The media is now Facebook’s watchdog, and because of that, Facebook’s in a very new world.


30
Nov 07

Facebook rethinks Beacon

As reported in various blog and print sources, Facebook has announced changes to Beacon, the controversial ad program. According to the reports, there will be a change to the story posting flow, requiring users to approve a story before it is sent to the Newsfeed. This does address some of the concerns regarding information leaks through Beacon.

In a nutshell, when a user on a third-party site sets off a Beacon action, they will be presented with a popup. If the user does nothing, the story will be sent to a queue, rather than to Facebook. The next time the user sets off a Beacon action, they will be presented with a list of stories to send to Facebook, and can select or reject stories as they deem appropriate. Facebook will also make the posting flow clearer, promising prominent notifications when one logs in and is presented with stories to approve.
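As I read the reports, the revised flow amounts to a per-user approval queue. The sketch below is a rough model of that reading, not Facebook’s implementation: the `BeaconQueue` class, its method names and the sample stories are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BeaconQueue:
    """Hypothetical model of the revised posting flow for a single user."""
    pending: list[str] = field(default_factory=list)
    published: list[str] = field(default_factory=list)

    def trigger(self, story: str) -> list[str]:
        """A Beacon action fires: queue the story, return what the popup lists."""
        self.pending.append(story)
        return list(self.pending)

    def review(self, approved: set[str]) -> None:
        """The user acts on the popup: approved stories post, the rest are dropped."""
        self.published.extend(s for s in self.pending if s in approved)
        self.pending.clear()


queue = BeaconQueue()
queue.trigger("rented a movie")          # popup ignored, story waits in the queue
shown = queue.trigger("bought tickets")  # next action lists both stories
queue.review({"bought tickets"})         # user approves one, rejects the other
```

Nothing reaches `published` until `review` is called, which is the change from the original flow. The queue itself still lives on Facebook’s side, so this does nothing about the tracking that fills it.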

Notably, there is no mention of a global opt-out, which I believe is a mistake. One of the critical problems with Beacon is that it breaks boundaries of privacy between sites, and Facebook provides no apparatus for restoring the privacy. As a result, cookie-based pageview tracking will also continue to occur.

While the response to MoveOn’s call has been tepid (50,000 signees), the response to Facebook Beacon is still coming. Beacon isn’t evenly distributed around the web; one may not use Fandango or Epicurious or read Techcrunch, meaning there are a lot of Facebook users out there still waiting to step on these Beacon privacy landmines. This is a distinctly different situation from Newsfeed, which was extremely direct. This story will evolve; it will be more of a rolling problem.

In other quick news, tomorrow’s Virtual Citizenship and New Technologies Symposium will be broadcast into Second Life. My talk is at 9:30AM (Eastern) if you’re interested, but I’d really recommend checking out the talks of my very esteemed fellow presenters. If the excellent conversation we had at dinner is any indication of what to expect tomorrow, it will be worth your while. Full instructions for the Second Life simulcast are on the Symposium website.