Thoughts


20
Apr 10

Announcing Freedom for Windows

I’m very pleased to announce that Freedom, my internet-blocking productivity software, is now available for Windows!

Over the past two years, countless people have written to me, asking if there is a version of Freedom for Windows.  I hated telling people that they couldn’t have Freedom.  I’m happy to report that if you’ve got a Windows XP, Vista, or 7 computer, you too can now experience Freedom.

Want to know a little more about Freedom?  Read about it in the New York Times Magazine, Salon.com, USA Today, Chronicle of Higher Education, LifeHacker, and others.  I’m also quite partial to the recent article on Freedom in the Guardian that starts: “With the help of a lovely man called Fred, I’m no longer in thrall to SamCam’s cape and Guido Fawkes.”

Let me know what you think!


19
Apr 10

Privacy in Social Software

Last week, I wrote a number of essays critical of Twitter’s decision to provide a collection of public Tweets to the Library of Congress for permanent archiving.  I argued that by taking user data and putting it into a public archive, Twitter had meaningfully restricted the privacy rights of users.  Some of you agreed with my position, many didn’t; but all who commented or wrote to me helped shape my thinking.  In this post, I want to provide a little more context on the nature of privacy in systems like Twitter.

Last week, I gave a talk on the dynamics of privacy in Facebook.  In the research, we modeled a behavior that is increasingly pervasive in Facebook: having a friends-only profile.  I want to draw attention to one slide from the talk:

In this slide, the two slopes you see are the growth of Facebook, and the proportion of UNC undergraduates with friends-only profiles.  Now, the data are on different axes, and Excel is fitting the lines, but the trend is meaningful.  With growth in the service we see a correlated turn towards privacy.

While the pattern I observe generalizes only to Facebook at UNC, other researchers have observed similar patterns of privacy behavior in other social software.  For example, as Friendster scaled,

[S]o too did the diversity of the social networks represented. A growing portion of participants found themselves simultaneously negotiating multiple social groups—social and professional circles, side interests, and so on.  (boyd, 2007)

With the increasing complexity of diverse audiences, individuals turned to a range of strategies to manage their privacy: multiple accounts, limiting disclosure, or simply dropping out of the service.  Regarding Myspace, Caverlee and Webb (2008) reported (bold is mine):

Overall, the fraction of private profiles is increasing with time, indicating that new adopters of social networks may be more attuned to the inherent privacy risks of adopting a public Web presence. We find that women favor private profiles 2-to-1 over men, and that (perhaps, counter-intuitively) younger users are more likely to adopt a private profile than older users. We also find that the more connected a user is in the social network, the more likely she is to adopt a private profile.

And now in Facebook, our research finds a similar movement towards privacy as the service grows and networks diversify.  One can only suspect that Facebook’s recent “privacy upgrades” and changes to the terms of service prohibiting privacy of certain information have something to do with this normative shift.

Looking at the data across systems, I’d like to speculate that there’s a general property at work.  In a social software system, as the system grows and diversity of networks increases, so does utilization of privacy.  Here’s a graph I’ve constructed illustrating the trend (larger version):

The slope is purposefully convex. In the early stages of adoption, network use is sparse, so individuals are incentivized to lower privacy, to increase the odds of finding others. As time passes and the service grows, individuals form dense, small-world clusters. At this stage, individuals are mainly connected to one another within one context, and there are minimal bridges between contexts. Therefore, individuals can afford to keep privacy low, due to minimal risk of inadvertent sharing across contexts. As the system expands, however, we see a turn back towards privacy as an increasing number of bridges across contexts are created. In this moment of context collapse, individuals erect barriers of privacy to facilitate continued disclosure.  Here’s a closer look at the (simulated) networks:

By linking privacy to context collapse, I argue that mobilization towards privacy is largely a function of perceived audiences (and harms).  This distinction is important because it holds privacy attitudes constant.  Research, both mine and by others, has demonstrated that privacy attitudes do not necessarily predict privacy behaviors.  Awareness of privacy-in-context is actually the key variable causing the dynamic shift towards privacy in social software systems.
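To make the stages above a little more concrete, here is a minimal Python sketch of how context collapse could be measured in a toy network.  It is not the simulation behind my graphs; the names (`audience_contexts`, `collapse_share`), the six made-up users, and the two contexts are all assumptions for illustration.  The idea is simply to count the share of users whose ties reach into more than one context, before and after a couple of cross-context bridges appear.

```python
from collections import defaultdict


def audience_contexts(edges, context_of):
    """Map each user to the set of contexts found among their ties.

    edges: iterable of (user_a, user_b) pairs (undirected ties).
    context_of: dict mapping each user to a home context label.
    Both structures are hypothetical, chosen only for this sketch.
    """
    seen = defaultdict(set)
    for a, b in edges:
        seen[a].add(context_of[b])
        seen[b].add(context_of[a])
    return dict(seen)


def collapse_share(edges, context_of):
    """Fraction of users whose ties come from more than one context."""
    users = list(context_of)
    if not users:
        return 0.0
    audiences = audience_contexts(edges, context_of)
    mixed = sum(1 for u in users if len(audiences.get(u, set())) > 1)
    return mixed / len(users)


# Toy data: two tight clusters, one per context.
context_of = {
    "ann": "school", "bo": "school", "cy": "school",
    "di": "work", "ed": "work", "flo": "work",
}
small_world = [
    ("ann", "bo"), ("bo", "cy"), ("ann", "cy"),
    ("di", "ed"), ("ed", "flo"), ("di", "flo"),
]
# The same network after two bridges form between the contexts.
grown = small_world + [("cy", "di"), ("ann", "ed")]

print(collapse_share(small_world, context_of))
print(collapse_share(grown, context_of))
```

In the small-world stage nobody has a mixed audience, so the share is zero.  Adding just two bridges gives four of the six users an audience that spans both contexts, which is the kind of jump in perceived audience that I argue drives the turn towards privacy.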

Let’s return our attention to Twitter.  What does your Twitter network look like?  If you’re an average user, your network probably contains a few offline friends (many, many fewer than Facebook or Myspace) and some celebrities (your definition may vary).  There may also be a few friends you’ve made on Twitter, who you don’t know offline.  Chances are, the average Twitter user’s network looks like the sparse “Early Adopter” or “Small World” network.

We see evidence in cultural practice that users have sparse networks in Twitter.  Going back to my notes on Alice Marwick’s AOIR ’09 talk, the culture of celebrity serves a very functional purpose for Twitterers with sparse networks, who wish to connect out of limiting contexts.  “Talking” to celebrities (and finding others who talk to the celebs you talk to) is a way of escaping one’s sparse world, finding new people to follow in a known context.  Hashtag culture provides further evidence that individuals are trying to talk “across” or “out” of limited contexts.  If your network is sparse, turning to site-level anchors like hashtags and celebs provides a reliable stream of conversation in networks where conversation is lacking due to structural impediments.

I wonder how long these practices will need to continue.  Just the other day, Twitter announced that 100 million people had created accounts.  You can’t turn the news on without hearing about Twitter.  A large group of people, primed on social software by Facebook, are waiting to join Twitter.  And over the next year or two, they will, raising issues of context collapse, and prompting a turn toward increased privacy among early adopters.

My major problem with the Twitter/LoC agreement is that the people who will be confronted with context collapse and a growing need for privacy have lost meaningful recourse.  As I argued in my last post, it becomes impossible to take back what you’ve shared, a real and useful privacy strategy.  You’ll still be able to make your account private, but it seems there’s little you can do about the Tweets you sent that were archived permanently in the Library of Congress.

Why is this bad?  Let’s consider a hypothetical.  In 2007, Myspace had 100 million users.  Myspace was growing fast, with many users signing on for the first time.  Myspace users had two options for privacy: public or friends-only.  And a lot more people had public profiles in 2007 than they do today.  How would we feel, now, if Myspace had given all of its public profiles to the Library of Congress for permanent archive in 2007? I can only guess that a bunch of people who had public profiles in 2007 might feel a little uncomfortable about it (cue the “it’s their own damn fault” chorus).

I guess I should feel relief that if Twitter is going to do this to users, at least they are partnering with the LoC (an admirable entity).  But, in reading what LoC staff is saying about this effort, I’m not comforted.  Of the dataset, LoC Blogger Matt Raymond writes “I’m certain we’ll learn things that none of us now can even possibly conceive.” National Archivist David Ferriero writes “What will historians be able to glean from our tweets?  We can’t be sure, but it will probably be very interesting” (while also stating “Twitter is not for everyone. If you are anything like me, you don’t really care what someone had for breakfast.”)  It strikes me that the Twitter archive is being treated like a novelty, promising to be an amazing treasure trove when new research methods are developed.

Maybe it’s all these years of running t-tests (developed 1908), but I’m skeptical that these Tweets are going to tell us something that we can’t quite imagine.  Robust methods develop slowly, and are validated over time.  We’ll probably still be doing text mining, linguistic and sentiment analysis, and content analysis 50 years from now.  One area that is improving rapidly, however, is the identification of individuals in large data sets.  The Netflix dataset was identified by Narayanan and Shmatikov.  Acquisti and Gross demonstrated they were able to guess people’s social security numbers from public data.  And old-fashioned detective work by Michael Zimmer identified the T3 Facebook dataset.  Of the future, we know this: It will be easier to connect you to your archived Twitter identity.

So here’s the thing.  Why won’t Twitter make the archiving a simple, opt-in process?  Or at least allow people to opt out?  Twitter obviously knows that giving user data to a permanent archive is different from sharing an API or allowing a Google spider – they wouldn’t have approached the LoC if this wasn’t the case.  I may be the only voice shouting about this, but this is a big, watershed moment regarding user privacy.  EFF, EPIC, Facebook watchdogs – where are you?  Let’s work with Twitter and make this right.


16
Apr 10

Is it time to cancel your Twitter account?

I was pleased to see that my last post on Twitter and the LoC generated excellent discussion both here in the comments and over in Twitter.   I’ve seen some great defenses of the deal, but unfortunately I’m not buyin’ quite yet.  I thought I’d use this post to quickly raise a few more questions and concerns.

First, a quick review of some of the conversation about the deal.  Zimmer is all over it, raising a number of great open questions, and exploring how private tweets just might end up in the LoC’s archive.  The Atlantic has rounded up opinions, particularly an interesting conversation going on at The Big Money.  Also notable is a BBC interview with Twitter’s general counsel, though it skips over privacy issues.  Now that I think of it, skipping over privacy issues might be the theme of this essay.

One of the central problems with this deal is the set of assumptions around public Tweets.  Particularly, because the Tweets are “already public”, individuals lose all rights to the content.  In my last post, I explored some ways in which content shared in public actually wasn’t public content.  For example, practically obscure public content that is meant for a select audience.  In this post, I want to challenge another assumption that people make about public content: that it lives forever.

If there’s one thing that social media has taught us, it is that if you post anything to the web, it stays there forever.  Of course, this is empirically false.  Companies go out of business, databases corrupt, servers crash, indexes get expunged, identifiers get mixed up, and even with the best intentions and good backups, data are lost.  Think about the Google search results for your name.  Are they the same as they were 1, 3, or 5 years ago?  While it is likely that you could tell me tons about new results that have come online over that time period, could you tell me about the ones that have gone offline?

So let’s just take a second and put the assumption that the internet is a giant cache to bed.  The internet is dynamic, fragile, and designed to lose things.  The internet has probably forgotten more about you than it remembers.  The next question generally brought up is “What about Google!”  If you want an answer to that question, send out a Tweet and then delete it.  Wait a few days and search for it.  The Tweet is gone, because Google isn’t in the business of sending you to 404s.  Thank the market for that one.  After we knock down the Google straw man, the next assumption generally covers the suspicious “other” person who is stalking you and creating a giant portfolio of everything you do.  I hate to pop everyone’s bubble, but unless you’re a really, really significant public figure, this person doesn’t exist for you.

So why is it that we all assume that the content we share publicly will be around forever?  I think this is a classic case of selection on the dependent variable.  When we Google ourselves, we are confronted with what’s there as opposed to what’s not there.  The stuff that goes away gets forgotten, and we concentrate on things that we see or remember (like a persistent page about us that we don’t like).  In reality, our online identities decay, decay being a stochastic process.  The internet is actually quite bad at remembering.

The Library of Congress, on the other hand, is quite good at remembering.  Magnificently good at it, most likely the best in the world.  And that is what’s troubling.  Up until Twitter sent its archives over to the Library of Congress, Twitter users could realistically expect they could make things go away.  They could delete Tweets.  They could change their account name.  They could remove their account.  Without consulting their users, privacy advocates, rights organizations, or any other voices of reason, Twitter has summarily taken these very real privacy remedies away from their users.

This gets me to what is so frustrating about Twitter’s move: a frighteningly cavalier attitude towards shipping around the data of tens of millions of consumers.  Twitter has literally passed the personal information of millions of users to a permanent, public archive without so much as pre-notification, consultation, or the opportunity for debate.  And even though it appears legal for the LoC to have the data, big questions remain regarding whether Twitter has actually violated its own contract with users.  How can I meaningfully own my content after it has been shipped to a government archive?

In all my years of using Twitter, the idea of canceling my account has never even vaguely crossed my mind.  Until last Wednesday, that is.

Update: American Prospect has a great interview with Martha Anderson of the Library of Congress.  Regarding the deal:

The agreement has been signed, but we still have a lot of technical details to work out — how we’ll technically transfer it, and when.

Regarding opt-out:

You know, I don’t know. I think that’s a question for Twitter. There’s several questions about that which they are still working out. We asked them to deal with the users; the library doesn’t want to mediate that.

Regarding user information:

I think that’s one of the big issues for us to understand in terms of privacy. And there’s a lot of work going on, especially over at [the National Institutes of Health] about how to anonymize data and still make it useful. We’re really big on partnering with people to learn what they’re learning, so I think that’s an area we’ll look into. In serving it, what can we do to make it useful to research but not identify personal information?


14
Apr 10

Twitter and the Library of Congress

I’m currently at the CHI conference, which is commanding all of my attention, but the news about Twitter and the Library of Congress is too big to ignore (see also Zimmer, RWW).  Quoting the LoC:

Have you ever sent out a “tweet” on the popular Twitter social media service? Congratulations: Your 140 characters or less will now be housed in the Library of Congress.

According to Biz Stone, Twitter will begin transferring all of their public tweets, after a six-month embargo, to a permanent, public archive at the Library of Congress.  Let me say something (probably) unpopular: I’m a little horrified.

If you talk to people about things shared online, you generally run into two assumptions.  The first is that things shared publicly are meant for the general public.  The second is that things shared publicly are meant for posterity.  Both of these assumptions are dangerous.  Some of my recent work has identified that people do share privately in public, and that individuals do engage in the grooming (i.e. removal) of content shared publicly.  danah’s found this.  So have lots of others.  If there’s anything we should know by now about social media, it is that a deterministic, one-size-fits-all approach to privacy is a bad approach to privacy.

This is what makes Twitter’s “gift” troubling.  It assumes that all content shared publicly is truly public and for posterity.  Let’s consider some edge cases.  Bob has two Twitter accounts, one for work and a personal account.  Both are public, but the only way people find out about his personal account is that he tells people the obscure handle.  Bob wants to be practically obscure – private in public – without going to all the trouble of setting up complicated privacy controls.  So what happens, two years from now, when Bob accidentally discloses his handle in the wrong context, and he needs to remove some Tweets?

There’s probably a certain class of reader that looks at Bob and says, well, Bob’s out of luck.  There’s Google cache and third party tools and a whole host of other ways tweets are preserved.  The difference I’d argue is that these tools have certain properties – they react to API calls, they decay, etc. – that make them qualitatively different from a professionally managed archive.  Through the creation of a permanent, public, third-party archive, Twitter changes the privacy-management strategies that are going to be available to users in the future.  This is critical, because if Bob can’t trust his down-the-road privacy management strategy, Bob might share less today.

This is a great opportunity to plug the work of Helen Nissenbaum, whose most recent book Privacy in Context extends the argument for privacy as contextual integrity.  Nissenbaum argues that disclosures have contextual expectations, and that shifting these expectations constitutes a meaningful violation of privacy and freedom.  Even though the tweets are public, it is a fallacy to assume that digital content shared in public was created with an understanding that the content would end up in a third-party, government-managed archive.  Facebook’s helped us demonstrate again and again that privacy is both qualitative and quantitative.

Practically, there are some questions that Twitter needs to address about this move.  First, Twitter’s terms of service specify that:

You retain your rights to any Content you submit, post or display on or through the Services. By submitting, posting or displaying Content on or through the Services, you grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).

The way I read this is that as long as your content is on Twitter, Twitter can do what they want with it.  Fine.  But what if you remove your content from Twitter?  Wouldn’t Twitter’s licensing of your content to the LoC also expire?  Twitter needs to address exactly how we can pull our content out of the archive when we want.  Michael Zimmer thinks that Twitter users won’t have the ability to remove tweets from LoC, so how will Twitter rectify this in the terms of service?

A broader question is why Twitter didn’t just build this as an opt-in service.  Or even, less preferably, an opt-out service.  Is the collection so important that it is worth compromising user privacy?  I’ve got a feeling that there are certain assumptions around “public” content and the feel-good vibe of the Library of Congress that led to a lack of critical thinking about the implications of this move.  It’s time for Twitter to start sharing more information, opening up an earnest conversation about this move.


29
Mar 10

Facebook Again to Test Privacy Boundaries

I’ve been paying attention to the discussion regarding Facebook’s proposed changes to the privacy policy (so has Michael Zimmer, TechCrunch, RWW and VentureBeat).   The most controversial is a proposal for Facebook to automatically share personal information with third party websites.  The mechanics go something like this: If you’re logged in to Facebook, and you visit a third-party site that has an established relationship with Facebook, Facebook will provide the website with your General Information, which is:

“your and your friends’ names, profile pictures, gender, user IDs, connections, and any content shared using the Everyone privacy setting.”

How would this work in practice?  Let’s imagine that CNN and Facebook team up.  If you’re logged into Facebook and visit CNN, the website would be able to welcome you by your full name, display gender-relevant content, show you recommendations from the people in your network who also visit CNN, and so on.  Going a little further, if you share your interest information, CNN might be able to dynamically display stories that match your interests.

The level of disclosure proposed in this new policy is similar (or even identical) to the information disclosure required for use of a Facebook app.  The critical difference in the new policy is that while applications require an opt-in, it appears that this new process will require an opt-out.  Facebook spokesperson Barry Schnitt:

“The opt-out hasn’t been built yet. We just want people to know they’ll be able to opt out. We’ve made that commitment. There will be an opt-out right when the user gets to the site, and there will be some opt-out functionality on Facebook. But as to where the button will be or how it will look, I don’t know, because they don’t exist right now.”

In theory, there will be two opt-outs.  The first will be the hypothetical button that Schnitt talks about.  The second will be to log out of Facebook and remove the Facebook cookie.  In reality, though, if you’re a Facebook user, you can never really opt-out, because any time a Facebook friend visits a third party site Facebook will share some of your information with that site.

Although it is a good sign that Facebook has gone on record regarding privacy control, the previous comment reveals Facebook’s cavalier attitude towards privacy.  Quite literally, they’re talking about pushing identity information of 400 million people around, yet privacy is treated as an afterthought – something they’ll figure out later.  When will companies like Facebook and Google start bringing privacy teams in at the beginning of the design process, rather than at the end?

Shifting topics a little bit, I see this move as notable because it marks Facebook’s first foray into large-scale warehoused behavioral targeting.  Targeting companies like Doubleclick (owned by Google) routinely mine our travels around the web, allowing third-party consumers to generate targeted recommendations based on our habits.  Because this happens behind the scenes, we’re less likely to notice it (which doesn’t make it any less troubling).  Facebook’s move stands to confront us with behavioral targeting, and they should consider the boundary they’re confronting.  It may not seem to be a big thing to have a third party website welcome you by your first and last name, but it is a paradigm shift on the web.

TechCrunch argues that it is time to sharpen the pitchforks, in preparation for the major backlash against the service.  Let me explain why this is frustrating.  In my opinion, the role of the privacy team is to navigate the necessary tension between our freedoms to disclose and how companies can ethically and morally profit from our data.  Facebook’s failures with Beacon or Google’s failure with Buzz are not “wins” for privacy; rather, they are losses for companies, consumers, and the market.

This brings me back to what is troubling about the “sharpening pitchforks” mentality.  It doesn’t and shouldn’t have to be this way.  Compared to Doubleclick, Facebook’s move really isn’t any more troubling – if the system is implemented properly.  And if the system is implemented properly, it could be a win – for consumers, for Facebook, and for third parties.  So how can Facebook navigate this challenge?  Let’s start with research, sensible design, and a different style of rollout than the traditional ask-for-forgiveness-later approach Facebook seems to believe in.

At Facebook’s current size and scale, they can’t afford to get this wrong.  Through research, testing, and a willingness to put the customer first, Facebook could navigate the challenges of this new feature.  But make no mistake, more than anyone, they are in the bull’s-eye right now.  And if Facebook does decide to play cavalier with privacy, the mobs TechCrunch describes will be waiting.


25
Feb 10

Generations and Technology

Recently, the Pew Research Center released an interesting report about Millennials, a group described as “the American teens and twenty-somethings who are making the passage into adulthood at the start of a new millennium.”  The report is fairly wide-ranging, covering topics such as values, political and civil engagement, work and education, and technology use.  Using a random-digit-dial (RDD) sampling frame, the report provides a useful, comprehensive picture of what makes the Millennial cohort unique.

A few years ago, on recommendation from danah boyd, I read Thomas Hine’s book The Rise and Fall of the American Teenager.  It is a wonderful history of the evolution of teenage identity.  Particularly interesting was the book’s social history of the management of teens excluded from the post-Industrial workforce (e.g. the growth of high schools, greater access to college).  In Hine’s book, the teenage years are cast as something of a holding stage between adolescence and adulthood, a social limbo of sorts.

Early studies of youth, such as James Coleman’s Adolescent Society, capture society’s perception of the evolving youth cohort as exotic, needing to be managed.  To his credit, Coleman humanized youth, elaborating the social structures in which high school students developed status, affinities and values.  Regardless, the exoticism of youth is a persistent and ongoing research thread.

Recently, developmental psychologists have argued that the social “holding stage” exists up to and beyond age 25 – a period of elongated “emergent adulthood.”  The individuals in this holding stage are interesting for a number of reasons, whether it be spotting emergent social trends, identifying new technologies, or forecasting the future of professions.

This brings me to the question of technology.  When we talk about technology adoption, we often frame adoption through a generational lens.  For example:  Social networking is for young people.  Teens don’t tweet.  And my favorite, the assumption that digital natives have some sort of preternatural understanding of the inner-workings of operating systems and programming languages.

As a researcher studying adoption and use of technology, I see clear evidence that technology use varies by cohort.  Digital natives may not have superhuman computing powers, but it is likely they use some technologies differently (different tools at different levels) than other age-graded groups.  Since adoption of technologies by cohort is a value-laden concept (cf. Teens Don’t Tweet), I’ll use this post to highlight a few perspectives on what differentiates cohorts.

I’ll begin with a story.  When I was in college, the “social technology” my cohort used was instant messenger.  In our use of IM, we developed norms and expectations.  We knew what kind of communication was available, we had an idea of who would be present, and we developed codes of behavior.  The culture we created still holds today, many years after college.  And although many of us use newer technologies, the value of the context we created persists.

I would bet that if we examined our own communication patterns, we’d find that these cohort-level structures guide a good deal of our use of social technology.  There are some people we email, some people we IM, some people we Facebook, some people we Twitter, and so on.  For some of these people we’ve fit the communication tool to our need (for example, IM’ing coworkers), but for others, we fit our relationship to the tool.

The fact that we communicate with different people in different media is not new or controversial.  But I think it is informative, and can help us better understand concepts of “generations” or “cohorts” and technology use.  If social contexts provide motivation for participation in a cohort-relevant technology, then it stands that participation has lasting (generational) effects.  One reading of this could be that the late-2000s cohort will continue to use Facebook not for technical reasons, but for social reasons.  But then again, when do we ever make decisions purely for technology’s sake, other than counting pixels on the HDTV?

Sociology offers us the life course perspective as a way to understand generations.  Instead of a deterministic, age-graded approach, the life course perspective looks at the contexts, events, and technologies that shape a cohort over time.  Applied to technology, it helps us get past perceptions that technology is “for the young” (Club Penguin aside), and see instead that technology represents an intervening effect that shapes the cohort.

The life course perspective allows us to inspect generational use of technology through a lens of access, networks, and motivations.  The technologies cohorts retain represent our social goals, the options available at the time, and the degree to which the technology lets us stay networked.  As we move through life, participation in new cohorts may result in the adoption and maintenance of new technology over time.  And perhaps this illustrates our growing concern with the “context collapse” inherent in Facebook; as the technology has become popular, communication cohorts that existed in other technologies have jumped into the single bin of Facebook (my email and IM friends are almost all there, but not my Twitter friends).

When we talk about technology, the question of generations comes up frequently.  By applying a cohort or life course perspective, we can move past deterministic notions of what technologies are appropriate for what generation.   At the same time, this social and contextual understanding may explain some of the reasons that generations retain technology.

Image CC A-N-SA by Circulating


16
Feb 10

What Google Could Learn From Goffman

In the week since Google introduced Buzz, the most interesting thing about the fiasco has been watching the company.  For an organization as risk-averse and PR-aware as Google, a public failure offers insight that can’t be gleaned from watching daily operations.  As Google attempts to fix the problems and move the conversation onward, I thought I might reflect on some of the teachable elements of this event.

First, a little bit of back story.  As part of my fellowship at the School of Information and Library Science, I teach a course about social network sites.  Each week, I sit down with my students to discuss the social, legal, ethical and privacy implications of social network sites, among other things.  Potentially noteworthy is that my course doesn’t spend a lot of time on social network science – graph theory, quantitative analysis of networks, etc.  Rather, we concern ourselves with the interaction of people with social technology at large scale.

In our readings and discussions, we’re often challenged to think about how people present themselves in technology.  When you create a profile in a social network site, or share a stream of Tweets, you’re essentially creating a representation of an identity.  As we’ve seen time and time again in Facebook, we run into problems when identities collide during “context collapse” – when people from a different segment of your life view an identity you’ve constructed for your friends.

Taken one way, it could be argued that this problem of separate identities reveals some sort of fundamental character flaw: “Why aren’t you the same person to everyone?”  As Google CEO Eric Schmidt pointed out, “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place.”  It is the intersection of technology and philosophies like Schmidt’s that is causing companies like Google and Facebook to stumble again and again, creating “privacy nightmares.”

Many of the readings in my class are influenced by Erving Goffman’s theories of identity and interaction.  Goffman, the legendary Chicago-school sociologist and former ASA president, elaborates in rich detail the process of social interaction in his books The Presentation of Self in Everyday Life, Behavior in Public Places, and Interaction Ritual.  In essence, Goffman argues that identity and interaction are performative, a concept that maps very well onto social network sites.  By “creating” identities, we’re not living dual lives, but rather engaging in a well-established performance of identity that lets us share the proper “front” in context.  We act differently on LinkedIn and Facebook because these sites have contextual norms, not because we’re duplicitous.

At the beginning of each semester of my class, I tell my students that they’re going to leave with a skillset that helps them negotiate human interaction with social technology.  I’ve sat up at night, pondering the value of such a skillset.  More than anything, the Buzz fiasco has driven home the point that we need interdisciplinary information professionals who can work with teams in negotiating the social implications of their tools.  These are the students I’m working with, and I wonder how Buzz would have rolled out differently if their voices had been brought to the table.

The builders of social technologies are challenged to manage the relationship between technical affordance and what is, for lack of a better term, human inertia: the tendency for people to act like people.  As Google Buzz engineers attempted to reconfigure our notions of a social group (work/friends/romantic/etc. were collapsed to “most frequently contacted”), they ran smack into human inertia.  Even though Google’s algorithms have likely figured out a more efficient way for us to group the people we know, it was simply too much to ask us to configure ourselves to the technology.

By fabricating new social groupings, Google ran head-on into Facebook’s biggest problem – that of context collapse.  When we merge social groups together, we are challenged to manage our disclosures across these groups, which have different norms of propriety.  How is it possible that Google didn’t see the potential problems of such context collapse at scale?  I’d like to offer a potential answer.

If you read a history of Silicon Valley (such as Katie Hafner’s or Michael Hiltzik’s), you’ll notice a theme of interconnection.  Silicon Valley’s tech economy is a dense series of highly entrepreneurial networks, where employment is characterized by acceptance of failure and short tenures.  The work of AnnaLee Saxenian reveals this trait as being fundamental to the Valley’s success; ideas are gestated frequently, and teams assemble rapidly through the uncharacteristically large networks of oft-moving tech employees.  As good as this is for innovation, it is bad for the development of a social networking site.

Working in Silicon Valley is a classical embeddedness problem.  If you work in the Valley, it is likely that many of the people you know share similar traits.  They work at the same company as you, think about similar problems, went to similar schools.  Such homophily is beneficial for allowing entrepreneurial teams to assemble quickly, but it is bad for finding heterogeneous opinions.  Consider the case in point of the Google Buzz test – it was rolled out initially to Google’s 20,000 employees.  These employees – similar on many traits, richly compensated, cognizant of privacy – are different in key ways from the rest of the Buzz ecosystem.  Perhaps the homophily of the test base accounts for why devastating edge cases weren’t designed for, or perhaps groupthink shouted such possibilities down.  Either way, this is an important lesson about the pervasive problems of homophily when designing privacy systems.

While involving interdisciplinary information professionals like the ones I train in the design process would be a good step forward, it is easier said than done.  Just as Silicon Valley engineers collide with human inertia, the Valley has its own inertia of bigger, better, and faster.  Introducing the human perspective into such a culture is an ongoing and challenging problem (see the work on Values in Design).  Right now, the market (and the opinion-sphere, to a lesser extent) regulates and acts as the proxy for human problems with systems.  I’d like to think that by introducing informed, professional voices to the discussion, we can move beyond this reactionary approach to privacy.  Perhaps Buzz is the case that moves this discussion forward.

Image used under CC-BY-ND, original source.