Posts Tagged: google


16
Feb 10

What Google Could Learn From Goffman

In the week since Google introduced Buzz, the most interesting thing about the fiasco has been watching the company.  For an organization as risk-averse and PR-aware as Google, a public failure offers insight that can’t be gleaned from watching daily operations.  As Google attempts to fix the problems and move the conversation onward, I thought I might reflect on some of the teachable elements of this event.

First, a little bit of back story.  As part of my fellowship at the School of Information and Library Science, I teach a course about social network sites.  Each week, I sit down with my students to discuss the social, legal, ethical and privacy implications of social network sites, among other things.  Potentially noteworthy is that my course doesn’t spend a lot of time on social network science – graph theory, quantitative analysis of networks, etc.  Rather, we concern ourselves with the interaction of people with social technology at large scale.

In our readings and discussions, we’re often challenged to think about how people present themselves in technology.  When you create a profile in a social network site, or share a stream of Tweets, you’re essentially creating a representation of an identity.  As we’ve seen time and time again in Facebook, we run into problems when identities collide during “context collapse” – when people from a different segment of your life view an identity you’ve constructed for your friends.

Taken one way, it could be argued that this problem of separate identities reveals some sort of fundamental character flaw: “Why aren’t you the same person to everyone?”  As Google CEO Eric Schmidt pointed out, “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place.”  It is the intersection of technology and philosophies like Schmidt’s that is causing companies like Google and Facebook to stumble again and again, creating “privacy nightmares.”

Many of the readings in my class are influenced by Erving Goffman’s theories of identity and interaction.  Goffman, the legendary Chicago-school sociologist and former ASA president, elaborates in rich detail the process of social interaction in his books The Presentation of Self in Everyday Life, Behavior in Public Places, and Interaction Ritual.  In essence, Goffman argues that identity and interaction are performative, a concept that maps very well onto social network sites.  By “creating” identities, we’re not living dual lives, but rather engaging in a well-established performance of identity that lets us share the proper “front” in context.  We act differently on LinkedIn and Facebook because these sites have contextual norms, not because we’re duplicitous.

At the beginning of each semester of my class, I tell my students that they’re going to leave with a skillset that helps them negotiate human interaction with social technology.  I’ve sat up at night, pondering the value of such a skillset.  More than anything, the Buzz fiasco has driven home the point that we need interdisciplinary information professionals who can work with teams in negotiating the social implications of their tools.  These are the students I’m working with, and I wonder how Buzz would have rolled out differently if their voices had been brought to the table.

The builders of social technologies are challenged to manage the relationship between technical affordance and what is, for lack of a better term, human inertia.  That is, the tendency for people to act like people.  As Google Buzz engineers attempted to reconfigure our notions of a social group (work/friends/romantic/etc. was collapsed to “most frequently contacted”), they ran smack into human inertia.  Even though Google’s algorithms have likely figured out a more efficient way for us to group the people we know, it was simply too much to ask us to configure ourselves to the technology.

By fabricating new social groupings, Google ran head-on into Facebook’s biggest problem – that of context collapse.  When we merge social groups together, we are challenged to manage our disclosures across these groups, which have different norms of propriety.  How is it possible that Google didn’t see the potential problems of such context collapse at scale?  I’d like to offer a potential answer.

If you read a history of Silicon Valley (such as Katie Hafner’s or Michael Hiltzik’s), you’ll notice a theme of interconnection.  Silicon Valley’s tech economy is a dense series of highly entrepreneurial networks, where employment is characterized by acceptance of failure and short tenures.  The work of AnnaLee Saxenian reveals this trait as being fundamental in the Valley’s success; ideas are gestated frequently, and teams assemble rapidly through the uncharacteristically large networks of oft-moving tech employees.  As good as this is for innovation, it is bad for the development of a social networking site.

Working in Silicon Valley is a classical embeddedness problem.  If you work in the Valley, it is likely that many of the people you know share similar traits.  They work at the same company as you, think about similar problems, and went to similar schools.  Such homophily is beneficial for allowing entrepreneurial teams to assemble quickly, but it is bad for finding heterogeneous opinions.  Consider the case-in-point of the Google Buzz test – it was rolled out initially to Google’s 20,000 employees.  These employees – similar on many traits, richly compensated, cognizant of privacy – are different in key ways from the rest of the Buzz ecosystem.  Perhaps the homophily of the test base accounts for how devastating edge-cases weren’t designed for, or perhaps groupthink shouted such possibilities down.  Either way, this is an important lesson about the pervasive problems of homophily when designing privacy systems.

While involving interdisciplinary information professionals like the ones I train in the design process would be a good step forward, it is easier said than done.  Just as Silicon Valley engineers collide with human inertia, the Valley has its own inertia of bigger, better, and faster.  Introducing the human perspective into such a culture is an ongoing and challenging problem (see the work on Values in Design).  Right now, the market (and the opinion-sphere, to a lesser extent) regulates and acts as the proxy for human problems with systems.  I’d like to think that by introducing informed, professional voices to the discussion, we can move beyond this reactionary approach to privacy.  Perhaps Buzz is the case that moves this discussion forward.

Image used under CC-BY-ND, original source.


24
Jun 09

The Great Wall of Facebook

Fred Vogelstein has an interesting article in the new edition of Wired, previewing Facebook’s full-on assault on Google for targeted advertising territory.  The article makes news, and includes some great (and painfully ironic) quotes from Mark Zuckerberg in which he accuses Google of contributing to the surveillance society (Pot, Kettle, Black).  The article reads like a preview for the Super Bowl, with notoriously tight-lipped executives tossing bombs back and forth.  Congrats to Vogelstein for successfully stoking the ire of these monoliths.

The fundamental conflict of the article lies in the comparison of the advertising products offered by the two companies.  Google’s product, targeted text ads, is the single most successful product on the Internet.  The tiny, unobtrusive ads have fueled Google’s dominance in multiple markets; today, 90% of Google’s revenue comes from AdSense.  Facebook’s product is nascent – it is the concept that advertising works better when it is socially mediated.  That is, we are more likely to click on ads, content, and links when the content is funneled through our friends.  This theory is sensible, but to date, Facebook’s concept remains vaporware, with a majority of their revenue coming through traditional targeted text and banner campaigns.

Framed by Zuckerberg, the contrast between Facebook and Google is personal vs. impersonal.  Of Google he states: “You have a bunch of machines and algorithms going out and crawling the Web and bringing information back.  That only gets stuff that is publicly available to everyone. And it doesn’t give people the control that they need to be really comfortable.”  Vogelstein writes:

Facebook CEO Mark Zuckerberg envisions a more personalized, humanized Web, where our network of friends, colleagues, peers, and family is our primary source of information, just as it is offline. In Zuckerberg’s vision, users will query this “social graph” to find a doctor, the best camera, or someone to hire—rather than tapping the cold mathematics of a Google search. It is a complete rethinking of how we navigate the online world, one that places Facebook right at the center. In other words, right where Google is now.

Personal vs. impersonal.  Wouldn’t you rather get a doctor recommendation from ten of your friends than a text link?  The value of peer recommendations has driven many communities, including countless bulletin boards and fora, sites like Epinions and Yelp, and members-only specialist communities.  The fundamental problem with monetization in Facebook’s case lies with norms that govern the exchange of advice, particularly that the advice be truthful and unbiased.  If we are to trust advice, we must know that external agents aren’t corrupting or influencing the transmission of advice.  We can get advice from Facebook regarding doctors, but we won’t trust the advice if Facebook pays our friends to recommend certain doctors.

Facebook’s grand vision involves a wholly-contained world of social information that is brokered out through the web.  With enough critical mass, it is argued, most of our common information needs can be answered by our social networks.  As with most technological main effect hypotheses, the formulation is generally suspect.  Researchers of social support argue that support is more effectively derived from certain actors, that support is contextual, etc.  In a traditional model, where the people around you are the primary producers of information, your personal support network is crucial.  With the advent of the Internet, however, most of us no longer exist in a traditional model where the people around us are our only support vector (1).

The reality is that Google, and other search engines, have restructured expectations regarding everyday information seeking.  It is no longer good enough to simply get recommendations from a personal network when there is a vast quantity of electronic information available at one’s fingertips.  You can certainly get doctor recommendations from your friends, but the online search for information about the doctor is now a natural part of the information seeking process.  In this sense, Facebook is complementary, providing an important but not all-encompassing factor in our decision-making process.  The argument that individuals will move their information seeking to a social network, and away from the mechanistic Google, simply assumes too much.  Google has already won by making itself an integral part of our everyday information seeking processes.

If Facebook (a proxy for “socially mediated search”) is a complementary and useful part of everyday information seeking, we must consider the relevance of information we get from the site.  We generally assess relevance in information systems through “recall” and “precision.”  In Facebook, recall is strictly bound to our known social world – the people who we have connected with.  Therefore, precision is a function of how well the various others producing results match our needs.  If you have 500 friends, spaced across a variety of age ranges, is it safe to assume that information you get from the network will actually be all that relevant?  Our core social networks are generally homophilous, but our core social networks are very small.  Expand past a certain network size and it becomes likely the interests and experience of your “friends” will vary significantly from yours.
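
To make the two measures concrete, here is a minimal Python sketch of how precision and recall are conventionally computed for a single question.  The function name and the toy sets of doctor recommendations are hypothetical illustrations, not anything Facebook exposes.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    retrieved: the answers you actually received (here, from your friends).
    relevant:  the answers that would truly have met your need.
    """
    hits = retrieved & relevant
    # Precision: what share of the answers you received were useful?
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: what share of the useful answers did you receive?
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four friends answer, two of them usefully,
# while five useful answers exist beyond your friend list.
from_friends = {"dr_adams", "dr_baker", "dr_cole", "dr_diaz"}
actually_useful = {"dr_adams", "dr_baker", "dr_evans", "dr_ford", "dr_gray"}
precision, recall = precision_recall(from_friends, actually_useful)
```

In this toy case half of what the network returns is useful, but most of the useful answers sit outside the network entirely, which is the boundary on recall described above.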

Facebook could address this problem with friend lists, the privacy feature that compels individuals to place their friends in groups.  Perhaps friend lists could be converted to interest groups (People whose book recommendations I trust), but the mechanics of such a process would require a good bit of intervention on behalf of the user.  The participation gap is also problematic – if the people who you really trust for book recommendations are not heavy users of Facebook, then it is unlikely you’ll have your information needs addressed.

Facebook could develop algorithms that look for similarity between question askers and answerers – if I ask for a book recommendation, perhaps Facebook could weight responses from people who share my stated book tastes.  This compels participation and broadcast of information, one of Michael Zimmer’s new laws of social networking.
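
One way such weighting might work is sketched below in Python.  This is only an assumption about what a similarity-based ranking could look like: it uses Jaccard overlap between stated tastes, and every name and data value in it is hypothetical rather than a description of anything Facebook has built.

```python
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two taste sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_answers(asker_tastes: Set[str],
                 answers: List[Tuple[str, Set[str]]]) -> List[str]:
    """Order recommendations by how similar each answerer is to the asker.

    answers: (recommendation, answerer's stated tastes) pairs.
    """
    scored = [(jaccard(asker_tastes, tastes), rec) for rec, tastes in answers]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in scored]


# Hypothetical question: "Can anyone recommend a book?"
my_tastes = {"scifi", "history", "essays"}
replies = [
    ("Book A", {"romance"}),
    ("Book B", {"scifi", "history"}),
    ("Book C", {"scifi", "cooking", "travel"}),
]
ranked = rank_answers(my_tastes, replies)
```

Note that the ranking only works if people have stated their tastes in the first place.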

Although the debate framed by Vogelstein and Zuckerberg is Facebook vs. Google, there is actually very little opportunity for Facebook to significantly edge into Google’s core market – targeted text-link ads.  Text link ads are served as a by-product of information search, which is an integral part of our everyday information seeking processes.  Facebook is likely to emerge as a complement to search, and in some areas it may perform better than search, but search will remain relevant.  The challenge to Facebook is to find a way to monetize their value areas without being in contravention of social norms.  The challenge to Google is to get access to the wealth of personal data Facebook is collecting (and no, Google Friend Connect and all of their other terrifically lame social products will not solve this problem).  For the consumer, the battle between Google and Facebook is a win-win, with the obvious exception of privacy matters.

(1) Those with “impoverished life-worlds” – those with limited access to information and resources – are unlikely to incorporate search engines or social networks into their everyday information search processes.


12
May 09

Google exposes Booksearch patron records

Evil settlement aside, I’m a fan of Google Booksearch.  The ability to search within books is tremendously useful, and I look forward to the day that I’ve got a digital copy of all of the books on my shelves.

Until recently, I’ve kept track of interesting books in Google Booksearch by bookmarking them in my browser.  This approach isn’t scaling well, so I decided to take advantage of Google’s native features by saving the books to my “Google Library.”  I was shocked to find out that saving a book to your library requires that the book be added to your “shared library”, a public listing tied to your Google account.

There is no way to save a book privately in Google Booksearch.  As Google writes in their FAQ, “When you add reviews, ratings, notes, or labels to a book—or when you add a book to your my Library page—that information will be publicly displayed on Google Book Search.”  They go on to write that “No matter where you use these features, the information you submit will be displayed publicly.”

I couldn’t believe it either.  If you want to set up a Google Library, even if it is just for convenience’s sake, you have to show the world what you’ve been reading.  As far as I can tell, there’s no good technical or legal reason why one can’t save a book privately, or limit their book-sharing to a group of friends.  This decision seems arbitrary and downright scary (or at least terribly ill-advised).

The cognitive dissonance comes from comparisons of Google’s Library policy to traditional libraries.  Prominent in the ALA Code of Professional Ethics for Librarians is section 40.2.3: “We protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.”  The ALA also formally recommends that library administrators “advise all librarians and library employees that such records shall not be made available to any agency of state, federal, or local government except pursuant to such process, order or subpoena as may be authorized under the authority of, and pursuant to, federal, state, or local law relating to civil, criminal, or administrative discovery procedures or legislative investigative power.”  (for more on regulation and library records see Minow, 2002)

Therefore, I must wonder why Google is not adhering to ALA policy, and the broader cultural norm of protecting library patron privacy.  As Google partners with large institutions and attempts to monetize Booksearch, failing to respect patron privacy seems foolish and potentially dangerous.  A patron researching a sensitive topic, or a topic that reveals information about the patron (for example, books about a health condition) will have their information revealed publicly if they add such a book to their library.

Google is clearly wrong on this issue, and must work to fix this dangerous privacy oversight.  Have other librarians addressed this issue?  Has Google responded?  Unfortunately most of my due diligence for this post found articles/blog posts about the booksearch settlement, but I’d like to hear some other opinions.

Update: The Google Booksearch FAQ states that users may delete their data from public records.  However, the link they provide doesn’t work (it is a 404), and it appears you have to delete all of your records (“Delete book search”) to remove book history from the public view.


19
Mar 09

Turow on Behavioral Targeting

Saul Hansell reports on Joseph Turow’s proposal for awareness of behavioral targeting:

I’m coming to the conclusion that each advertisement on a page has to speak for itself. That’s implicit in the approach Google is taking for its new behavioral targeting system. It puts the phrase “Ads by Google” on all its advertisements. Click that link and you’ll get some limited information about Google’s targeting system and an ability to adjust some of the interests that Google is tracking.

But Google’s approach is presented in a way that glosses over what they are doing and discourages people from reading the disclosure and exercising control, says Joseph Turow, a marketing professor at the Annenberg School for Communication of the University of Pennsylvania.

Mr. Turow has developed a plan that is simpler and more comprehensive: Put an icon on each ad that signifies that the ad collects or uses information about users. If you click the icon, you will go to what he calls a “privacy dashboard” that will let you understand exactly what information was used to choose that ad for you. And you’ll have the opportunity to edit the information or opt out of having any targeting done at all.

via An Icon That Says They’re Watching You – Bits Blog – NYTimes.com.


8
Mar 09

Amazon to Google Booksearch in one click

Google Booksearch is becoming one of my go-to scholarly resources.  All of the evilness aside, it is extremely useful to be able to look up a chapter or section from a book (even if that book is on the shelf in the other room). Since I manage my reading lists with Amazon, I wanted to make it very easy to look up books in Google Booksearch from Amazon. So I created the following bookmarklet:

Booksearch Lookup

When you’re on an Amazon product page, click this bookmarklet and you’ll be taken to the Google Booksearch results for the book.  If previewing is allowed for the book, you’ll be able to leaf through it before you purchase/borrow/walk to your shelf.  To install the bookmarklet, drag the Booksearch Lookup link to your bookmarks folder.
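
The bookmarklet’s own source isn’t reproduced here, so the following is only a Python sketch of the same idea, not the original code.  It rests on two assumptions: that the ASIN in an Amazon product URL is the book’s ISBN-10 (true for most print books, not for Kindle editions), and that Google Booksearch accepts a `vid=ISBN…` lookup URL.  The function names are made up for illustration.

```python
import re
from typing import Optional

# Amazon product URLs usually carry the ASIN after /dp/ or /gp/product/.
# Assumption: for print books the ASIN is the ISBN-10 (nine digits plus
# a final digit or X).
_ISBN10_IN_URL = re.compile(r"/(?:dp|gp/product)/(\d{9}[\dX])(?:[/?]|$)")


def isbn10_from_amazon_url(url: str) -> Optional[str]:
    """Pull an ISBN-10 out of an Amazon product URL, or None if absent."""
    match = _ISBN10_IN_URL.search(url)
    return match.group(1) if match else None


def booksearch_url(isbn10: str) -> str:
    """Build a Booksearch lookup URL (assumed vid=ISBN pattern)."""
    return "http://books.google.com/books?vid=ISBN" + isbn10


def amazon_to_booksearch(url: str) -> Optional[str]:
    """Map an Amazon product URL to a Booksearch URL when possible."""
    isbn10 = isbn10_from_amazon_url(url)
    return booksearch_url(isbn10) if isbn10 else None
```

A page whose ASIN is not an ISBN simply yields `None`, so a real bookmarklet would need a fallback such as searching by title.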

Some quick notes on Booksearch:

  • Booksearch has changed the way I look at digital books (for the better).  I’m a fan of print, and I’ve always had a hard time imagining reading a book on the computer.  I still have a hard time with digital long form, but the mistake I made was to think all books were the same.  Many books, especially the reference/textbook/manual genre, are analogous to large webpages.  If you’re searching for a specific bit of information and Google Booksearch can give you the chunk you need, that’s a wonderful case.
  • Booksearch has also changed how I look at publishers and libraries.  You know how today if you buy an LP, a band will throw a CD in for free?  Publishers have to get there, and fast.  Libraries need to give me a virtual shelf that houses digital copies of all the books I’ve checked out (and even the ones I’ve returned).  We’re simply wasting too much time and money chasing around print resources when a digital resource will do.
  • It is unfortunate that Google is the monopoly, but you have to give them credit for taking on a task that would have taken an inter-institutional consortium eons.  Sometimes the market wins.  I just wish that the research libraries had thought twice before signing their collections over in perpetuity.
  • Finally, I remember a time (not long ago) when music was a scarce resource.  To hear a band, you actually had to find a copy of an album or swap a tape.  Lots of stuff was like that pre-digital.  One of the few places I see that attitude today is around the scholarly book.  If there’s a book you need, you’ve got to search it out.  If your library doesn’t have it, if ILL is going to take 6 months, if none of your friends are hoarding a copy, you’ve got to plunk down the 50 or 100 or 150 dollars to order the book from somewhere far away.  It is totally frustrating, but there’s also a weird sense of pre-digital accomplishment that goes with it – knowing that you possess an actual scarce resource.  I know that in a few years my students will just booksearch every version of that book I spent so much time and effort to acquire.  I imagine it will feel a little like knowing that there’s a torrent of all the 7″ your favorite band put out, when you worked so hard just to collect a few.  Bottom line is we’ll have to get over it, albeit grudgingly.

1
Mar 09

Managing Literature Alerts with Gmail

If you research an emerging topic, it is likely that you use some form of literature alert.  If you’re unfamiliar with literature alerts, they are notifications provided by publishers and digital libraries to inform you of new content as it is released.  Managing these alerts can be challenging, so I thought I’d share my system.  At a very high level, I manage literature with Gmail labels.  My system is pretty simple, but it has been working for the year or so I’ve used it.

The first step has two parts.  If you don’t have a Gmail account, I assume that you know how to fix that.  Lit alerts are a little more challenging, as different domains will have different publishers.  If you’re doing the kind of research I do, then setting up alerts with Sage, ScienceDirect and the ACM Digital Library (ToC alerts are free, but search alerts require an ACM membership) is a good start (Springer, Wiley and IEEE are also useful).   You’ll need to create accounts with all of these sites for lit alerts to work.

Literature alerts come in two forms (as far as I know).  The first is a table of contents alert.  This means you can get notified when a new journal or proceedings is published.  The second is a search alert.  Search alerts are saved searches (e.g. Facebook AND College Student); the system notifies you when new results are found.  You’ll want to set up these alerts and direct them to your Gmail account.

Over the next few days your inbox will begin filling with literature alerts (assuming you’re looking at an active subject).  Because you’re not always going to want an inbox filled with lit alerts, what you’re going to do is set up filters.  For each publisher that emails you, click on the email and select “Filter all messages like this” from the dropdown.  I then set the filter to skip the inbox, and apply the label “Alerts.”  After a few days, you’ll have filtered all of the alert messages to a label – meaning you can process them on your own time.

Two important notes.  First, when signing up for searches, opt in to get the most verbose alerts possible.  You want abstracts, etc.  Second, rather than deleting alerts after you are done with them, you’re simply going to leave them read in the labeled folder.  Here’s where the fun begins.  Over time, you’re building a portable, personal archive of all new literature on your topic.  And because you’ve set up the alerts across publishers and libraries, you’ll be able to search for new literature across publications easily – without authenticating to a library or running a meta search across publishers.  All of the new literature will be in your Gmail, searchable with the “label:alerts” key.  For example, if I want to know all of the new literature matching Facebook and psychology, I simply go into my Gmail and search “label:alerts facebook psychology.”
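
If you run the same kinds of searches often, the query string is easy to generate.  The small Python helper below is a hypothetical convenience, not part of Gmail; it only assembles the text you would type into the search box, assuming the “Alerts” label described above and quoting multi-word terms so Gmail treats them as exact phrases.

```python
def gmail_alert_query(*terms: str, label: str = "alerts") -> str:
    """Build a Gmail search string restricted to one label.

    Gmail combines space-separated terms with an implicit AND.
    """
    parts = ["label:" + label]
    for term in terms:
        # Wrap multi-word terms in quotes for an exact-phrase match.
        parts.append('"' + term + '"' if " " in term else term)
    return " ".join(parts)


# The search from the example above.
query = gmail_alert_query("facebook", "psychology")
```

Paste the resulting string into the Gmail search box as you would any other query.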

This kind of management strategy would also work for mailing lists, fare alerts from airlines, etc. In my dreams I’d have a Gmail plugin that would add impact factors into the subject headings. The rest of my literature alerts come in via RSS (lots of open-access journals only offer RSS alerts), and I’m slowly moving those over to email (via RSS-to-email). How do you manage your literature alerts?


1
Mar 09

Citation Searching in Google Scholar

One of my favorite features in Google Scholar is its “cited by” function.  Cited by allows you to see all of the items in Google Scholar that cite the publication you were searching for.  In comparison to Web of Science, GS has much greater recall, which is useful when you’re investigating a new topic.

The problem with GS cited by is that there is no easy means for searching within the results.  This is fine if your publication is cited only a few times and you can eyeball the results.  But as the citation count scales up, being able to search within the results becomes pretty important.

The good news is that you can search within GS cited by; it just requires a little URL hacking.  In my case, I was looking for publications about web surveys that cite the Reeves and Nass book “The Media Equation.”  We’ll do this step by step:

  1. Open up GS, and search for “The Media Equation.”
  2. The first result is the Reeves and Nass book.  Click on the “Cited by 1598” link.
  3. The URL will look something like this:

    http://scholar.google.com/scholar?num=50&hl=en&lr=&cites=12773235514158955901

    You will want to select that last bit, the “&cites=12773235514158955901”.

  4. Now, open up GS in a new tab and run a search for “Web Survey.”
  5. Finally, paste the “&cites=12773235514158955901” onto the end of the Web Survey URL, so it looks something like this:

    http://scholar.google.com/scholar?num=50&hl=en&lr=&q=Web+Survey&btnG=Search&cites=12773235514158955901

  6. Voila!  You’ve found the 337 publications matching Web Surveys that cite the Reeves and Nass book.  The first one looks like a very promising publication from some highly regarded methodologists.  Win!
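
The copy-and-paste steps above can also be sketched in a few lines of Python using the standard library’s `urllib.parse`.  The function names are hypothetical, and the parameter names (`num`, `hl`, `q`, `cites`) are simply the ones visible in the URLs above; Google Scholar may change them at any time.

```python
from urllib.parse import parse_qs, urlencode, urlparse

SCHOLAR_BASE = "http://scholar.google.com/scholar"


def extract_cites_id(cited_by_url: str) -> str:
    """Pull the cites=... identifier out of a 'Cited by' results URL."""
    values = parse_qs(urlparse(cited_by_url).query).get("cites")
    if not values:
        raise ValueError("no cites parameter found in URL")
    return values[0]


def search_within_citations(cites_id: str, query: str, num: int = 50) -> str:
    """Build a URL that searches only among items citing the given work."""
    params = {"num": num, "hl": "en", "q": query, "cites": cites_id}
    return SCHOLAR_BASE + "?" + urlencode(params)


# Step 3: the URL behind the "Cited by" link for The Media Equation.
cited_by = ("http://scholar.google.com/scholar"
            "?num=50&hl=en&lr=&cites=12773235514158955901")
# Step 5: combine that identifier with the "Web Survey" search.
url = search_within_citations(extract_cites_id(cited_by), "Web Survey")
```

Opening the resulting URL in a browser should give the same page as the manual paste in step 5.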

I was unable to run a comparison in the WoS database as it doesn’t seem to know about the Reeves and Nass book.  Are there any other places you use for Cited By searches (e.g. other databases, vendors, search engine hacks)?  And if there is some easy way to do this search in the GS interface, please let me know.  I’ve read the advanced searching docs and researched this, but it doesn’t appear there is a simple way to search within citations.