Web privacy is a very hard problem to solve. Well, at least what people perceive as "the privacy problem."
One of the main reasons it's hard to "solve privacy" is that the term privacy is used in many contexts to indicate many things.
This is ironic, since user data is also used in many contexts to indicate many different things. That is, a piece of data may be considered private in some contexts, but not in others.
I'd like to take a more focused approach and concentrate on one main cause of the "ZOMG my privacy is violated!!1!" uproars to see if we can't help address it.
Facebook Beacon. In late 2007, Facebook launched a new Beacon feature that provoked a swift community backlash. The feature automatically syndicated users' activities on partner sites to their Facebook news feed. For example, if you bought tickets to the Harry Potter movie on fandango.com (a Beacon partner), Facebook might broadcast to all your friends where and when you were going to the movie. People were mad because non-Facebook activities were now automatically imported into Facebook and shared.
Google Buzz. Google turned on its new Buzz feature for some Google users in February 2010. This feature automatically created a Twitter-like stream of things you do (such as what you read in Google Reader and the photos you upload to Picasa) and immediately set you up to "follow" the other Google users in your "exchanged mail with" list. Harriet Jacobs' article exemplifies the reaction. She didn't want people who emailed her on occasion to know everything she did, but suddenly this new technology connected her activities to everyone she had received mail from.
LSOs, a.k.a. Flash Cookies. When people clear their cookies, not all cookies actually get deleted! Gasp! Here's why: Adobe's Flash plug-in has its own data storage space on your computer -- separate from where your browser stores cookies, bookmarks and passwords. The browser doesn't have direct control over Flash's data, since Flash is essentially a separate application that happens to show its content inside your browser window. The result? You clear cookies, but your browser doesn't know how to clear Flash cookies. How is this used? In many ways, but one particularly sneaky use rubs many people the wrong way: web sites can use Flash to keep longer-lived cookies on your system and then use them to re-populate regular cookies after you clear them. People are mad. (FYI, this is being worked out, see this bug).
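To make the re-population trick concrete, here is a minimal sketch in Python. It is only a simulation under stated assumptions: the SimulatedClient class, the visit_site function and the "visitor_id" key are all hypothetical names invented for illustration, and two plain dictionaries stand in for the browser's cookie jar and Flash's separate storage. No real browser or Flash API is involved.

```python
import uuid


class SimulatedClient:
    """Hypothetical stand-in for one user's machine with two separate stores."""

    def __init__(self):
        self.browser_cookies = {}  # what "clear cookies" knows about
        self.flash_lsos = {}       # Flash's own storage, outside browser control

    def clear_cookies(self):
        # The browser only clears the store it controls.
        self.browser_cookies.clear()


def visit_site(client, key="visitor_id"):
    """Simulate a site that re-populates its cookie from the Flash copy."""
    cookie_id = client.browser_cookies.get(key)
    lso_id = client.flash_lsos.get(key)

    # Prefer the regular cookie, fall back to the Flash copy,
    # and mint a new ID only if both are missing.
    visitor_id = cookie_id or lso_id or uuid.uuid4().hex

    # Write the ID back to both stores so each can restore the other.
    client.browser_cookies[key] = visitor_id
    client.flash_lsos[key] = visitor_id
    return visitor_id


client = SimulatedClient()
first = visit_site(client)
client.clear_cookies()        # the user thinks they are starting fresh
second = visit_site(client)   # the site restores the old ID from Flash storage
print(first == second)
```

The user cleared cookies between the two visits, yet the site sees the same identifier both times. That mismatch between what the user expected and what happened is the whole complaint in miniature.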
The Gap. There's this dark and mysterious area between what users think is happening with the data they put on the web and what actually happens. I call this the Privacy Perception Gap (PPG). There are a variety of reasons this gap exists:
- Software makers are not psychologists -- they don't know what people expect, only how the system works.
- Software makers are not anthropologists -- they don't know how different cultures expect secrets to be kept or shared.
- Software is reactive -- users complain, software is re-engineered, and the cycle repeats.
- The PPG is not well understood.
This last reason is something we can address with proper research. We need to understand the size of the PPG and the reasons for it before we can close it, and certainly before we can know who is best positioned to do the work. Is it users, user agents, infrastructure, applications, or a combination who should take the giant leap? How big is this gap on average? Surely it's different for various web applications.
If we minimize the PPG, we can expect users to be better informed, and that might have prevented the situations described above. Users wouldn't be surprised by what happens, and the suspicion that web companies are out to violate their users' privacy would be reduced significantly.
I'm a big fan of transparency (see Open and Obvious), since a lack of it is a big part of the PPG problem. We should start by making data relationships transparent: this means disclosure first and then, most importantly, user accessibility. For instance, having a privacy policy linked from my web site doesn't really make me transparent unless users can find and understand it. The gap doesn't shrink if users don't understand! My theory is that an informed user is a happy user, and if we can better understand the PPG we can take the first step towards making web users happy.