Thursday, December 27, 2012

what is privacy?

Oftentimes when I find myself in a conversation about Privacy, there's a lack of clarity around what exactly we're discussing.  It's widely assumed that people who are experts on privacy all speak the same language and share the same goals.

I'm not so sure this is true.

This came up in a discussion with Jishnu yesterday, and we needed a common starting place.  So I'd like to take a little time to lay out what I'm thinking when I talk about Privacy, especially since I'm mainly focused on empowering individuals with control over data sharing and not so much on keeping secrets.
Privacy is the ability for an individual to have transparency, choice, and control over information about themselves.
At the risk of sounding too cliché, I'm gonna use a pyramid to explain my thinking.  There are three parts to establishing privacy:

First, an organization's (or individual's) collection, sharing and use of data must be transparent.  This is crucial because choice and control cannot be realized without honesty and fairness.

Second, individuals must be provided choice.  This means data subjects (those people whose data is being collected, used or shared) must be able to understand what's going to happen with their data and have the ability to provide dissent or consent.

Third, when it's clear what's happening and individuals have an understanding about what they want, they must be given control over collection, sharing or use of the data in question.

This means control depends on choice, which depends on transparency.  You cannot make decisions unless you're given the facts.  You cannot make your desires reality unless you've decided what you want.

For the engineers out there (like me), these dependencies can be modeled like so:
[Transparency] = Awareness of Data Practices
[Choice] = [Transparency] + Individual's Wants
[Control] = [Choice] + Organizational Cooperation
Control is the goal, but it requires Transparency and Choice to work -- as well as some additional inputs.  Privacy is the whole thing: all three pieces acting together with support from both data controllers and data subjects to empower individuals with a say in how their data is used.
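
Or, as a tiny Python sketch of the same model (the function names and boolean framing are mine, purely illustrative):

# Toy model of the dependency chain: control needs choice, choice needs
# transparency.  Everything here is illustrative, not any real API.

def transparency(practices_disclosed):
    # [Transparency] = Awareness of Data Practices
    return practices_disclosed

def choice(is_transparent, individual_wants):
    # [Choice] = [Transparency] + Individual's Wants
    return is_transparent and individual_wants is not None

def control(has_choice, org_cooperates):
    # [Control] = [Choice] + Organizational Cooperation
    return has_choice and org_cooperates

# Privacy is the whole stack working together:
print(control(choice(transparency(True), "share with friends only"), True))  # True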

The privacy perception gap is a symptom of ineffective transparency and choice; it is the result of people's inability to really understand what's going on, so they have no chance to establish positions about what is okay.  When transparency and choice are built into a system, the gap shrinks and people have most of what they need to regain control over their privacy.

What is privacy to you?

Thursday, October 11, 2012

ownership and transparency in social media

Les writes:
"You don’t own the spaces you inhabit on Facebook. You’re enjoying a party at someone’s house, and you barely know the guy. In fact, your content is the currency that pays for the booze (ie. the privilege of using their servers). That’s why it’s free-as-in-beer: You’ve given them what you post, instead of money. That’s valuable stuff, if they can ever quite figure out how to sell it."  [link]
It's not completely fair to expect FB users to realize that the data about them they so generously contribute to FB no longer belongs to them.  My hypothesis is that many people feel facts about them are still *theirs*, no matter who holds or prints those facts.  After all, companies have trademarks; can't things about me be mine and reserved for me?

On a small scale, the monetization of facts about me is not surprising: I give an interview to a magazine, they print it, it gets syndicated, no surprise.  On a large scale (lots of data, collected frequently), I think people lose track of with whom they are communicating and get immersed in the task at hand.  Is it my FB friends, or is it FB, who is helpfully telling my friends things?  This system is flexible, crazy, complex, shiny and distracting!  Can I use it to video chat with my friends?  That's neat.  Oh, geez, I forgot FB is in the middle of all this communication...

People who sign up for FB are not signing up to contribute their life to this stranger throwing a party.  They sign up assuming it is a tool they can use to communicate with their friends; it is a machine they've "bought" (for free, heh) to help them communicate.  Nobody reads the terms of service.  Nobody reads the privacy policy.  People accept them because other people have, and then read only what their friends write.  Many are in denial or do not realize that what they contribute to the site is just that: a contribution.

I think there is shared responsibility here; consumers should be a little bit wary -- but this isn't their area of expertise.  So the site operator also has a duty to be more forthcoming about what's going on.  My communications tool is supposed to be a communications tool.  If you market it as a "free communications tool that sells my data," I am better informed than if it's marketed simply as a "communications tool."

Tuesday, May 22, 2012

Adding Privacy to Apps Permissions

I've been thinking about app permission models, especially as we're working on B2G and need a way for users to safely and thoughtfully manage the apps on their device.  Most permission models strive to do precisely one thing: allow apps to ask for consent to use features.

The problem I have with "allow/deny" consent to use features is that access alone conveys no usage intention; a mirror app that asks for access to your camera probably doesn't need to store the data it gets from the sensor, but it could go so far as to store video (and perhaps send it to "sneakyprivacyinvadors.com" to spy on you).

If apps can explain their usage intentions, consumers of the apps have more context and can make better decisions about the permissions they grant.  While the software probably can't make sure the usage intentions are actually followed, this commitment to customers puts the app developers on the hook for doing the right thing.
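
To make that concrete, here's a hypothetical sketch (in Python, and definitely not the real app manifest format) of what a permission request carrying a usage intention might look like:

# Hypothetical permission entries that pair each capability with a declared
# usage intention.  This is NOT the actual B2G manifest format, just a sketch.

mirror_app = {
    "name": "Mirror",
    "permissions": {
        "camera": {
            "access": "readonly",
            # The declared intention gives the user context and puts the
            # developer on the hook for honoring it.
            "intent": "Show a live camera preview; no recording or upload.",
        },
    },
}

def prompt_user(app):
    # Present each requested capability alongside its stated intention so
    # the user can make an informed allow/deny decision.
    for capability, details in app["permissions"].items():
        print("%s requests '%s': %s" % (app["name"], capability, details["intent"]))

prompt_user(mirror_app)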

Head on over to the discussion in mozilla.dev.webapps where I've posted my thoughts, and let us know what you think.

Edit (23-May-2012 / 9:33 PDT): Google Groups (the public archive) did not pick up my original post to the group.  If you're not subscribed via NNTP or the dev-webapps mailing list, you can see my original post in the quoted text of the first reply by Paul.

Monday, March 12, 2012

making DNT easier for web sites

Jos Boumans has done some analysis of the effect of turning on Do Not Track in your browser, and his findings show that sites have generally been slow to signal that they support the feature.
"As it stands, only 4 out of 482 measured top 500 sites are actively responding to the DNT header being sent." (Link)
As a user, it's hard to tell if sites are honoring my Do Not Track request, and as a site developer, it might be a daunting task to hack up my back-end code.  The Tracking Protection Working Group at the W3C is working on improving transparency and implementations, but in the meantime Jos has released his mod_cookietrack Apache module to make it easier for site owners to track their users' clicks in a respectful way -- right now.
The Apache module, mod_cookietrack, does all sorts of stuff like mod_usertrack, but one thing it does better is honor DNT; if a server using this module sees "DNT: 1" in an HTTP request, it replaces the tracking cookie with one that says "DNT" -- something that's not unique to a visitor.
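
In Python rather than the module's actual C, the DNT-honoring logic looks roughly like this (the cookie handling is simplified):

# Sketch of the behavior described above: a unique tracking cookie for most
# visitors, but the constant value "DNT" for anyone who sends "DNT: 1".
# This is a rough rendering of the idea, not mod_cookietrack's real code.

import uuid

def tracking_cookie_value(request_headers, existing_value=None):
    if request_headers.get("DNT") == "1":
        # Honor Do Not Track: a shared, non-identifying value.
        return "DNT"
    # Otherwise keep (or mint) a unique per-visitor identifier.
    return existing_value or str(uuid.uuid4())

print(tracking_cookie_value({"DNT": "1"}))  # "DNT" -- same for everyone
print(tracking_cookie_value({}))            # a fresh unique ID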

Apparently it was a lot of work to get DNT supported properly in mod_cookietrack -- a native Apache module that performs well and is safe across multiple threads -- so thanks, Jos, for your hard work making it easier for more organizations to support DNT on their web sites.


Sunday, February 26, 2012

Malware and Phishing Protection in Firefox

For a while, Firefox has included malware and phishing protection to keep our users safe on the web.  Recently, Gian-Carlo Pascutto made some significant improvements to Firefox's support for the feature, resulting in much more efficient operation and use of the Safe Browsing API that powers this protection.

Privacy in the Safe Browsing API

I want to take a little time to explain how this feature works and why I like it from a privacy perspective:  Firefox can check whether or not a web site is on the Safe Browsing blacklist without actually telling the API what the web site is called.

At a high level, using this API to find URLs on the "bad" list is like asking a friend to identify whether or not he likes things you show him through a dirty window.  Say you hold up an apple to the dirty window, and your friend on the other side sees a fuzzy image of what you're holding.  It looks round and red and pretty small, but he's not sure what it is.  Your friend looks at his list of things he doesn't like and says he likes everything like that except for plums and red tennis balls.  Since you know you're holding an apple -- not a plum or a tennis ball -- you know for sure he likes it, even though he never learned exactly what you showed him.

More technically, this uses a hash function to turn URLs into numbers; each URL maps to a number that is, for all practical purposes, unique to it.  For each site you visit, Firefox hashes the URL and sends only the first part of the resulting number to the Safe Browsing API.  The API responds with every value on the list of bad URLs that starts with the prefix it received.  When Firefox gets that list of "bad" hash values, it checks whether the URL's entire hash appears in it.  Based on whether or not it does, Firefox can determine whether the URL is on the Safe Browsing blacklist.
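
The hashing step, sketched in Python (the real protocol canonicalizes URLs first, and the exact prefix length is a protocol detail -- treat the specifics here as illustrative):

# Simplified sketch of the hashing step.  Only the short prefix ever
# leaves the browser -- never the URL itself.

import hashlib

def full_hash(url):
    return hashlib.sha256(url.encode("utf-8")).digest()

def hash_prefix(url, length=4):
    return full_hash(url)[:length]

print(hash_prefix("http://mozilla.com/").hex())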

Consider this hypothetical example of two sites and their (fake) hash values:

Site                        Hash Value
http://mozilla.com          1339
http://phishingsite.com     1350

When you visit http://mozilla.com, Firefox calculates the hash of the URL, which is 1339.  It then asks the Safe Browsing API what bad sites it knows about that start with "13".  The API returns a list of numbers including "1350".  Firefox takes that list, notices that 1339 (http://mozilla.com) is not in it, and concludes the site must be okay.

If you repeat the same procedure with http://phishingsite.com, the same prefix "13" is sent to the API, and the same list of bad sites (including 1350) is returned.  In this case, however, the site's hash is "1350", so Firefox knows it's on the list of bad sites and gives you a warning.
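
The whole round trip, using the toy hash values from the table above (real hashes are far longer; this just shows the prefix-match dance):

# Sketch of the lookup flow with the toy values above.

BAD_FULL_HASHES = {"1350"}  # the service's blacklist of full hashes

def api_lookup(prefix):
    # Server side: return every blacklisted full hash with this prefix.
    return [h for h in BAD_FULL_HASHES if h.startswith(prefix)]

def is_blacklisted(full_hash):
    candidates = api_lookup(full_hash[:2])  # only "13" is sent
    # The comparison happens locally; the server never sees the full hash.
    return full_hash in candidates

print(is_blacklisted("1339"))  # False -> http://mozilla.com loads normally
print(is_blacklisted("1350"))  # True  -> http://phishingsite.com is blocked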

For you techies and geeks out there: yeah, I'm glossing over a few protocol details, but the gist is that you don't need to tell Google exactly where you browse in return for the bad-stuff blocking. 

Keeping the Safe Browsing Service Running Smoothly

Google hosts the Safe Browsing service on the same infrastructure as many of their other services, and they need to ensure that our users aren't blocked from accessing the malware and phishing blacklists, as well as make sure they invest the right resources to keep the service operating well.  One of the mechanisms they use for this quality-of-service assurance is a cookie, so the first request Firefox makes to the Safe Browsing API results in a Google cookie being set.

I know that not everyone likes that cookie, but Google needs it to make sure their service is working well so I've been working with them to ensure that they can use it for quality of service metrics but not track you around the web.  The most straightforward way to do this is to split the Firefox cookie jar into two: one for the web and one for the Safe Browsing feature.  It's not there yet, but with a little engineering work, in a future version of Firefox that cookie will only be used for Safe Browsing, and not sent with every request to Google as you browse the web.
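
Conceptually -- and this is just a sketch of the idea, not Firefox's actual implementation -- the split looks like this:

# Conceptual sketch of a split cookie jar.  Cookies set by Safe Browsing
# requests live in their own jar and never accompany ordinary web requests.

web_jar = {}            # cookies for normal browsing
safebrowsing_jar = {}   # cookies used only for Safe Browsing API calls

def jar_for(context):
    # Internal Safe Browsing requests get the separate jar; everything
    # else is ordinary web traffic.
    return safebrowsing_jar if context == "safebrowsing" else web_jar

jar_for("safebrowsing")["qos-cookie"] = "token"  # hypothetical cookie name
assert "qos-cookie" not in jar_for("web")        # the web never sees it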

The cookie can be turned off entirely if you disable third-party cookies in Firefox.  When you turn off third-party cookies, your browser will not send the Google cookie even if it was previously set -- unless you visit a Google website directly.  You can also turn off malware and phishing protection, but I really don't recommend it.

Making "Safer Browsing"

While Firefox has been using Safe Browsing for a while, Google has started experimenting with a couple of new Safe Browsing features for additional malware and phishing filtering.  Both features are still young, and it's not yet clear how effective they are or how much of my browsing history would be traded for the improvement.  Both involve sending whole URLs to Google, and departing from Firefox's current privacy-preserving state requires evidence of a significant gain in protection.  When Google measures and shares how much protection their pilot deployment in Chrome actually gains, we can take a deeper look and consider whether these new features are worth it.

For now, Firefox users are getting a lot of protection while giving up very little in return, and there does seem to be good reason for Google to use cookies with Safe Browsing.  We are always looking out for things we can do to give Firefox users the best of both privacy and security.