Privacy in the Safe Browsing API
I want to take a little time to explain how this feature works and why I like it from a privacy perspective: Firefox can check whether or not a web site is on the Safe Browsing blacklist without actually telling the API what the web site is called.
At a high level, using this API to find URLs on the "bad" list is like asking a friend whether he likes something you show him through a dirty window. Say you hold up an apple to the dirty window, and your friend on the other side sees a fuzzy image of what you're holding. It looks round, red, and pretty small, but he's not sure what it is. Your friend looks at his list of things he doesn't like and says he likes everything like that except for plums and red tennis balls. He still does not know exactly what you're holding, but you know for sure that he likes the apple.
More technically, this uses a hash function to turn web URLs into numbers. For practical purposes, each number corresponds to exactly one URL. For each site you visit, Firefox hashes the URL and sends only the first part of the resulting number to the Safe Browsing API. The API responds with every value on its list of bad URLs that starts with the part it received. When Firefox gets this list of "bad" site hash values, it checks whether the entire hash of the URL is in the list. If it is, the URL is on the Safe Browsing blacklist; if it isn't, the URL is not on the blacklist.
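The hash-then-prefix check described above can be sketched roughly as follows. This is a minimal illustration, not Firefox's actual code: SHA-256 and a 4-byte prefix are assumed here for concreteness, and `fake_lookup` is a hypothetical stand-in for the real Safe Browsing API.

```python
import hashlib

PREFIX_BYTES = 4  # assumed prefix length, chosen only for illustration


def full_hash(url):
    """Hash a URL to a fixed-size value (SHA-256 is an assumption here)."""
    return hashlib.sha256(url.encode("utf-8")).digest()


def is_blacklisted(url, lookup):
    """Check a URL while revealing only a hash prefix to `lookup`.

    `lookup` is a hypothetical stand-in for the API: it takes a prefix and
    returns the full hashes of all bad URLs that start with that prefix.
    """
    digest = full_hash(url)
    candidates = lookup(digest[:PREFIX_BYTES])  # only the prefix leaves the browser
    return digest in candidates                 # full comparison happens locally


# A pretend server-side list containing one bad site.
BAD_HASHES = {full_hash("http://phishingsite.com")}


def fake_lookup(prefix):
    """Hypothetical server: return every bad hash that starts with the prefix."""
    return {h for h in BAD_HASHES if h.startswith(prefix)}


print(is_blacklisted("http://mozilla.com", fake_lookup))       # False
print(is_blacklisted("http://phishingsite.com", fake_lookup))  # True
```

The important design point is that the final comparison happens inside the browser, so the lookup function never receives more than the prefix.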
Consider this hypothetical example of two sites and their (fake) hash values:
Site | Hash Value |
---|---|
http://mozilla.com | 1339 |
http://phishingsite.com | 1350 |
When you visit http://mozilla.com, Firefox calculates the hash of the URL, which is 1339. It then asks the Safe Browsing API what bad sites it knows about that start with "13". The API returns a list of numbers including "1350". Firefox checks that list, sees that 1339 (http://mozilla.com) is not in it, and concludes that the site must be okay.
If you repeat the same procedure with http://phishingsite.com, the same prefix "13" is sent to the API, and the same list of bad sites (including 1350) is returned. In this case, however, the site's hash is "1350" so Firefox knows it's on the list of bad sites and gives you a warning.
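To make the privacy point of this example concrete, here is a small simulation using the fake hash values from the table. The extra blacklist entries ("1377" and "2201") and the function names are made up for illustration; the sketch simply records what the pretend server gets to see.

```python
# Fake hash values from the table above.
FAKE_HASHES = {
    "http://mozilla.com": "1339",
    "http://phishingsite.com": "1350",
}

# Pretend server-side blacklist; "1377" and "2201" are invented filler entries.
BAD_LIST = ["1350", "1377", "2201"]

# Everything the pretend server receives from the browser.
seen_by_server = []


def api_lookup(prefix):
    """Hypothetical API: log the prefix, return bad hashes starting with it."""
    seen_by_server.append(prefix)
    return [h for h in BAD_LIST if h.startswith(prefix)]


def visit(url):
    """Browser side: send the first two digits, compare the full hash locally."""
    site_hash = FAKE_HASHES[url]
    matches = api_lookup(site_hash[:2])
    return "warning" if site_hash in matches else "ok"


print(visit("http://mozilla.com"))       # ok
print(visit("http://phishingsite.com"))  # warning
print(seen_by_server)                    # ['13', '13']
```

Both visits look identical from the server's side: it sees "13" twice and cannot tell which of the two sites was visited.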
For you techies and geeks out there: yeah, I'm glossing over a few protocol details, but the gist is that you don't need to tell Google exactly where you browse in return for the bad-stuff blocking.
Keeping the Safe Browsing Service Running Smoothly
Google hosts the Safe Browsing service on the same infrastructure as many of their other services. They need to ensure that our users aren't blocked from accessing the malware and phishing blacklists, and that they invest in the right resources to keep the service operating well. One of the mechanisms they need for this quality-of-service assurance is a cookie, so the first request Firefox makes to the Safe Browsing API results in the setting of a Google cookie.
I know that not everyone likes that cookie, but Google needs it to make sure their service is working well, so I've been working with them to ensure that they can use it for quality-of-service metrics but not to track you around the web. The most straightforward way to do this is to split the Firefox cookie jar into two: one for the web and one for the Safe Browsing feature. It's not there yet, but with a little engineering work, in a future version of Firefox that cookie will only be used for Safe Browsing, and not sent with every request to Google as you browse the web.
The cookie can be turned off entirely if you disable third-party cookies in Firefox. When you turn off third-party cookies, your browser will not send the Google cookie, even if it has been set previously -- unless you visit a Google website. You can also turn off malware and phishing protection, but I really don't recommend it.
Making "Safer Browsing"
While Firefox has been using Safe Browsing for a while, Google has started experimenting with a couple of new features in Safe Browsing for additional malware and phishing filtering. Both features are quite new, and it's not yet clear how effective they are or how much of my browsing history would be traded for this improvement. Both involve sending whole URLs to Google, and departing from Firefox's current privacy-preserving state requires evidence of a significant gain in protection. When Google measures and shares how much protection their pilot deployment in Chrome gains, we can take a deeper look and consider whether these new features are worth it.
For now, Firefox users are getting a lot of protection for very little in return, and there does seem to be good reason for Google to use cookies with Safe Browsing. We are always looking out for things we can do to give Firefox users the best of both privacy and security.