Web browser history detection
September 3, 2009 9:06 AM

What the Internet knows about you. "This project was started by a small group of Web developers and security researchers in order to highlight the problem of Web browser history detection -- a problem which can dramatically affect the Web and hurt many people, if not solved quickly. Our direct goal is to educate the mainstream public and show them the direct consequences of allowing this aspect of Web browser behavior, as well as provide some solutions which mitigate the problem. However, since there are no existing satisfactory solutions, our other objective is to point the attention of browser developers to this issue and strongly encourage them to implement the necessary and long-overdue fixes." [Via]
posted by homunculus (44 comments total) 13 users marked this as a favorite
 
Apparently the internet doesn't know anything about me.
posted by Optamystic at 9:12 AM on September 3, 2009


doesn't work with lynx =p
posted by nomisxid at 9:15 AM on September 3, 2009 [4 favorites]


knows nothing about me either...

but that "cute kitten" link at the bottom works just fine...
posted by HuronBob at 9:15 AM on September 3, 2009


I solve that problem by staying off the internet.
posted by Danf at 9:17 AM on September 3, 2009


Bummer. I'm pedestrian. Yup, I really, really like popular websites.
posted by Ruthless Bunny at 9:17 AM on September 3, 2009


Nope, nothing about me. Hangs at stage 2 of 2. Safari on OSX.
posted by Biru at 9:20 AM on September 3, 2009


My weather.gov addiction has been revealed.
posted by swift at 9:22 AM on September 3, 2009


Direct link to the article on BB, for posterity.

Also via BB (a few posts newer): The death of "locational privacy"
[I]n low-tech days, our movements were not entirely private. The desk attendant at my gym might have recalled seeing me, or my colleagues might have remembered when I arrived. Now the information is collected automatically and often stored indefinitely.
They see you when you're surfing, they know when you're running late, they know if you're being bad or good, so be good for goodness sake.
posted by filthy light thief at 9:23 AM on September 3, 2009 [1 favorite]


Followed-link history sniffing has been a massive hole forever. I'm glad to see it getting some attention.
posted by rokusan at 9:26 AM on September 3, 2009


biru - Safari 4.0.3 on OS X Snow Leopard here - it seemed to take a while at the "stage 2 of 2", but it did complete for me.
posted by kcds at 9:31 AM on September 3, 2009


It showed Bing.com which I've been to once and no Metafilter pages at all. Fail. Or pass. I'm not sure.
posted by like_neon at 9:31 AM on September 3, 2009


Cabela's is a bank? Really? Some strange classifications there.
posted by Weighted Companion Cube at 9:32 AM on September 3, 2009


Well... Awesome. The internet thinks I'm far more productive than I actually am.

I salute you, internet, and your browsing history "problem."
posted by pokermonk at 9:41 AM on September 3, 2009 [1 favorite]


The workaround is to simply have your browser clear its history/cookies/etc. each time you close it. The private modes in Firefox and other browsers accomplish this.

Anybody know of a well-regarded Firefox extension that automatically removes any Flash cookies when you close the browser?
posted by jsonic at 9:41 AM on September 3, 2009


Is this something solved by clearing your history?
posted by Mental Wimp at 9:44 AM on September 3, 2009


[I]n low-tech days, our movements were not entirely private. The desk attendant at my gym might have recalled seeing me, or my colleagues might have remembered when I arrived. Now the information is collected automatically and often stored indefinitely.

I have to point out that "privacy" by means of bad memories cuts both ways: People can "remember" seeing you when you weren't there just as easily as not remembering you when you were.
posted by Mental Wimp at 9:47 AM on September 3, 2009


I cleared my web history late last night, so the site didn't find anything I visited before that point. Even with the history my browser was storing, it missed:

- Every government site I visited
- Every online store I'd visited
- Every YouTube search I've made since then
- Ditto for Google searches
- Every non-cached site I had visited in the previous five to ten minutes

This is all after I made an exception for their site through Little Snitch. Before that, their pages just broke, and wouldn't run the searches in their sidebar at all.

I don't understand the "no satisfactory solution" warning on their site. Certainly this is an issue for the casual surfer, but an easy-to-implement solution already seems prevalent. Why the scare tactic?
posted by greenland at 9:53 AM on September 3, 2009


sucks to your noscript
posted by boo_radley at 9:54 AM on September 3, 2009


Top 5K websites:
"What? The Pirate Bay? Wha... but I swear I would never...!"

XXX:
Congratulations, we did not find anything in this category in your browser history.
Feel free to try our other browser history test

"You're darn tootin'."

Cute Kitten:

"YAY! And ... there are more kittens?........"
Sorry there are no more kittens.

"I hate you, Internet."
posted by louche mustachio at 9:59 AM on September 3, 2009


The problem here is not that a page is able to figure out what other pages you've visited, but that it can then send that information back out. If a page were locally able to do this sniffing but prevented from ever communicating it back, it wouldn't be a big deal.

A possible solution would be that the browser renders two versions of the page: a sanitized version that is not allowed to look at history and a privileged version that can look at history (and maybe other semi-sensitive information). The key is that the privileged version exists in a sandbox that is not allowed any outgoing network communication. For JavaScript GUI stuff to work there will need to be some one-way communication from the unprivileged page to the privileged one, which could be tricky to do in a way that doesn't break most existing pages, but not impossible.
posted by Pyry at 10:10 AM on September 3, 2009


I think that solving this problem somewhat satisfactorily could be done by simply generating two sets of browser history caches: One cache, listing pages visited since time began, which would only be accessible by the person using the browser, and another history cache, which works exactly the same as the current method in browsers, except limited to storing only pages visited in the last X amount of time, setting X via a user preference (I'd default it to one day, myself).

For security, allow users to set the browser to clear one or both caches automatically on quit.

The advantage here is that the majority of your history would be secure from external snooping, but would remain present for your own use so that you could still use the Awesome Bar to look up that page you visited last week that had the funny LOLcat picture you wanted to send to your mom.

Given the new private browsing features in the latest generation of web browsers this shouldn't be too hard to implement, as it appears that the private browsing thing would be using a separate temporary cache already. All that would need to be done would be to make this temporary cache on by default, then add in a method of writing visited pages to both caches and autoclearing entries older than X from the externally exposed cache. And, of course, set History, the Awesome Bar, etc. to search only the permanent local cache instead of the temporary one.

Hell I might go suggest this on Bugzilla right now...
posted by caution live frogs at 10:13 AM on September 3, 2009
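
For readers who want to see the shape of the two-cache proposal above, here is a minimal Python sketch. It is a toy model under stated assumptions, not anything Firefox actually implements: the class name `DualHistory`, its method names, and the one-day default for X are all invented for illustration.

```python
import time


class DualHistory:
    """Toy model of the two-cache idea (all names here are hypothetical).

    permanent: everything ever visited, searchable only by the user.
    exposed:   recent visits only; this is what visited-link styling
               (and therefore a sniffing page) would be able to see.
    """

    def __init__(self, exposed_ttl_seconds=24 * 60 * 60):
        self.exposed_ttl = exposed_ttl_seconds  # the "X" user preference
        self.permanent = {}  # url -> time of last visit
        self.exposed = {}    # url -> time of last visit

    def visit(self, url, now=None):
        # Every visit is written to both caches.
        now = time.time() if now is None else now
        self.permanent[url] = now
        self.exposed[url] = now

    def _expire(self, now):
        # Drop entries older than X from the externally exposed cache only.
        cutoff = now - self.exposed_ttl
        self.exposed = {u: t for u, t in self.exposed.items() if t >= cutoff}

    def is_visited_for_page(self, url, now=None):
        """What a web page could learn through link styling."""
        now = time.time() if now is None else now
        self._expire(now)
        return url in self.exposed

    def search_for_user(self, text):
        """What the history window or address bar would search."""
        return sorted(u for u in self.permanent if text in u)

    def clear_on_quit(self, permanent=False):
        # Optionally clear one or both caches when the browser closes.
        self.exposed.clear()
        if permanent:
            self.permanent.clear()


history = DualHistory()
history.visit("http://example.com/lolcat", now=0)
one_week = 7 * 24 * 60 * 60
print(history.is_visited_for_page("http://example.com/lolcat", now=one_week))  # False
print(history.search_for_user("lolcat"))  # ['http://example.com/lolcat']
```

A week after the visit, a sniffing page no longer sees the LOLcat link as visited, but the user can still find it. As the follow-up comment notes, anything visited within the last X is still exposed.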


jsonic: The workaround is to simply have your browser clear its history/cookies/etc. each time you close it.
That's not a bad idea.

But, if everyone did this, it would make information sniffed this way much more timely and possibly even more valuable. That is, instead of knowing you'd visited embarrassingwebsite.com sometime, a malicious site could make a pretty good guess that you'd visited it quite recently. That could be useful to an ad server.
posted by Western Infidels at 10:14 AM on September 3, 2009


(Granted my suggestion would not SOLVE the problem completely - but it would certainly limit the vulnerability by not exposing your entire history!)
posted by caution live frogs at 10:15 AM on September 3, 2009


I've got a client-side style-sheet I use for sites with color schemes I find hard to read, like white text on a black background. Like, oddly enough, whattheinternetknowsaboutyou.com.

With the style-sheet enabled, I get no hits from the JavaScript-based "Top 5k" list, as predicted on the "Solutions" page ("...you can override the default way to show visited links. In principle, this should protect you from having your history detected with Javascript...").

Do they have a demo of the CSS technique?
posted by Western Infidels at 10:25 AM on September 3, 2009


It shows no Mefi visits and only one youporn visit. Is this thing even working?
posted by Cat Pie Hurts at 10:26 AM on September 3, 2009


Anybody know of a well-regarded firefox extension that automatically removes any flash cookies when you close the browser?

BetterPrivacy works for me.
posted by theroadahead at 10:34 AM on September 3, 2009


As some have mentioned, the bigger problem is what snooping parties do with the information once they ferret it out, not so much that they can ferret it out. Alas, nosy people are not always benign in their intentions.

That aside, I run on Firefox and have a (somewhat paranoid) habit of clearing recent history several times throughout the day and always after doing banking or making purchases. When I visited the what-the... site, it only picked up the sites I've visited since my last history sweep.
posted by SuzB at 10:36 AM on September 3, 2009


Wow. I use Google.
posted by jeremy b at 10:49 AM on September 3, 2009


A possible solution would be that the browser renders two versions of the page: a sanitized version that is not allowed to look at history and a privileged version that can look at history (and maybe other semi-sensitive information)

It wouldn't need to render completely different versions of the page, just report back the link as looking like an unclicked link when querying style attributes in JavaScript.
posted by delmoi at 11:31 AM on September 3, 2009


It showed Bing.com which I've been to once and no Metafilter pages at all. Fail. Or pass. I'm not sure. -- like_neon

It shows no Mefi visits and only one youporn visit. Is this thing even working? -- Cat Pie Hurts

Ugh, pay attention people. They can't find out every page you've ever visited, they can only test to see if you've visited a certain link. If metafilter isn't in the list, they can't see it. Also, if you've visited once, they can see it.
posted by delmoi at 11:34 AM on September 3, 2009
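
The point above, that a page can only ask "has this browser visited URL X?" for URLs it already has on a list, can be illustrated with a small Python simulation. This is a sketch, not the site's actual code: `ToyBrowser`, `computed_link_color`, `sniff`, and the colour values are made-up stand-ins for a script reading the style of a link it has placed on the page.

```python
VISITED_COLOR = "purple"
UNVISITED_COLOR = "blue"


class ToyBrowser:
    """Hypothetical stand-in for a browser with a private history."""

    def __init__(self, history):
        self._history = set(history)

    def computed_link_color(self, url):
        # Models a script asking how a link to `url` is styled.
        return VISITED_COLOR if url in self._history else UNVISITED_COLOR


def sniff(browser, candidates):
    """Return the candidate URLs that are styled as visited.

    The attacker never gets to enumerate the history; it can only
    test URLs it thought to put on its own list.
    """
    return [u for u in candidates
            if browser.computed_link_color(u) == VISITED_COLOR]


browser = ToyBrowser(["http://www.metafilter.com/", "http://www.bing.com/"])
top_sites = ["http://www.bing.com/", "http://www.google.com/"]
print(sniff(browser, top_sites))  # ['http://www.bing.com/']
```

MetaFilter is in the simulated history but is never reported because it is not on the candidate list, while a single visit to Bing is enough for it to show up.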


This is obviously broken; it shows Metafilter as my #2 website for news... oh wait.
posted by headspace at 12:06 PM on September 3, 2009


This seems very easy to fix. When JavaScript asks the browser for the color property on a link, the browser should return the color from before it did the link-coloring operation.
posted by cseibert at 12:44 PM on September 3, 2009 [1 favorite]
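
A rough Python sketch of the fix suggested above: keep drawing visited links differently on screen, but always answer script queries with the unvisited colour. Everything here (`PatchedToyBrowser`, `rendered_link_color`, `computed_link_color`, the colours) is a hypothetical model and not any browser's real API.

```python
VISITED_COLOR = "purple"
UNVISITED_COLOR = "blue"


class PatchedToyBrowser:
    """Hypothetical browser that hides visited state from scripts."""

    def __init__(self, history):
        self._history = set(history)

    def rendered_link_color(self, url):
        # What the user actually sees painted on screen.
        return VISITED_COLOR if url in self._history else UNVISITED_COLOR

    def computed_link_color(self, url):
        # What a script is told: the colour from before link-colouring.
        return UNVISITED_COLOR


def sniff(browser, candidates):
    """The same style-based probe an attacking page would run."""
    return [u for u in candidates
            if browser.computed_link_color(u) == VISITED_COLOR]


browser = PatchedToyBrowser(["http://www.bing.com/"])
print(browser.rendered_link_color("http://www.bing.com/"))  # purple
print(sniff(browser, ["http://www.bing.com/"]))  # []
```

The user still gets coloured links, and the probe comes back empty. This only covers the script-reads-style route; it does nothing about the timing attacks discussed further down.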


So, there's the really good history sniffing attacks that allow non-destructive verification of browser history.

Then there's the really problematic cache sniffing attacks, where you see how long it takes for a cross-domain image retrieve to succeed. This is destructive -- once you've probed, you lose the evidence -- but stopping this latter variant is really hard.
posted by effugas at 12:49 PM on September 3, 2009


Then there's the really problematic cache sniffing attacks, where you see how long it takes for a cross-domain image retrieve to succeed.

So this cache attack is measuring whether or not the image you want to load is in the user's local disk/mem cache vs. retrieving it across the network? That's pretty evil.
posted by jsonic at 1:19 PM on September 3, 2009
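
Here is a small Python simulation of the cache-timing idea as described in the two comments above, including why the probe is "destructive". It is a sketch with invented numbers: `SimulatedNetwork`, `probe`, the 5 ms and 200 ms latencies, and the 50 ms threshold are all assumptions chosen for illustration, not measurements.

```python
class SimulatedNetwork:
    """Toy stand-in for a browser cache sitting in front of a slow network."""

    CACHE_MS = 5      # assumed time to load an image from local cache
    NETWORK_MS = 200  # assumed time to load it across the network

    def __init__(self, cached_urls):
        self.cache = set(cached_urls)

    def fetch(self, url):
        """Return the simulated load time in milliseconds."""
        if url in self.cache:
            return self.CACHE_MS
        # A completed fetch leaves the resource in the cache.
        self.cache.add(url)
        return self.NETWORK_MS


def probe(net, url, threshold_ms=50):
    """Guess 'previously loaded' if the fetch was faster than the threshold."""
    return net.fetch(url) < threshold_ms


net = SimulatedNetwork({"http://example.com/logo.png"})
print(probe(net, "http://example.com/logo.png"))    # True
print(probe(net, "http://example.org/banner.png"))  # False
print(probe(net, "http://example.org/banner.png"))  # True
```

The third call shows the destructive part: the first probe of the banner pulled it into the cache, so every later probe reports a hit regardless of what the user actually visited.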


I thought this post was going to be about this, which I now realize has nothing to do with web browser detection, and everything to do with how many people you share a name.

If that sentence looks funny, blame the sleep deprivation. I know I do.

posted by Minus215Cee at 1:51 PM on September 3, 2009


You know what? I'm fine with what the internet knows about me. Thanks for asking!
posted by DarlingBri at 2:22 PM on September 3, 2009


Gee, wasn't it just last week we found out about the Flash LSOs? Hmmmm... just how worried about our privacy are the browser makers, when they keep quiet on this stuff for years?

Anyway, the 'linkstatus' extension (for FF 3.5+) mentioned on the article's 'solutions' page does, indeed, work, IF you check the 'Ignore :visited link style' option in the 'options' menu in the 'addons' box. See the linkstatus page.

If it's important to you, this means you won't see different colors for visited links on some pages.
posted by Twang at 3:40 PM on September 3, 2009


You know what? I'm fine with what the internet knows about me. Thanks for asking!

Good for you. For the record, I don't give a shit that random corporate sniffers will know I visit the Craigslist m4m personals, but I can think of plenty of folks who would.
posted by mediareport at 5:01 PM on September 3, 2009


well, if that bothers you, just don't sign any petitions, and they'll never know where you are.
posted by hippybear at 5:38 PM on September 3, 2009


"The internet doesn't know anything about me."

This is good. I run DamnSmallLinux Embedded to get around my boss's firewall at work. I know there are faster ways but I believe in compromise and my boss is more concerned about cookies and viruses than me wasting time.

As for my personal information, all someone needs to do is look at my mefi profile to get my name, photo, approximate address and occupation. I couldn't care less if corporations are spying on me as long as it doesn't slow my computer down, eat up my hard drive space and as long as I can turn it off if I want to.
posted by Pseudology at 5:55 PM on September 3, 2009


I note that 4chan is classified as Internet "culture"
never have scare quotes been more apropos
posted by Monsters at 8:06 PM on September 3, 2009 [2 favorites]


What the internet knows about me is apparently so disturbing it caused an error on the server.

Fear me, internet.
posted by pompomtom at 9:10 PM on September 3, 2009


odins--

There was a bunch of stuff along these lines floating around back around 2000; see "meantime" for an example of how one gets around cookie blocking. You might also be able to get around destructive read by measuring how many milliseconds it takes for something to come out of cache, and then destroying the image object if it hasn't been retrieved after that much time. (Remember, you can read image dimensions cross-domain.)
posted by effugas at 9:10 PM on September 3, 2009
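
The "destroy the image object if it's taking too long" idea above can be sketched in Python as follows. This is a toy simulation under a big assumption that the comment itself hedges on: that abandoning a request before it completes leaves nothing in the cache. `CancellableNetwork`, `nondestructive_probe`, and the latency numbers are hypothetical.

```python
class CancellableNetwork:
    """Toy cache/network model where a fetch can be abandoned early."""

    CACHE_MS = 5      # assumed cache latency
    NETWORK_MS = 200  # assumed network latency

    def __init__(self, cached_urls):
        self.cache = set(cached_urls)

    def fetch(self, url, give_up_after_ms=None):
        """Return (completed, elapsed_ms) for a simulated image load."""
        if url in self.cache:
            return True, self.CACHE_MS
        if give_up_after_ms is not None and give_up_after_ms < self.NETWORK_MS:
            # Assumption: a request abandoned before the response arrives
            # is never written to the cache.
            return False, give_up_after_ms
        self.cache.add(url)
        return True, self.NETWORK_MS


def nondestructive_probe(net, url, threshold_ms=50):
    """Report a cache hit only if the load finished within the threshold."""
    completed, _elapsed = net.fetch(url, give_up_after_ms=threshold_ms)
    return completed


net = CancellableNetwork({"http://example.com/logo.png"})
print(nondestructive_probe(net, "http://example.com/logo.png"))    # True
print(nondestructive_probe(net, "http://example.org/banner.png"))  # False
print(nondestructive_probe(net, "http://example.org/banner.png"))  # False
```

Unlike a plain timing probe, repeated checks of the uncached banner keep returning False here, because the abandoned fetch never populates the cache. Whether a real browser behaves that way is exactly the open question in the comment.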


nomisxid: doesn't work with lynx =p

from faq:

Q: I use Lynx/Links/w3m/telnet as my primary browser, beat this!

A: You are certainly secure and cool. We'll make a note that our approach breaks down for 40-year-old bearded guys living in their parents' basement.


[Zing!]

posted by koeselitz at 10:35 PM on September 3, 2009




This thread has been archived and is closed to new comments