Ah, so that's why there's free wifi on the subway
June 4, 2018 12:47 AM

"Cities worldwide face the problems and possibilities of “volume”: the stacking and moving of people and things within booming central business districts." "With the introduction of ticketless transport cards it’s now possible to gather more data about people flow through busy transport hubs as we tap on and off. Tracking commuters’ in-station journeys through their Wi-Fi enabled devices, such as smart phones, can also offer a detailed picture of movement between platforms, congestion and delays."

From the story: "Governments are also increasingly turning to consultancy firms that specialise in simulation modelling of people flow. That’s everything from check-in queues and processing at terminals, to route tracking and passenger flow on escalators."

Also, list of metro systems under construction (unsurprisingly, more than a quarter are in China).
posted by wallawallasweet (24 comments total) 14 users marked this as a favorite
 
Pre-smartphones, there were attempts to track people movement via cameras (which usually relied on an unlimited number of students doing human-driven facial recognition), but the main way was self-reporting, with its inherent biases and self-selecting sample problem - this has been a revelation in the ability to test modelling accuracy. Mind you, this has yet to fully translate into transport infrastructure design; in my experience it's more often used for sports stadiums where a very large volume of people has to be able to leave at one time in an orderly manner.
posted by I claim sanctuary at 3:12 AM on June 4, 2018 [2 favorites]


Yet another reason for me to want a nice looking leather-bound lead case to put my cell phone, laptop, credit cards, and passport in. (It wouldn't have to be leather, I do like black canvas also.) I remember Snowden put phones in the freezer compartment of the refrigerator in the hotel rooms he was in, and while that seems a good solution I do think a nice shielded case might work out better than carting around a refrigerator.

Orwell in 1984 had it all being done by force, and the motivation being fear. Huxley, in Brave New World, had it all being done by people being seduced by the conveniences of the tech toys. Both are right but I'm leaning in recent years towards Huxley as having made a better call -- I actually pay to have a device which will let govt (and Apple, and Google, and Microsoft, and Amazon, and and and and) know to within four feet where I am every minute of every day.
posted by dancestoblue at 3:48 AM on June 4, 2018 [5 favorites]


I first read this as "that's why there's free will on the subway", which surprised me.

I've spent some time with a few of our local big data enthusiasts, and not once did any of them express any concerns about misuse. Like, it's not even a concept for them. I try to come up with at least one nightmare scenario at every meeting, but don't appear to be making any progress.
posted by Mogur at 3:55 AM on June 4, 2018 [2 favorites]


People: OMG my movement is being tracked by my phone, this is an intolerable invasion of my privacy that must be stopped!

Also people: I'm sure glad that Google Maps / Waze steered me clear of that horrible traffic jam this morning.
posted by Jacqueline at 4:06 AM on June 4, 2018 [28 favorites]


Funny you should mention that...
posted by ook at 4:29 AM on June 4, 2018 [2 favorites]


I've spent some time with a few of our local big data enthusiasts, and not once did any of them express any concerns about misuse. Like, it's not even a concept for them. I try to come up with at least one nightmare scenario at every meeting, but don't appear to be making any progress.

There’s no reason related to transport administration for RFID travel cards to be tied to an identity. You can still give each card a unique identifier, you can still get your millions of data points and do your tasty analytics, which are certainly very helpful for transport planning, but none of that requires that a person’s identity be linked to the travel itself.

The only reason that travel cards are linked to identities is so that law enforcement can use public transport as an intelligence resource. In Sydney, it’s virtually impossible to take public transport anonymously now. You are heavily steered to linking the travel card to your credit card to enable automatic top ups. You can top up your card manually, but for the first 18 months no machines were available to do this in train stations so, by design, there was little choice. Now, there are machines for manual top ups and single journey tickets, but generally only one even in major high traffic stations - leading to long lines. As a result over 90% of travel cards are linked to an identified travel account. The cops must be salivating.

But I’ve never seen any proof of the utility of having this data identified. Law enforcement wants everything all the time, but they never want to demonstrate the value. And meanwhile, agencies and industry, forced to collect this data at great expense, start looking for ways to monetise it. So far, they haven't found anything better than advertising. But rest assured, when they figure something else out, they won’t hesitate.
posted by His thoughts were red thoughts at 4:43 AM on June 4, 2018 [10 favorites]


Ah, so that's why there's free wifi on the subway

Being connected to a Wi-Fi access point isn't necessary to track you; it's just what Transport for London is taking as an indicator of your consent to be tracked. From 2012: “If You Have a Smart Phone, Anyone Can Now Track Your Every Move”. That article only mentions Wi-Fi but I'm pretty sure a phone can be tracked just by its broadcasts to communications towers too.

Also, due to what seems like a completely stupid or malicious design flaw in Wi-Fi (I haven't read up on the technical details thoroughly) not only can your current location be tracked but someone with a database of Wi-Fi network SSIDs and their locations can gather a list of all the places you've ever connected to Wi-Fi if they're still in your “preferred networks” list. There's a 2014 TEDx talk by a Belgian PhD candidate researching this in which he dramatically reveals a world map showing where everyone attending the event has been (anonymized) to a chorus of gasps; a paper (PDF) he wrote with others on the problem (summary version (PDF)); and his dissertation (PDF) submitted last year has a section “What the smartphone user can do”—there's an Android smartphone app called WiFi Privacy Police he created (also on F-Droid) which tries to mitigate the amount of private information leaked this way but the measures for iOS don't seem like something non-technical people can do easily.
posted by XMLicious at 4:52 AM on June 4, 2018 [7 favorites]


Furthermore, there was an FPP back around the turn of the century about a company called Mobiltrak which had a technology that could determine what radio station a car's stereo system was tuned to, even if it was turned off; so perhaps there's some similar way to get a unique fingerprint from the reflections off of a mobile phone, and that's why Edward Snowden kept his phone in a freezer/Faraday cage. (As I understood it, he removed the battery before he did this.)
posted by XMLicious at 5:17 AM on June 4, 2018


That article only mentions Wi-Fi but I'm pretty sure a phone can be tracked just by its broadcasts to communications towers too.

At least in 2007, when I was clerking in London for a solicitor who did criminal defence, this tracking technique was in active use by the Crown Prosecution Service and the Met Police.

That was the summer of the original iPhone's release and our clients were using cheap burners with cash-prepaid cards, although I don't remember how effective that was as a countermeasure to cell tower triangulation.
posted by gauche at 6:37 AM on June 4, 2018 [1 favorite]


Also people: I'm sure glad that Google Maps / Waze steered me clear of that horrible traffic jam this morning.

Well, yeah, but you can monitor traffic mostly anonymously, right? That's what people want: monitor traffic levels generally but don't you dare track my individual daily movements.
posted by pracowity at 7:24 AM on June 4, 2018


I can see utility in "semi-identifiable" device ID-based tracking. For example, if you know a certain set of devices go from Station A to Station B, in the morning, then Station B to Station A in the evening, that gives you a good sense of how many people are taking that route to commute, and you can adjust capacity accordingly.
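A minimal sketch of that kind of count, assuming a hypothetical list of (device ID, station, hour) sightings. The field names, the noon cutoff, and the sample IDs are all made up for illustration and don't come from any real transit system.

```python
from collections import Counter

# Hypothetical sightings: (pseudonymous device ID, station, hour of day).
sightings = [
    ("dev1", "A", 8), ("dev1", "B", 9), ("dev1", "B", 17), ("dev1", "A", 18),
    ("dev2", "A", 8), ("dev2", "B", 9), ("dev2", "B", 18), ("dev2", "A", 19),
    ("dev3", "C", 8), ("dev3", "B", 9),  # no return trip seen
]

def commute_counts(sightings, noon=12):
    """Count devices per (origin, destination) pair that make the trip
    in the morning and the reverse trip in the evening."""
    by_device = {}
    for device, station, hour in sorted(sightings, key=lambda s: s[2]):
        by_device.setdefault(device, []).append((station, hour))

    counts = Counter()
    for visits in by_device.values():
        morning = [station for station, hour in visits if hour < noon]
        evening = [station for station, hour in visits if hour >= noon]
        if len(morning) >= 2 and len(evening) >= 2:
            out = (morning[0], morning[-1])
            back = (evening[0], evening[-1])
            # Only count a device whose evening trip mirrors its morning trip.
            if out[0] != out[1] and back == (out[1], out[0]):
                counts[out] += 1
    return counts

print(dict(commute_counts(sightings)))
```

The planner only needs the per-route totals here, but the grouping step still has to follow each device ID through the day.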

Of course, it would be _better_ to do this via the farecard, like on the DC Metro. Tap the card to get in, tap the card to get out, but it's a variable fare based on distance. In New York City, it's a single-fare for the subway, so you swipe in and they know MetroCard X was used at station A, but they don't know you got off at Station B. (There's enough problems with NYC's subways, though, that merely having route data isn't going to fix.)

--

As an additional data point, if a smartphone is going to be the new farecard, Apple's implementation of RFID payments uses generated card numbers as a front for whatever credit/debit card it's linked to. The merchant doesn't see the actual card number of my credit/debit card, but the generated number. (There's a bit more to it than that, but I'm fuzzy on the details.) So, if a subway RFID farecard in software uses a similar implementation, you don't get the ID based tracking.
posted by SansPoint at 7:25 AM on June 4, 2018


Howdy, your friendly transportation planner here. While it is (or was) possible to track individual cell phones in real time with a ridiculous amount of ease, on the system/network/transportation planning side, we don't care about or want that individual information.

That's what people want: monitor traffic levels generally but don't you dare track my individual daily movements.

Except that's how it works -- individuals are aggregated into anonymous* groups, and that is what is useful for planning improvements to facilities that move people, be they the physical roads, streetlight timing, or subway capacity. You can't get generalizations without tracking the individual, at least with how these big data sets are currently packaged. Without having individual information, you can't be certain about what volume of people are moving from a given Point A to a given Point B, what paths are taken, how long those commutes take, and where there are bottlenecks.

But we, the end data users, are not interested in making life better (or worse) for Joe Schmo who lives at 123 Fake Street, we care about the 100-500 people who live around Joe, when they leave for work, which directions they go, and how they join and split off from other streams of people in and around some system. We're not looking at the single points moving through the system, it's a 10,000 foot view, which makes the individual dots into a bit of a blurred image. And there are layers of access before I get to see that blurry image, from the cell phone company who does know where your cell phone is (because it has to, to provide you cell service, though that "knowing" is automated), to the data aggregators, who buy that Big Data and make it usable for other purposes, and on to me, an end user who can purchase the data for various uses only because it's aggregated.

* One caveat - you're more anonymous if you're in a crowd, because with fewer points of data, it's easier to say with some certainty "this dot is this person, or this vehicle," so the data aggregation companies won't send data if it's not in a big enough group. In other words, data from less populated areas is generally not available in the same detail or resolution as more populated areas, to keep the data anonymous. That's why I can't get much business travel data for individual business sectors in the rural southwest, because there's only one company who moves That Kind Of Thing in this county, so if we tracked the movement of That Kind Of Thing, we'd know who was moving it.
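The "won't send data if the group is too small" rule can be sketched roughly like this. The threshold of 5 and the trip records are assumptions for illustration only; real aggregators pick their own cutoffs and I don't know what they are.

```python
from collections import Counter

MIN_GROUP = 5  # assumed cutoff, not a figure from any real data vendor

def aggregate_trips(trips, min_group=MIN_GROUP):
    """Count trips per (origin, destination) and withhold small groups."""
    counts = Counter((origin, dest) for _device, origin, dest in trips)
    return {pair: n for pair, n in counts.items() if n >= min_group}

# Seven devices make the same urban trip; one device makes a rural trip.
trips = [(f"dev{i}", "Downtown", "Airport") for i in range(7)]
trips.append(("dev99", "Ranch Road", "Depot"))

print(aggregate_trips(trips))
```

The lone rural trip is dropped from the output entirely, which is the same reason the rural business-travel data is hard to get.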

Final note: this isn't my area of expertise, but I'm working on travel demand modeling right now, so I'm learning more about what we can get, and what it can tell us.
posted by filthy light thief at 7:48 AM on June 4, 2018 [11 favorites]


People: OMG my movement is being tracked by my phone, this is an intolerable invasion of my privacy that must be stopped!

Also people: I'm sure glad that Google Maps / Waze steered me clear of that horrible traffic jam this morning.


but why must I accept the former in order to receive the latter?

But we, the end data users, are not interested in making life better (or worse) for Joe Schmo who lives at 123 Fake Street, we care about the 100-500 people who live around Joe, when they leave for work, which directions they go, and how they join and split off from other streams of people in and around some system. We're not looking at the single points moving through the system, it's a 10,000 foot view,

always, forever? Or just for now, the day-to-day in springtime 2018? I mean, it's not as if BIG DATA (whether private or gov interest) has given me any confidence whatsoever to trust what it does, what it's done, what it will do with my personal info.
posted by philip-random at 8:05 AM on June 4, 2018


I remember Snowden put phones in the freezer compartment of the refrigerator in the hotel rooms he was in, and while that seems a good solution I do think a nice shielded case might work out better than carting around a refrigerator.

In Oliver Stone's movie, he and his cooperators are shown putting phones in a microwave oven (turned off of course), which is apparently a Faraday cage. Maybe they decided this was easier to frame onscreen than the freezer thing?

Putting a phone in a freezer should work, though, by stopping the battery.

First-best solution would be to put it in a microwave inside a freezer.
posted by grobstein at 8:26 AM on June 4, 2018


The only reason that travel cards are linked to identities is so that law enforcement can use public transport as an intelligence resource. In Sydney, it’s virtually impossible to take public transport anonymously now. You are heavily steered to linking the travel card to your credit card to enable automatic top ups. You can top up your card manually, but for the first 18 months no machines were available to do this in train stations so, by design, there was little choice.

You zoom past the actual reason in the middle of your conspiracy theory. Public transport agencies try to get people to link their travel cards to a credit card for automatic top ups for the exact same reason that everybody else on the planet Earth tries to get people to automatically bill their goods or services, because it lowers the perception of cost. Gyms have been selling memberships, automatically billed to credit cards, for decades not because the cops are salivating over gym membership data but because most people will just get billed. Your local bartender will let you start a tab not because they are in cahoots with law enforcement, but because you're more likely to buy another pint if you don't have to dig out your wallet.

We've known for decades that requiring people to put cash on the barrel every time is the most painful way to get them to pay. Drivers don't have to do that - some people remember how much and frequently they pay for gas, but most of the costs of driving that you pay (depreciation, insurance) are billed very indirectly. If you had to feed quarters into a slot on your dashboard to make your car go, people would drive an awful lot less, and that's basically how transit has historically been billed.

The other benefit -- which, as you note can be done from anonymized data and in fact is usually done with anonymized data -- is for transit agencies to do planning. It's remarkable how little data agencies have without fare cards. Transit agencies don't know how many riders they have. Like, they're supposed to provide a service but they don't actually know how many people are using it. The industry standard metric is "boardings", which means people who get on a vehicle. (They actually don't always know that either; they do know how many fares they sell, but most ridership is from pass users so there are some assumptions/sampling issues.)

This can cause any number of other difficulties -- for instance, one thing that some agencies have done recently is reorganize their bus routes from a maze of everywhere-to-everywhere routes into a grid of more frequent service, with more transfers. The number of boardings after such a reorganization goes up, but without travel card data it's not possible to tell whether that's because more people are taking the more frequent and direct service or if the existing riders are just forced to transfer more. Like, can you imagine any business completely changing the way they operate and afterward not being able to say whether they have more or less customers as a result?

I haven't worked with every transportation agency on the planet, but I have worked with agencies in three countries and everybody I've worked with is very serious about the privacy aspects of transportation data. They don't work on behalf of the cops; it's rare to even get the bus people cooperating with the roads people -- sometimes even the bus people with the train people. (And don't start me on the fire department.) There are also generally more stringent laws on government storage of data versus what the private sector is bound by - especially once you've clicked on the terms and conditions button. And government is much more likely to actually obey the laws, as opposed to Uber.

Like, I can't say that there are no bad actors in the world's transportation agencies, but being worried about the transit agency over Google is like being in a leaking boat in the middle of the ocean worried about whether it might rain.
posted by Homeboy Trouble at 8:32 AM on June 4, 2018 [14 favorites]


Just to make it clear, there are two different kinds of tracking under discussion here: the telecommunications companies themselves have always had the capability to track a phone's location, if only by which tower it's connecting to—simply to be able to route incoming calls to you, that's necessary—and that capability rapidly became more precise through methods like triangulation and designing the hardware and software of the handset to cooperate with being located.

But the OP article's discussion of Wi-Fi, and my speculation above about tuning in to the phone's communication with the tower, concerns just anyone you walk past or who places the necessary sensors in an area being able to track you. If you look at this video from the 2012 article I linked to above, it shows one guy following another guy around a large convention center using only an app on his phone.

The telecomm company that provides your service is an entity you have a contractual business relationship with and at least theoretically is responsible to you in some way. But the way the technology has been set up, your device allows any third party to track you.

So even if telecomm companies were restricted by regulation to limit how they share data, and even if they plug the stupidest security breaches in their own systems like the one mentioned in the ZDNet article filthy light thief linked to, the effect on the ability to track you via your phone will be negligible.
posted by XMLicious at 8:35 AM on June 4, 2018


I believe, based on a brief exchange with someone in the big city metro biz, the London project is the bleeding edge, and the article is about what may happen, not what is happening now. Also, it's going to be harder in the US for some sort of technical reason.

On the other hand, who knows what's done in a more closed situation like an airport, or a more private one like Disneyland or an Apple store.
posted by SemiSalt at 10:01 AM on June 4, 2018


SemiSalt: In an Apple Store? Probably not much. Apple doesn't do much in the way of data collection.

Disneyland? They literally sell tracking devices just for Disney parks.
posted by SansPoint at 10:10 AM on June 4, 2018


philip-random:

Also people: I'm sure glad that Google Maps / Waze steered me clear of that horrible traffic jam this morning.

but why must I accept the former in order to receive the latter?


Because in the eyes of Google/Waze, you're getting all the benefits (congestion information) with none of the costs (telling others how congested your path is).


always, forever? Or just for now, the day-to-day in springtime 2018? I mean, it's not as if BIG DATA (whether private or gov interest) has given me any confidence whatsoever to trust what it does, what it's done, what it will do with my personal info.

As an end-user of Big Data, I'd like to hold onto the aggregated data for a while, but again I'm not seeing individual points, and the data I see is fuzzy (for example, travel time data may say "for system user[s], it took 26 seconds to travel from Point A to Point B between 9:05-9:10 AM on June 4, 2018; it took system user[s] 28 seconds between 9:10-9:15, etc." -- and I may or may not know how many users there were, and if I do, it's also in groups -- 1-2, 3-5, more than 5, so you can know how much confidence to have in the travel data).
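Roughly what that fuzzy view looks like, as a sketch. The input format (start minute of the day, travel seconds) and the function names are hypothetical; only the five-minute windows and the 1-2 / 3-5 / more-than-5 count groups come from the description above.

```python
from collections import defaultdict
from statistics import mean

def count_bucket(n):
    """Report the number of users only as a coarse group."""
    if n <= 2:
        return "1-2"
    if n <= 5:
        return "3-5"
    return "more than 5"

def summarize(trips, window_minutes=5):
    """trips: (start minute of day, travel time in seconds) for one A-to-B link.
    Returns {window start minute: (mean seconds, user-count bucket)}."""
    windows = defaultdict(list)
    for start_minute, seconds in trips:
        window_start = start_minute - start_minute % window_minutes
        windows[window_start].append(seconds)
    return {
        start: (round(mean(times)), count_bucket(len(times)))
        for start, times in sorted(windows.items())
    }

# 545 is 9:05 AM and 550 is 9:10 AM, counted in minutes since midnight.
trips = [(545, 25), (546, 27), (548, 26), (551, 28)]
print(summarize(trips))
```

The end user sees one averaged travel time and a count group per window, not the individual trips that went into it.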

As a Big Data collector and aggregator, you're likely keeping a TON of disaggregated data for a long time, but it's anonymous - just a unique identifier of some sort, which can then be used to tell a bigger picture.

Back to me as an end-user, I try to re-create a bigger, still hazy picture of who you (the neighborhood) are, based on hazy census block data, to say what kind of people are traveling where, to better understand how system modifications might impact different populations, to ensure some sort of Environmental Justice ("the fair treatment and meaningful involvement of all people regardless of race, color, national origin, or income with respect to the development, implementation, and enforcement of environmental laws, regulations, and policies" -- though the focus on environmental laws isn't critical -- EJ is a helpful shorthand in the planning field).

Also, it's expensive for a Big Data end user to buy all this data, so I'd only do it every few years. But I'm guessing that most privacy concerns aren't aimed at the transportation/system planners who want to ensure investments are going to the right places with the use of location and travel information. Despite the scary sound of "knowing where you are in the system," the individual points aren't the goal, it's the picture that the points paint.

And to use the location of your wi-fi enabled device to identify you as an individual can take a good bit of work, or access to other Big Data databases that tie your device identities to you as an individual.
posted by filthy light thief at 10:22 AM on June 4, 2018 [4 favorites]


Whether your transport card is linked to your credit card / bank account is largely immaterial from a privacy perspective, if someone is actually interested in tracking you. It might affect the searchability of the database to a casual user (or the inevitable security breach) and for that reason isn't nothing, but if your concern is government-level actors it's only one piece of the puzzle, and they have lots of other pieces.

If you use the same transit card every day for a few weeks, even if the card number is initially randomly generated, it will probably be pretty easy to disaggregate the data and start identifying people based on the uniqueness of travel paths over time. Combine that with CCTV footage and you could quite easily build up (even if the CCTV footage is wide-area shots) facial or gait recognition profiles which are matched to transport cards. Or combine it with data from the parking garage (vehicle entries and exitways are inevitably recorded, generally with good cameras and ANPR systems) and you'd have a license plate number. I'd bet you anything they're either already doing this or will be doing it in China within the year (or claiming they can do it; it's hard to say what's real and what's window-dressing).

Even if you don't have access to other data, if you have the ability to perturb the system you can do interesting things to aid disaggregation. E.g. if you do something that creates a distinct blip in that user's usage pattern, above the normal noise floor, you can pick them out and find their card number or other identifier. A flat tire might do it, if they have an otherwise regular pattern and tight schedule. You'd just need to comb through the data for a user who was late on that particular day by the amount of time that you added via the perturbation. And even without introducing any additional inputs, it would only take a small amount of surveillance, with access to the backend system, to get a particular user's card number (by watching them on a single day to see when they pass through fare gates and then looking for that entry/exit match). It's basically a force multiplier: you do one day of close surveillance and you get some large amount of data, however long they have been using that card.
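To show how little it takes, here is a toy sketch of the entry/exit match. The log format, the card IDs, and the 60-second tolerance are all invented for illustration; this is not how any particular fare system stores its data.

```python
def candidate_cards(gate_log, observations, tolerance=60):
    """Return the card IDs that have a tap near every observed gate passage.

    gate_log: list of (card_id, station, timestamp in seconds)
    observations: list of (station, timestamp in seconds) from watching someone
    """
    candidates = None
    for station, seen_at in observations:
        matches = {
            card for card, tap_station, tapped_at in gate_log
            if tap_station == station and abs(tapped_at - seen_at) <= tolerance
        }
        # Keep only cards that are consistent with every observation so far.
        candidates = matches if candidates is None else candidates & matches
    return candidates or set()

# Hypothetical anonymized log: random-looking card IDs, no names attached.
gate_log = [
    ("card_17", "A", 28800), ("card_42", "A", 28830),
    ("card_42", "B", 30600), ("card_17", "C", 30610),
    ("card_08", "B", 30590),
]
# One day of surveillance: seen entering at A, then leaving at B.
observations = [("A", 28820), ("B", 30605)]

print(candidate_cards(gate_log, observations))
```

Two sightings are enough to narrow this toy log to a single card, and once you have the card ID the rest of its history comes with it.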

Personally I don't think arguing over the ethics, morality, or legality of this sort of thing is useful. It is possible, so it will be done. We live in a world where there are hydrogen bombs capable of wiping out whole cities; if we can't collectively decide that those are a bad idea, and have them in spades, I see zero hope of the widespread agreement that it would take to prevent a race-to-the-bottom for electronic surveillance. And even if it is illegal, that doesn't mean it can't or won't be done, just that it will be done more secretly. Eventually, electronic surveillance will have its Edward Teller, someone capable of pitching the idea in all its glorious horror to the right people with the right means, and we'll be off to the races.

More useful, I think, is developing ways to defeat automatic surveillance. Algorithms that are designed to detect patterns in regular data might be thrown off by introduced randomness. Having 8 transit cards, and rolling a die each morning to see which one you'll use, could keep you from getting disaggregated, if the algo was built on the assumption that each person has one or two cards at most. Many deep learning systems can be thrown off by minor introductions of unexpected data (these attacks generally refer to image recognition systems but have broad application). The different ways that people and machines "see" and "think" make it possible to do lots of things that might seem normal, but would confuse a machine. The more of this research is done, and done out in the open, the more power individual people will have against totalizing machine surveillance. The less of it that is done, the more one-sided things will be. Because there are people who are going to go down this road, and are in fact going down it like hell, on wheels greased with vast money and resources. There's no opting out at this point.
posted by Kadin2048 at 11:31 AM on June 4, 2018 [3 favorites]


...the article is about what may happen, not what is happening now. Also, it's going to be harder in the US for some sort of technical reason.

There isn't any reason I'm aware of that the basic capability of tracking phones in the manner described in the OP article wouldn't be possible or wouldn't already be widely deployed in the U.S.; perhaps your acquaintance was referring to some more sophisticated feature of the London system.

The company Navizon mentioned in the 2012 article I link to above was already selling their systems at that point. Both that article and their current web site link to this public demo which is supposed to display anonymized data about phones passing near to their headquarters in Miami, but is currently giving me some sort of permissions error. Also, the live webcam feed visible in the 2012 screenshot of the page appears to be absent.

Any organization with a widely-distributed set of Wi-Fi hotspots or similar devices could be doing this on a large scale. A few years ago a roommate asked me to set up a free combination cable modem and Wi-Fi router sent to us by the cable company; I noped right out of that and dropped it off at one of their service centers, but of course many of our neighbors also received them. Unsurprisingly it turned out that the "free" routers (which I think they charge you a monthly rental fee for as well) allow anyone within range who has an account with the cable company to get internet access via your cable, but there's no reason they can't be doing much more than that.

In addition to marketing for the "Indoor Triangulation System" that tracks people as they walk around inside a building and information on other enterprise products Navizon's current website says,
Navizon's global database of Wi-Fi access point and cell id locations was built through an innovative crowdsourcing program by a community of more than one million registered users from all over world. Thanks to its innovative technology and incentive programs, Navizon's user community has grown virally from 50,000 to more than a million users in approximately one year. In addition to covering all the wireless technologies (Wi-Fi, GSM, CDMA and 3G towers), Navizon's database is truly dynamic, being updated with a flow of over 500,000 data points every day.
Even in areas where there aren't any hotspots, any app with access to phones' transceivers and their own location data can collect the same information about other nearby phones.

The Navizon app's users are paid to collect and upload data on stationary Wi-Fi access points, nominally for use with their hybrid positioning system services that are used to increase the accuracy of conventional GPS. It's also just one company that publicly does this stuff: the Wikipedia article on indoor positioning systems lists many others, some which provide similar apps, and links to academic papers developing and refining these techniques during the past couple of decades.
posted by XMLicious at 12:08 PM on June 4, 2018


"There isn't any reason I'm aware of that the basic capability of tracking phones in the manner described in the OP article wouldn't be possible or wouldn't already be widely deployed in the U.S.; perhaps your acquaintance was referring to some more sophisticated feature of the London system."


The two transit systems with which my source has some knowledge are still working out how to use the fare card info to best advantage.
posted by SemiSalt at 3:18 PM on June 4, 2018 [1 favorite]


Spelunking through the OP links a bit I found a Gizmodo UK article from a bit more than a year ago describing a group of documents obtained via a series of Freedom of Information Requests from the National Gallery and Natural History Museums in London and the National Railway Museum in York. Using the Wi-Fi tracking technique they recorded and analyzed their visitors' movements and how long they paused to look at each exhibit.
posted by XMLicious at 9:35 PM on June 4, 2018


Forget subway wi-fi. Over the last few years, NYC has installed free wifi terminals all over the city that most certainly track you forever, even if you sign into them once, and never use them again.
posted by rokusan at 4:13 PM on June 6, 2018



