September 2014 was unusually cold for Boston. The chill from the Charles drifted in through an open window and slid across the floor of the one-bedroom apartment. Special Agent Ross took in the mess of books, records, and DVDs scattered around the room, framing the dead body of Janet Somers.
Back at the office, he started filling in the forms of the homicide report, detailing the execution-style killing (one shot to the back of the head) and signs of struggle. He’d barely had time to fetch a fresh cup of coffee when the PDA on his hip started buzzing. “Ross,” he answered curtly, annoyed by yet another interruption in what was already looking like a long day.
“This is Mike Lynch. I work for the U.S. Marshals,” explained the caller. “You filed the homicide report on Janet Somers, right?”
“Sure. You knew her?” asked Ross.
“Detective Ross, I work in Witness Protection. It’s not that I knew her, so much as I tried to make the world forget her. Based on what happened today, it sounds like I failed,” said Lynch.
“Yeah, it looked like a professional job: one shot to the back of the head at close range. What’s up?” asked Ross.
“I need you to tell me what music Janet listened to,” Lynch replied.
Ross scowled at the thought of spending hours in Janet’s frosty apartment. “Oh, come on. She’s dead. What difference does that make?”
“Listen, we’ve had six other killings this week. Same M.O. I need your help here,” replied Lynch.
“And I thought I was having a bad week,” muttered Ross under his breath. “All these victims were under your protection?”
“Nope, none of them were,” said Lynch. “But the other six victims were all women in their mid-thirties, like Janet. They were all killed, execution-style, late at night, just as Janet was. And most importantly, all six of them liked the same music.”
Ross took a sip of coffee and furrowed his brow. “I don’t get it,” he said. “What does their music have to do with it? You don’t shoot people for bad taste.”
“Listen, Ross,” growled Lynch, clearly annoyed. “If Janet liked the same music as the other six, then that means the killer’s using wishlists, iTunes, Last.fm and other online services to find people we’ve spent years working to hide.” The Marshal paused, letting it sink in. “We can change their faces, their cities, their jobs — but we can’t stop them listening to shitty music. And that might just get them killed.”
The US Witness Protection Program has hidden nearly 20,000 people since it was launched in the 1970s. So far, nobody in its custody has been harmed, despite Hollywood’s love of this plot device. Witnesses change their names, their appearance, and even their jobs — anything to hide their past. But can we hide who we really are?
Hiding identity gets much harder in a connected, declarative world. When you tell the world about yourself, you give society an unmistakable set of fingerprints. It’s easy for others to use those fingerprints, whether you’re just hoping for song recommendations or trying to track down a fugitive.
The plot device described above isn’t that implausible. If someone can access your tastes and preferences—say, dear reader, at an estate sale for Janet’s faked death—they can likely find people with similar tastes to you. Which probably includes you, in hiding. Your only alternative is to avoid the grid entirely, and in a socially connected world, that looks suspicious, too.
Consider the following examples:
- America Online released a million search entries that had been anonymized so that researchers could analyze them. Reporters soon found that, using only a few searches, they could identify one of the people whose searches were part of the log.
- A 2008 University of Texas study showed that data from the Netflix Challenge could be used to find people based on their movie preferences.
- A University of Texas study showed that 30 percent of Twitter accounts could be linked to Flickr accounts held by the same user, even though personal information was stripped from both.
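The linkage these examples describe can be illustrated with a toy sketch. It is not the method used in any of the studies above; it simply scores how much a known set of tastes overlaps with each pseudonymous profile, using Jaccard similarity, and picks the closest one. The artist names, profile IDs, and function names are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of tastes: 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # two empty profiles tell us nothing
    return len(a & b) / len(a | b)


def best_match(known_tastes, candidates):
    """Return (profile_id, score) for the candidate most similar to known_tastes."""
    scored = [(pid, jaccard(known_tastes, tastes))
              for pid, tastes in candidates.items()]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical data: artists a person was known to like under their old identity.
before = {"Artist A", "Artist B", "Artist C", "Artist D"}

# Hypothetical pseudonymous profiles on some music service.
profiles = {
    "user_1841": {"Artist A", "Artist B", "Artist C", "Artist E"},
    "user_0077": {"Artist F", "Artist G"},
    "user_5520": {"Artist B", "Artist H", "Artist I"},
}

match_id, score = best_match(before, profiles)
print(match_id, round(score, 2))
```

Even with no name, address, or photo in the data, the profile sharing three of four artists stands out. Real attacks work with far larger and noisier data, but the principle is the same: enough preferences behave like a fingerprint.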
There’s nowhere we are more honest than in our search box, and that honesty makes us easy to find, particularly as search services add history and personalized results.
Ultimately, our online fingerprint is a unique ID. Our surfing patterns, writing styles, and purchasing behavior are hard for others to fake, and hard for us to cover up. This could be the basis for fraud detection, or for new approaches to security clearance, or for tracking down fugitives (something we touched on in a panel at Gov 2.0 in May).
Legislation today covers the leakage of Personally Identifiable Information (PII)—unique identifiers such as social security and passport numbers. But with the right analytical tools, enough of any information is personally identifiable. This is a far more subtle, and harder to legislate, form of privacy violation, particularly since we share such information willingly every time we do something.
One easy target for legislators is the web cookie, a unique string of numbers used by websites to identify requests from the same user. Germany has been particularly vigorous in regulating this sort of thing, effectively saying it’s illegal to use Google Analytics in Germany. The US Congress is looking into the matter. Unfortunately, legislators choose easy targets—like the web cookie—rather than addressing the bigger issue of a digitally promiscuous population whose life is lived in public. And using predictive analytics of the kind being developed for criminal investigators takes it one step further.
We’ve already seen proof that anonymized online data can still be traced back to individuals. We can’t put the Sharing Genie back in the bottle, and the legal system will take decades to catch up.
In the meantime, hack mystery writers everywhere have exciting new plot devices to play with.