Why Data Mining for Terrorists Will Never Work
Through initiatives such as "Secure Flight," the U.S. government assures us that data mining makes us safe from terrorists.
Under this initiative, you won't be able to obtain a boarding pass for a flight to, from, or within the United States unless you receive permission to travel from the Transportation Security Administration.
But the government is lying to us. Data mining will never be effective at identifying terrorists.
Here's why:
Data mining determines how an individual fits into a group and predicts that person's behavior from the characteristics of that group.
For instance, under Secure Flight, the TSA will analyze your credit records, your travel history, your bank records, your credit card records, your telephone records, your Web surfing records, and many other types of records to determine if you pose a terrorist threat.
If you "pass" the TSA analysis, you'll receive a boarding pass. If you don't, you won't be able to travel by air, even within the United States.
There's only one problem, other than giving the government carte blanche over our personal data, with zero accountability for its misuse. Data mining for terrorists doesn't work. And it never will.
Terrorists don't fit an easily identifiable profile. While most terrorists are male and under 40, nearly two billion people worldwide fit this profile. There is also an exceedingly small number of actual terrorists, and they deliberately obscure their trail to avoid detection.
These factors make data mining to identify terrorists an expensive waste of time. One analysis by security expert Bruce Schneier estimated that even with 99.9% accuracy, data mining for terrorists would generate one billion false alarms for every real terrorist plot it uncovers.
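The arithmetic behind an estimate of this kind is easy to check. The sketch below is only an illustration: the number of records screened, the number of real plots, the false-positive rate, and the detection rate are all assumed inputs, not figures taken from this article, and the function name is hypothetical. It shows how a tiny error rate applied to an enormous pool of innocent records swamps a handful of real plots.

```python
# Illustrative base-rate arithmetic. Every input below is an assumption.
EVENTS_SCREENED = 10**12      # records examined per year (assumed)
REAL_PLOTS = 10               # genuine plots hidden among them (assumed)
FALSE_POSITIVE_RATE = 0.01    # 1 in 100 innocent records wrongly flagged (assumed)
DETECTION_RATE = 0.999        # 99.9% of real plots correctly flagged (assumed)


def screening_outcome(events, real_plots, false_positive_rate, detection_rate):
    """Return (false_alarms, plots_caught, false_alarms_per_plot_caught)."""
    innocent_events = events - real_plots
    false_alarms = innocent_events * false_positive_rate
    plots_caught = real_plots * detection_rate
    return false_alarms, plots_caught, false_alarms / plots_caught


false_alarms, plots_caught, ratio = screening_outcome(
    EVENTS_SCREENED, REAL_PLOTS, FALSE_POSITIVE_RATE, DETECTION_RATE
)

# Share of all flagged records that point to a real plot.
share_real = plots_caught / (plots_caught + false_alarms)

print(f"False alarms per year:         {false_alarms:,.0f}")
print(f"Real plots flagged:            {plots_caught:.2f}")
print(f"False alarms per plot flagged: {ratio:,.0f}")
print(f"Share of flags that are real:  {share_real:.2e}")
```

Under these assumptions, investigators would face roughly a billion false alarms for each plot the system flags. Changing the inputs changes the totals, but as long as real plots are this rare, the false alarms dominate.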
For some applications, though, data mining does work. It works best when there's a well-defined profile of whatever you're searching for, a substantial number of "events," and minimal consequences for "false positives."
An example of an effective application for data mining is credit card fraud. All credit card companies now data mine their transaction databases, looking for patterns of spending that might indicate a stolen card.
Since a credit card thief generally purchases a large number of expensive items shortly after the theft, it's possible to identify fraud with a high degree of accuracy. The consequence of a false positive—mistakenly identifying a credit card as stolen—is that the legitimate owner temporarily can't use it. But this is a problem only until the rightful owner contacts the credit card issuer to inform them of the mistake.
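To make the idea of a "well-defined profile" concrete, here is a minimal sketch of the kind of rule described above: flag a card when several expensive purchases land within a short span of time. The thresholds, the function name, and the sample data are hypothetical, chosen only for illustration, and real card issuers certainly use far more sophisticated models.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, chosen for illustration only.
WINDOW = timedelta(hours=6)     # how tightly clustered the purchases must be
LARGE_PURCHASE = 500.00         # what counts as an "expensive" item
MIN_LARGE_PURCHASES = 3         # how many expensive items trigger a flag


def looks_stolen(purchases):
    """Return True if at least MIN_LARGE_PURCHASES large purchases
    fall within any single WINDOW.

    purchases: list of (timestamp, amount) tuples for one card.
    """
    large = sorted(t for t, amount in purchases if amount >= LARGE_PURCHASE)
    start = 0
    for end in range(len(large)):
        # Shrink the window from the left until it spans at most WINDOW.
        while large[end] - large[start] > WINDOW:
            start += 1
        if end - start + 1 >= MIN_LARGE_PURCHASES:
            return True
    return False


t0 = datetime(2007, 1, 1, 12, 0)

# Four expensive purchases within one hour: the spree pattern.
spree = [(t0 + timedelta(minutes=20 * i), 800.00) for i in range(4)]

# Four expensive purchases spread over four days, plus a small one.
normal = [(t0 + timedelta(days=i), 800.00) for i in range(4)] + [(t0, 12.50)]

print(looks_stolen(spree))
print(looks_stolen(normal))
```

A rule this crude will sometimes flag a legitimate shopping trip, but as noted above, the cost of that mistake is a phone call to the card issuer.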
The federal government surely knows these facts. Yet it persists in claiming that data mining will somehow help identify terrorists. Why?
It turns out that data mining is far better suited to finding other types of people who are not as rare as terrorists. For instance, lots of people don't approve of the way the government is fighting the so-called War on Terrorism. Some of these people may subscribe to publications that criticize the War on Terrorism, make phone calls to other people who don't like it, and so on. Since all of these records are "mined" by various federal agencies, it would be easy for the government to use this information to identify opponents of this war.
In other words, while data mining is almost useless for identifying terrorists, it's an effective way for the government to engage in political intelligence gathering. And that's how I think it's being used.
Copyright © 2007 by Mark Nestmann



