The base rate fallacy is the tendency to ignore base rates in the presence of specific, individuating information. Rather than integrating general statistics with information about an individual case, the mind tends to ignore the former and focus on the latter. This phenomenon is widespread – it afflicts even trained statisticians, notes American-Israeli psychologist Daniel Kahneman in his book Thinking, Fast and Slow. He illustrates this common reasoning flaw early in the book with the following example:
An individual has been described by a neighbor as follows: ‘Steve is very shy and withdrawn, invariably helpful but with little interest in people or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail.’ Is Steve more likely to be a librarian or a farmer?
If your intuitive answer to this question was that Steve is more likely a librarian, you are like most people. As Kahneman writes: “the resemblance of Steve’s personality to that of a stereotypical librarian strikes everyone almost immediately”; however, the fact that, in the United States, there are “more than 20 male farmers for each male librarian” is almost always neglected in our calculations (p. 7).
One type of base rate fallacy is the false positive paradox, in which false positive test results outnumber true positives. This occurs when the condition is rare in the overall population and its incidence rate is lower than the test’s false positive rate. This might be counterintuitive, but consider the following common example:
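The condition described above can be sketched in a few lines of Python. The function name and the illustrative rates here are my own, chosen only to show when the paradox kicks in:

```python
def paradox(prevalence, sensitivity, false_positive_rate):
    """Return True when expected false positives outnumber true positives."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return false_pos > true_pos

# A rare condition (0.01% prevalence) tested with a 1% false positive rate:
print(paradox(0.0001, 0.99, 0.01))  # True: most positive results are false
# The same test applied to a common condition (30% prevalence):
print(paradox(0.30, 0.99, 0.01))    # False: true positives dominate
```

The comparison makes the rule of thumb concrete: once the condition’s prevalence drops below the false positive rate, a positive result says surprisingly little.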
A city with a population of 1,000,000 people implements surveillance measures to catch terrorists. This particular surveillance system has a failure rate of 1%, meaning that (1) when the system scans a terrorist, it will register a hit 99% of the time and fail to do so 1% of the time, and (2) when it scans a non-terrorist, it will not flag them 99% of the time but will register the person as a hit 1% of the time. What is the probability that a person flagged by this system is a terrorist?
Someone committing the base rate fallacy would say that there is a 99% chance that the flagged person is a terrorist. At first, given the system’s failure rate of 1%, this prediction seems to make sense; however, it is an example of incorrect intuitive reasoning because it fails to take into account the base rate of terrorists in the population. Deconstructing the false positive paradox shows that the true chance of this person being a terrorist is closer to 1% than to 99%.
This image (by Ali Al-Awadi) illustrates the false positive paradox at work: Out of one million inhabitants, there are 999,900 law-abiding citizens and 100 terrorists. The surveillance system correctly flags 99 of the 100 terrorists (true positives), but it also flags 9,999 innocent citizens (false positives) – a number that would overwhelm even the best system. In all, 10,098 people – 9,999 non-terrorists and 99 actual terrorists – will trigger the system. This means that, due to the high number of false positives, the probability that a person flagged by the system is actually a terrorist is not 99% but rather below 1% (99 out of 10,098, or about 0.98%). Searching large data sets for a handful of suspects means that only a small fraction of hits will ever be genuine. This is a persistent mathematical problem that improved accuracy alone cannot eliminate.
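The arithmetic behind these figures can be reproduced in a short Python sketch, using the same population and error rates as the example:

```python
population = 1_000_000
terrorists = 100
non_terrorists = population - terrorists   # 999,900 law-abiding citizens

sensitivity = 0.99          # share of terrorists the system flags
false_positive_rate = 0.01  # share of innocents the system wrongly flags

true_positives = terrorists * sensitivity               # 99
false_positives = non_terrorists * false_positive_rate  # 9,999
total_flagged = true_positives + false_positives        # 10,098

p_terrorist_given_flag = true_positives / total_flagged
print(f"P(terrorist | flagged) = {p_terrorist_given_flag:.2%}")  # 0.98%
```

This is just Bayes’ theorem written out as counts: the 99 genuine hits are swamped by the 9,999 false alarms produced by scanning 999,900 innocent people.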
This hypothetical, ineffective surveillance system can be compared to the real systems that govern the lives of Europeans today. One such example is the mass retention of passenger data in the EU. In the United States, the retention of passenger name records (PNR) was introduced in the aftermath of the 2001 terrorist attacks on New York City. Despite resistance in Europe from civil society and liberal politicians, the PNR Directive was adopted by EU member states in 2016 after five years of negotiations. The collection of these records is aimed at the prevention, detection, investigation and prosecution of terrorist offences and serious crime.
In Austria, the Austrian Passenger Information Unit (PIU) has processed PNR since March 2019. This week, the Passenger Data central office (Fluggastdatenzentralstelle) issued a response (in German) to inquiries into PNR implementation in Austria. According to the document, from February 2019 to the 14th of May, 7,633,867 records had been transmitted to the centre. On average, about 490 hits are reported per day, and about 3,430 hits per week require further verification. Of the 7,633,867 reported records, there were 51 confirmed matches, and in 30 cases staff at the airport concerned intervened.
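Taking the reported figures at face value, a rough calculation shows how small the share of confirmed matches is relative to the hits that staff must verify. The 15-week reporting window is my assumption (roughly February 2019 to mid-May 2019); the response’s exact period may differ:

```python
# Figures from the Fluggastdatenzentralstelle response; the 15-week
# window is an assumption, not stated explicitly in the document.
weeks = 15
hits_needing_verification_per_week = 3_430
confirmed_matches = 51

total_verified_hits = weeks * hits_needing_verification_per_week  # ~51,450
precision = confirmed_matches / total_verified_hits
print(f"Roughly {precision:.2%} of verified hits were confirmed matches")
```

Under these assumptions, about 99.9% of the hits that required human verification were false alarms – exactly the pattern the false positive paradox predicts for searches targeting a tiny minority.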
What this small show of success does not capture, however, is the damage inflicted on the thousands of innocent passengers who are wrongly flagged by the system and who can be subjected to damaging police investigations or denied entry into destination countries without proper cause. As in the hypothetical example above, mass surveillance that seeks a small, select population is invasive, inefficient, and counter to fundamental rights. It subjects the majority of people to extreme security measures that are not only ineffective at catching terrorists and criminals, but that undermine privacy rights and can cause immense personal damage.
Security and privacy are not incompatible – rather, there is a necessary balance that each society must determine. The PNR system, by relying on faulty mathematical assumptions, ensures that neither security nor privacy is protected.
Read more about PNR here and here, and stay updated with information about our work with the German group Society for Civil Rights – Gesellschaft für Freiheitsrechte e.V. (GFF) – to challenge this data retention here.