A recent paper by the Technology Policy Institute takes a pro-business look at the Big Data phenomenon, finding “no evidence” that Big Data is creating any sort of privacy harms. As I hope to lay out, I didn’t agree with several of the report’s findings, but I found the paper especially interesting as it critiques my essay from September’s “Big Data and Privacy” conference. According to TPI, my “inflammatory” suggestion that ubiquitous data collection may harm the poor was presented “without evidence.” Let me first say that I’m deeply honored to have my writing critiqued; for better or worse, I am happy to have my thoughts somehow contribute to a policy conversation. That said, while some free market voices applauded the report as a thoughtful first step at doing a Big Data cost-benefit analysis, I found the report to be one-sided to its detriment.
As ever in the world of technology and law, definitions matter, and neither I nor TPI can adequately define what “Big Data” even is. Instead, TPI suggests that the Big Data phenomenon describes the fact that data is “now available in real time, at larger scale, with less structure, and on different types of variables than previously.” If I wanted to be inflammatory, I would suggest this means that personal data is being collected and iterated upon pervasively and continuously. The paper then does a good job of exploring some of the unexpected benefits of this situation. It points to the commonly lauded Google Flu Trends as the poster child for Big Data’s benefits, but neglects to mention the infamous example where Target was able to uncover that a teenage customer was pregnant before her family did.
At that point, the paper looks at several common privacy concerns surrounding Big Data and attempts to debunk them.
First, TPI rejects FTC Chairwoman Ramirez’s claim that Big Data increases the risks associated with data breaches and identity fraud. The paper narrowly missed the start of Target’s massive consumer data breach, and it is easy to merely point to this cybersecurity disaster and move on. However, TPI’s argument is a bit more nuanced, suggesting that there’s no indication that the incidence of data breaches and identity fraud has increased with the rise of Big Data. TPI concedes that the number of annual data breaches is trending upward, but argues that “data breaches are purely an online phenomenon, so it is appropriate to deflate them by a measure of online activity.” In other words, TPI looks at the explosion in online commerce compared to the increases in data breaches, and finds that the value in eCommerce dollars outweighs the financial costs of data breaches. That’s cold comfort to the 110 million people who just had their data compromised, but TPI’s argument also mischaracterizes Chairwoman Ramirez’s concerns. TPI wants to compare data breaches to economic output rather than their potential to harm individual consumers.
Second, TPI discards “Data Determinism” and instead focuses on the benefits of algorithms. It is true, as TPI points out, that test scores, credentials, and other “small data” are used to make decisions every day. TPI makes the logical assertion that “if more data points are used in making decisions, then it is less likely that any single data point will be determinative, and more likely that a correct decision will be reached.” Perhaps. The problem is that TPI provides few protections to individuals. Like many in industry, TPI argues that traditional Fair Information Practice Principles do not work in a Big Data world, and it moves away from (1) notice and choice, (2) use specification and limits, and (3) data minimization. Additionally, one of the few FIPPs that would compensate for this shift, a transparency principle, is also discarded. “Greater transparency is a questionable remedy,” TPI argues, suggesting that “monitoring is at best a two-edged sword.”
This assertion is beyond radical. Individuals must have some basic information about how entities are using their information to make decisions about them, unless we want to face a Kafkaesque machinery that manipulates based on opaque justifications. And that point comes from the Future of Privacy Forum, which is an industry-sponsored think tank.
Third, TPI argues that Big Data does not “discriminate against the poor.” This is where TPI critiques my thinking and, I would argue, gets my argument wrong. TPI wants to have a discussion about price discrimination, which I also think is problematic, but ignores my larger point that Big Data simply presents individuals with different privacy decisions depending upon socioeconomic class. (And that doesn’t even begin to address the greater democracy issues that are presented.)
I am not an economist and I am not here to make a unilateral claim that price discrimination is a bad thing, but TPI makes a number of key assumptions to conclude that Big Data-fueled price discrimination will be a win for everyone:
Price discrimination involves charging prices based on a consumer’s willingness to pay, which in general is positively related to a consumer’s ability to pay. This implies that a price discriminating firm will usually charge lower prices to lower-income consumers. Indeed, in the absence of price discrimination, some lower-income consumers would be unable or unwilling to purchase some products at all. So, contrary to arguments above, the use of big data, to the extent it facilitates price discrimination, should usually work to the advantage of lower-income consumers.
The big problem I see here is the assumption that all consumers are alike. I’ll admit to a sense of paternalism here, but in the real world, market efficiencies accrue to more sophisticated consumers. In the retail world, price discrimination will tend to benefit the higher classes and the educated. They have the knowledge and capacity to game the system. As often as TPI suggests that Big Data’s critics present arguments with no evidence, this paragraph relies on a number of inferences and implications to conclude that price discrimination should work out for the poor, without addressing the point that organizations can use one class’s behavior to subsidize another’s.
TPI accuses privacy advocates and folks like myself of “not trust[ing] the consumers for whom they purport to advocate.” Perhaps privacy advocates should admit there’s some truth to this, but Big Data advocates similarly don’t trust consumers. At a recent discussion about Big Data in education, Joel Reidenberg even conceded that when it comes to student data, parents “react based on misinformation.” Part of this is due to information asymmetries that Big Data produces. TPI is quick to suggest that individuals know more about themselves than companies do, but anyone who watched the Senate Commerce Committee’s data broker hearing last month would suggest that dynamic is changing. Further, even as TPI insinuates that information asymmetries that benefit consumers “potentially lead to a market breakdown,” the report doesn’t care if information asymmetries benefit business so long as “consumers get what they want.”