Been spending more and more time at work trying to get a handle on the politics (and definition) of de-identification. De-identification, in short, refers to processes designed to make it more difficult to connect information with one’s identity. While industry and academics will argue over what exactly that means, my takeaway is that de-identification battles have become proxies for a profound lack of trust and transparency on both sides. I tried to flesh out this idea a bit, and in the process, made the mistake of wading into the world of statistics. // Read more on the Future of Privacy Forum Blog.
Information is power, as the saying goes, and big data promises the power to make better decisions across industry, government, and everyday life. Data analytics offers an assortment of new tools to harness data in exciting ways, but society has been slow to engage in a meaningful analysis of the social value of all this data. The result has been something of a policy paralysis when it comes to building consensus around certain uses of information.
Advocates noted this dilemma several years ago during the early stages of the effort to develop a Do Not Track (DNT) protocol at the World Wide Web Consortium. DNT was first proposed seven years ago as a technical mechanism to give users control over whether they were being tracked online, but the protocol remains a work in progress. The real issue lurking behind the DNT fracas was not any sort of technical challenge, however, but rather the fact that the ultimate value of online behavioral advertising remains an open question. Industry touts the economic and practical benefits of an ad-supported Internet, while privacy advocates maintain that targeted advertising is somehow unfair. Without any efforts to bridge that gap, consensus has been difficult to reach.
As we are now witnessing in conversations ranging from student data to consumer financial protection, the DNT debate was but a microcosm of larger questions surrounding the ethics of data use. Many of these challenges are not new, but the advent of big data has made the need for consensus ever more pressing.
For example, differential pricing schemes – or price discrimination – have increasingly become a hot-button issue. But charging one consumer a different price than another for the same good is not a new concept; in fact, it happens every day. The Wall Street Journal recently explored how airlines are the “world’s best price discriminators,” noting that what an airline passenger pays is tied to the type of people they’re flying with. As a result, it currently costs more for U.S. travelers to fly to Europe than vice versa because the U.S. has a stronger economy and quite literally can afford higher prices. Businesses are in business, after all, to make money, and at some level, differential pricing makes economic sense.
However, there remains a basic concern about the unfairness of these practices. This has been amplified by perceived changes in the nature of how price discrimination works. The recent White House “Big Data Report” recognized that while there are perfectly legitimate reasons to offer different prices for the same products, the capacity for big data “to segment the population and to stratify consumer experiences so seamlessly as to be almost undetectable demands greater review.” Customers have long been sorted into different categories and groupings. Think urban or rural, young or old. But big data has made it markedly easier to identify the characteristics that can be used to charge each individual customer based on his or her exact willingness to pay.
The Federal Trade Commission has taken notice of this shift and begun a much-needed conversation about the ultimate value of these practices. At a recent discussion on consumer scoring, Rachel Thomas from the Direct Marketing Association suggested that companies have always tried to predict customer wants and desires. What’s truly new about data analytics, she argued, is that it offers the tools to actually get predictions right and to provide “an offer that is of interest to you, as opposed to the person next to you.” While some would argue this is a good example of market efficiency, others worry that data analytics can be used to exploit or manipulate certain classes of consumers. Without a good deal more public education and transparency on the part of decision-makers, we face a future where algorithms will drive not just predictions but decisions that will exacerbate socio-economic disparities.
The challenge moving forward is two-fold. Many of the more abstract harms allegedly produced by big data are fuzzy at best – filter bubbles, price discrimination, and amorphous threats to democracy are hardly traditional privacy harms. Moreover, few entities are engaging in the sort of rigorous analysis necessary to determine whether or not a given data use will make these things come to pass.
According to the White House, technological developments necessitate a shift in privacy thinking and practice toward responsible uses of data rather than its mere collection and analysis. While privacy advocates have expressed skepticism of use-based approaches to privacy, increased transparency and accountability mechanisms have been proposed as a way to further augment privacy protections. Developing broad-based consensus around data use may be more important still.
Consensus does not mean unanimity, but it does require a conversation that considers the interests of all stakeholders. One proposal that could help drive consensus is the development of internal review boards or other multi-stakeholder oversight mechanisms. Looking to the long-standing work of institutional review boards, or IRBs, in the field of human subject testing, Ryan Calo suggested that a similar structure could be used as a tool to infuse ethical considerations into consumer data analytics. IRBs, of course, engage in a holistic analysis of the risks and benefits that could result from any human testing project. They are also made up of different stakeholders, encompassing a wide variety of backgrounds and professional expertise. And these boards come to a decision before a project can be pursued.
Increasingly, technology is leaving policy behind. While that can both promote innovation and ultimately benefit society, it makes the need for consensus about the ethics at stake all the more important.
With the White House’s Big Data and Privacy Review anticipated any day now, I figured it was long past time to put together a quick #bigdataprivacy bingo card. If you go to enough privacy (or big data) events and workshops, you’ll quickly realize how many of the same buzzwords and anecdotes get cited over and over . . . and over again. In the battle between privacy and innovation, bingo may be the only thing that wins.
Happy to report that Truthout today published my quick op-ed entitled “Big Data’s Big Image Problem.” Not only does this piece expand on comments the Future of Privacy Forum submitted as part of the White House’s Big Data Review, but it also riffs on my favorite part of the latest Marvel movie, Captain America: The Winter Soldier. As a privacy wonk, I took great pleasure in discovering that ::minor spoilers:: Captain America’s chief villain was actually “The Algorithm.” When Captain America doesn’t like you, you know you’ve got an image problem, and frankly, big data has an image problem.
Speaking for everyone snowed-in in DC, White House Counselor John Podesta remarked that “big snow trumped big data” while on the phone to open the first of the Obama Administration’s three big data and privacy workshops. This first workshop, which I was eager to attend (if only to continue my streak of annual appearances in Beantown), focused on advancing the “state of the art” in technology and practice. For a mere lawyer such as myself, I anticipated a lot of highly technical jargon, and in that regard I was not disappointed. // Full recap on the Future of Privacy Forum Blog.
The biggest takeaway from Common Sense Media’s School Privacy Zone Summit was, in the words of U.S. Secretary of Education Arne Duncan, that “privacy needs to be a higher priority” in our schools. According to Duncan, “privacy rules may be the seatbelts of this generation,” but getting these rules right in sensitive school environments will prove challenging. As the Family Educational Rights and Privacy Act (FERPA), one of the nation’s oldest privacy laws, turns forty this year, it seems apparent that our schools lack both the resources and training necessary to even understand today’s digital privacy challenges surrounding student data.
Dr. Terry Grier, Superintendent of the Houston Independent School District, explained that his district of 225,000 students is getting training from a 5,000-student district in North Carolina. The myriad of school districts, varying sharply in wealth and size, has made it nearly impossible for educators to define rules and expectations when it comes to how student data can be collected and used.
Moreover, while privacy advocates charge that schools have effectively relinquished control over their students’ information, several panelists noted that we haven’t yet decided who the ultimate custodian of student data even is. One initial impulse might be to analogize education records to HIPAA health records, which belong to a patient, but Cameron Evans, CTO of education at Microsoft, suggested that it might be counterproductive to think of personalized education data as strictly comparable to individual health records. On top of this dilemma, questions about how to communicate and inform parents have proven difficult to answer as educational technology shifts rapidly, resulting in a landscape that one state educational technology director described as the “wild wild west.”
There was wide recognition by both industry participants at the summit and policymakers that educational technology vendors need to establish best practices – and soon. Secretary Duncan noted there was a lot of energy to address these issues, and that it was “in the best interest of commercial players to be self-policing.” The implication was clear: begin establishing guidelines and helping schools now or face government regulation soon.
My synopsis of Laura Donohue’s The Cost of Counterterrorism: Power, Politics, and Liberty is now up on the JustSecurity blog. A couple of quick thoughts on the book:
First, it was impossible not to read the various Snowden revelations into the book. It reads very much like a prelude to all of the different programs and oversight problems we have learned about over the past year, which suggests that Snowden’s leaks really just confirmed what security critics were already surmising. Further, considering the book was released right at the start of the smartphone explosion and the rise of “Big Data,” it’s fascinating to see how Professor Donohue talked about the capabilities of these technologies.
Second, my major criticism of the book is that it reads like a bunch of law review articles duct-taped together. This may speak volumes about how legal scholarship is produced, or how many non-fiction books are collections that build upon a certain idea or original essay. Regardless, it was impossible not to notice how jarring portions of the book were. Professor Donohue’s overall framework is to compare the national security regimes of the United States and the United Kingdom, and this leads to chapters that bounce from the Irish Troubles to American military policy in Iraq. The comparison doesn’t always hold, and in some spots feels unwarranted.
Yesterday evening, I found myself at the Mansion on O Street, whose eccentric interior, filled with hidden doors, secret passages, and bizarrely themed rooms, seemed as good a place as any to hold a privacy-related reception. The event marked the beta launch of my organization’s mobile location tracking opt-out. Mobile location tracking, which is being implemented across the country by major retailers, fast food companies, malls, and the odd airport, first came to the public’s attention last year when Nordstrom informed its customers that it was tracking their phones in order to learn more about their shopping habits.
Today, the Federal Trade Commission hosted a morning workshop to discuss the issue, featuring representatives from analytics companies, consumer education firms, and privacy advocates. The workshop presented some of the same predictable arguments about lack of consumer awareness and ever-present worries about stifling innovation, but I think a contemporaneous conversation I had with a friend better highlights some of the privacy challenges mobile analytics presents. Names removed to protect privacy, of course!
A recent paper by the Technology Policy Institute takes a pro-business look at the Big Data phenomenon, finding “no evidence” that Big Data is creating any sort of privacy harms. As I hope to lay out, I didn’t agree with several of the report’s findings, but I found the paper especially interesting because it critiques my essay from September’s “Big Data and Privacy” conference. According to TPI, my “inflammatory” suggestion that ubiquitous data collection may harm the poor was presented “without evidence.” Let me first say that I’m deeply honored to have my writing critiqued; for better or worse, I am happy to have my thoughts somehow contribute to a policy conversation. That said, while some free market voices applauded the report as a thoughtful first step at doing a Big Data cost-benefit analysis, I found the report to be one-sided to its detriment.
As ever in the world of technology and law, definitions matter, and neither I nor TPI can adequately define what “Big Data” even is. Instead, TPI suggests that the Big Data phenomenon describes the fact that data is “now available in real time, at larger scale, with less structure, and on different types of variables than previously.” If I wanted to be inflammatory, I would suggest this means that personal data is being collected and iterated upon pervasively and continuously. The paper then does a good job of exploring some of the unexpected benefits of this situation. It points to the commonly-lauded Google Flu Trends as the posterchild for Big Data’s benefits, but neglects to mention the infamous example where Target was able to uncover that a teenage customer was pregnant before her family knew.
At that point, the paper looks at several common privacy concerns surrounding Big Data and attempts to debunk them. Read More…
The arrival of new technologies in the field of education, from connected devices and student longitudinal data systems to massive open online courses (MOOCs), presents both opportunities and potential privacy risks for students and educators. As part of my work at the Future of Privacy Forum, I have started surveying the issue of privacy in education, and early, anecdotal conversations suggest a pressing need for more education and awareness among all stakeholders. With that in mind, I was pleased to see the Electronic Privacy Information Center (EPIC) host an informative discussion on education records and student privacy.
The focus of the discussion was on the growing “datafication” of students’ personal information. Sen. Edward Markey (D-Mass.), who has been active in the field of children’s privacy, opened the event with an introduction to the topic area. In addition to discussing his Do Not Track Kids legislation, which would extend COPPA-type protections to 13-, 14-, and 15-year-olds, the Senator highlighted his new student privacy legislation. The goals of the legislation were explained as follows:
- Student data should never be available for commercial purposes (focus on advertising);
- Parents should have access and rectification rights to data held by private companies, similar to what is afforded for records held by schools;
- Safeguards should be put in place to ensure that there are real protections for student records held by third parties; and
- Private companies must delete information that they no longer need. Student records should not be held permanently by companies, only by parents.
The panel itself featured Marc Rotenberg and Khaliah Barnes of EPIC; Kathleen Styles, Chief Privacy Officer at the Department of Education (DOE); Joel Reidenberg of Fordham Law School; Deborah Peel of Patient Privacy Rights; and Pablo Molina, Chief Information Officer at Southern Connecticut State University.