MIT Media Lab, E14-633
Metadata, the breadcrumbs inadvertently left behind by technology, give us unprecedented perspectives on individuals and societies. Researchers have compared the recent availability of large-scale behavioral datasets to the invention of the microscope. However, while metadata have great potential for good in research and beyond, their collection and broad use raise privacy concerns. Metadata contain rich information on a person’s whereabouts, social life, preferences, and finances, and they can be abused as well as used for good. Realizing the benefits of metadata datasets therefore first requires a solid quantitative understanding of their privacy. The notion of anonymity has long been central to finding the balance between the utility of data and its privacy, the so-called privacy-utility trade-off. In the first part of the thesis, Yves-Alexandre de Montjoye will show that the rise of rich behavioral metadata datasets calls into question our reliance on anonymity as one of the primary approaches to data protection. In the second part, he will argue that assessing the sensitivity of specific data, and the potential risks of collecting and using it, has always been a hard exercise, and that it is even harder for metadata. Indeed, with metadata, one needs to consider not only what is readily available about an individual in the data but also what could be inferred about them from their metadata, now or in the future. In the last part, he will argue that, while a perfect solution is unlikely ever to exist, technical and non-technical solutions can help strongly limit the risks. Here, he will discuss two such technical solutions: interactive systems (openPDS/SafeAnswers) and privacy-conscientious anonymization (Orange D4D challenge).
Alex 'Sandy' Pentland, Gary King, Alessandro Acquisti