The current social and economic context increasingly demands open data to improve scientific research and decision making. However, when published data refer to individual respondents, disclosure risk limitation techniques must be implemented to anonymize the data and guarantee by design the fundamental right to privacy of the subjects the data refer to. Disclosure risk limitation has a long record in the statistical and computer science research communities, which have developed a variety of privacy-preserving solutions for data releases. This Synthesis Lecture provides a comprehensive overview of the fundamentals of privacy in data releases, focusing on the computer science perspective. Specifically, we detail the privacy models, anonymization methods, and utility and risk metrics that have been proposed so far in the literature. In addition, as a more advanced topic, we identify and discuss in detail connections between several privacy models (i.e., how to accumulate the privacy guarantees they offer to achieve more robust protection and when such guarantees are equivalent or complementary); we also explore the links between anonymization methods and privacy models (how anonymization methods can be used to enforce privacy models and thereby offer ex ante privacy guarantees). These latter topics are relevant to researchers and advanced practitioners, who will gain a deeper understanding of the available data anonymization solutions and the privacy guarantees they can offer.
The Complete Book of Data Anonymization: From Planning to Implementation supplies a 360-degree view of data privacy protection using data anonymization. It examines data anonymization from both a practitioner's and a program sponsor's perspective. Discussing analysis, planning, setup, and governance, it illustrates the entire process of adapting an
How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner. Luk Arbuckle and Khaled El Emam from Privacy Analytics explore end-to-end solutions for anonymizing device and IoT data, based on collection models and use cases that address real business needs. These examples come from some of the most demanding data environments, such as healthcare, using approaches that have withstood the test of time. Create anonymization solutions diverse enough to cover a spectrum of use cases. Match your solutions to the data you use, the people you share it with, and your analysis goals. Build anonymization pipelines around various data collection models to cover different business needs. Generate an anonymized version of original data or use an analytics platform to generate anonymized outputs. Examine the ethical issues around the use of anonymized data.
Proceedings of SPIE contain the original research papers presented at SPIE conferences and other high-quality conferences in the broad-ranging fields of optics and photonics. These books provide prompt access to the latest innovations in research and technology in their respective fields. Proceedings of SPIE are among the most cited references in patent literature.
Paperback. This publication deals with data protection and data access in the social sciences. The first part consists of reports from ten countries, covering country-specific legislation and discussing problems and solutions concerning data access for research purposes. Subjects considered include practical examples of new methods to give access to machine-readable data files, and the implications of privacy legislation and data protection for social science research. The second part consists of an international bibliography on the subject. The reports and bibliography form an update on the subject of data protection and data access for research at a time when the overall computerization of personal information has become a reality and many countries have revised their legislation on privacy and data access.
Privacy Protection and Advertising in a Networked World
Updated as of August 2014, this practical book will demonstrate proven methods for anonymizing health data to help your organization share meaningful datasets, without exposing patient identity. Leading experts Khaled El Emam and Luk Arbuckle walk you through a risk-based methodology, using case studies from their efforts to de-identify hundreds of datasets. Clinical data is valuable for research and other types of analytics, but making it anonymous without compromising data quality is tricky. This book demonstrates techniques for handling different data types, based on the authors’ experiences with a maternal-child registry, inpatient discharge abstracts, health insurance claims, electronic medical record databases, and the World Trade Center disaster registry, among others. Understand different methods for working with cross-sectional and longitudinal datasets. Assess the risk of adversaries who attempt to re-identify patients in anonymized datasets. Reduce the size and complexity of massive datasets without losing key information or jeopardizing privacy. Use methods to anonymize unstructured free-form text data. Minimize the risks inherent in geospatial data, without omitting critical location-based health information. Look at ways to anonymize coding information in health data. Learn the challenge of anonymously linking related datasets.