Difference between revisions of "Differential privacy"

From Simson Garfinkel
 
Line 1: Line 1:
__NOTOC__
__NOTOC__
A few references on Differential Privacy, for people who don't want to get bogged down with the math.
I am writing an open-access, general-interest book about differential privacy. The book will be submitted to the publisher in January 2024 for publishing in late 2024 or early 2025.


[https://trends.google.com/trends/explore?date=2006-01-01%202019-04-27&geo=US&q=%22differential%20privacy%22 View Differential Privacy on Google Trends]
You can help!


==And the US Census Bureau==
* Do you have something you want included in the book? Please [https://forms.gle/gRqyD4XsssofrkJD9 let me know].
There are now official decision memos! Two, dated July 1, 2019 cover the only official statements on the design and application of Differential Privacy to the 2020 Census that have been made to date.  
* Join the [https://groups.google.com/g/slg-dp-book-announce Google Group Announcement List] for information about the DP book.


* https://www.census.gov/programs-surveys/decennial-census/2020-census/planning-management/memo-series.html 
==Other DP Resources==
Here are a few references on Differential Privacy, for people who don't want to get bogged down with the math.
 
* [https://trends.google.com/trends/explore?date=2006-01-01%202019-04-27&geo=US&q=%22differential%20privacy%22 View Differential Privacy on Google Trends]


The memo https://www2.census.gov/programs-surveys/decennial/2020/program-management/memo-series/2020-memo-2019_13.pdf states the Group Quarters invariant, which is "number and type of GQ facilities."
==Relevant 2020 Census References==


==Introduction==
* https://www.census.gov/programs-surveys/decennial-census/2020-census/planning-management/memo-series.html 
* https://www2.census.gov/programs-surveys/decennial/2020/program-management/memo-series/2020-memo-2019_13.pdf 


===Text Materials===
==Text Materials==
* [https://github.com/frankmcsherry/blog Frank McSherry's blog]. Especially his [https://github.com/frankmcsherry/blog/blob/master/posts/2016-02-03.md 2016 post, Differential privacy for dummies.]  
* [https://github.com/frankmcsherry/blog Frank McSherry's blog]. Especially his [https://github.com/frankmcsherry/blog/blob/master/posts/2016-02-03.md 2016 post, Differential privacy for dummies.]  


Line 22: Line 26:
* [https://www.infoq.com/articles/differential-privacy-intro/ An Introduction to Differential Privacy], by [https://www.linkedin.com/in/charlie-cabot-55803385/ Charlie Cabot]
* [https://www.infoq.com/articles/differential-privacy-intro/ An Introduction to Differential Privacy], by [https://www.linkedin.com/in/charlie-cabot-55803385/ Charlie Cabot]


=== Podcasts ===
== Podcasts ==
* [https://www.sciencefriday.com/person/cynthia-dwork/ Cynthia Dwork on Science Friday], Crowdsourcing Data, While Keeping Yours Private. 12 minutes.
* [https://www.sciencefriday.com/person/cynthia-dwork/ Cynthia Dwork on Science Friday], Crowdsourcing Data, While Keeping Yours Private. 12 minutes.


Line 29: Line 33:
* [https://www.nist.gov/blogs/taking-measure/differential-privacy-qa-nists-mary-theofanos NIST Differential Privacy Video and Q&A with Mary Theofanos, August 8, 2019]
* [https://www.nist.gov/blogs/taking-measure/differential-privacy-qa-nists-mary-theofanos NIST Differential Privacy Video and Q&A with Mary Theofanos, August 8, 2019]
* [https://www.ias.edu/events/differential-privacy Four Facets of Differential Privacy], Differential Privacy Symposium, Institute for Advanced Study, Princeton, Saturday, November 12. A series of talks by Cynthia Dwork, Helen Nissenbaum, Aaron Roth, Guy Rothblum, Kunal Talwar, and Jonathan Ullman. View all on the [https://www.youtube.com/watch?v=Rs06sAJ07Go&feature=youtu.be&list=PLdDZb3TwJPZ7Ug5Ydu1j9V1m_RgtW7C9_ IAS YouTube channel].
* [https://www.ias.edu/events/differential-privacy Four Facets of Differential Privacy], Differential Privacy Symposium, Institute for Advanced Study, Princeton, Saturday, November 12. A series of talks by Cynthia Dwork, Helen Nissenbaum, Aaron Roth, Guy Rothblum, Kunal Talwar, and Jonathan Ullman. View all on the [https://www.youtube.com/watch?v=Rs06sAJ07Go&feature=youtu.be&list=PLdDZb3TwJPZ7Ug5Ydu1j9V1m_RgtW7C9_ IAS YouTube channel].
* [https://www.youtube.com/watch?v=ekIL65D0R3o Katrina Ligett, California Institute of Technology], explains big data and differential privacy. December 17, 2013.
* [https://www.youtube.com/watch?v=ekIL65D0R3o Katrina Ligett, California Institute of Technology], explains big data and differential privacy. December 17, 2013.
* [https://www.youtube.com/watch?v=OfWj89oRD7g Cynthia Dwork explains Differential Privacy], August 11, 2016. 86 minutes
* [https://www.youtube.com/watch?v=OfWj89oRD7g Cynthia Dwork explains Differential Privacy], August 11, 2016. 86 minutes
* [https://www.youtube.com/watch?v=Gx13lgEudtU Christine Task at Purdue] teaches the CERIAS Security Seminar on Differential Privacy, May 1, 2012. (40 min)
* [https://www.youtube.com/watch?v=Gx13lgEudtU Christine Task at Purdue] teaches the CERIAS Security Seminar on Differential Privacy, May 1, 2012. (40 min)
* [https://youtu.be/rfI-I3e_LFs SIGMOD 2017 Tutorial Part 1 ( 2 - 3:30pm)]
* [https://youtu.be/rfI-I3e_LFs SIGMOD 2017 Tutorial Part 1 ( 2 - 3:30pm)]
* [https://youtu.be/Uhh7QCbnE9o SIGMOD 2017 Tutorial Part 2 (4 - 5:30 pm)]
* [https://youtu.be/Uhh7QCbnE9o SIGMOD 2017 Tutorial Part 2 (4 - 5:30 pm)]
Line 41: Line 41:
* 2019-01-28: [https://www.youtube.com/watch?v=3ksa-e_501w Rice University Symposium on Data Privacy, Simson Garfinkel on Differential Privacy]
* 2019-01-28: [https://www.youtube.com/watch?v=3ksa-e_501w Rice University Symposium on Data Privacy, Simson Garfinkel on Differential Privacy]


===Database Reconstruction===
==Database Reconstruction==
The idea of that releasing multiple queries on a confidential database could result in the reconstruction of the confidential database goes back to the 1970s.
The idea that releasing multiple queries on a confidential database could result in the reconstruction of the confidential database goes back to the 1970s.


We explain how to perform database reconstruction in our 2018 ACM Queue article:
We explain how to perform database reconstruction in our 2018 ACM Queue article:
Line 61: Line 61:
So the only way to protect against a large number of unaudited queries is to add noise to the database. The proof in Dinur and Nissim is that adding noise protects against *all* queries, random and otherwise. The more noise, the more protection.
So the only way to protect against a large number of unaudited queries is to add noise to the database. The proof in Dinur and Nissim is that adding noise protects against *all* queries, random and otherwise. The more noise, the more protection.


 
==Textbook==
===Textbook===


* [https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf The Algorithmic Foundations of Differential Privacy] (2014), a textbook by Cynthia Dwork and Aaron Roth. The first two chapters are understandable by a person who doesn't have an advanced degree in mathematics or cryptography, and it's free!
* [https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf The Algorithmic Foundations of Differential Privacy] (2014), a textbook by Cynthia Dwork and Aaron Roth. The first two chapters are understandable by a person who doesn't have an advanced degree in mathematics or cryptography, and it's free!


===Foundational Papers===
==Critical Papers==
==Foundational Papers==
* [http://www.cse.psu.edu/~ads22/privacy598/papers/dn03.pdf Revealing Information while Preserving Privacy], Dinur and Nissim 2003.
* [http://www.cse.psu.edu/~ads22/privacy598/papers/dn03.pdf Revealing Information while Preserving Privacy], Dinur and Nissim 2003.


Line 73: Line 73:
* [http://www.cse.psu.edu/~sxr48/pubs/smooth-sensitivity-stoc.pdf Smooth Sensitivity]
* [http://www.cse.psu.edu/~sxr48/pubs/smooth-sensitivity-stoc.pdf Smooth Sensitivity]


==Critical Papers==
===Mechanisms===
===Mechanisms===
* [http://www.cse.psu.edu/~ads22/pubs/NRS07/NRS07-full-draft-v1.pdf Smooth Sensitivity and Sampling in Private Data Analysis, 2007]
* [http://www.cse.psu.edu/~ads22/pubs/NRS07/NRS07-full-draft-v1.pdf Smooth Sensitivity and Sampling in Private Data Analysis, 2007]
Line 87: Line 86:
* [http://repository.cmu.edu/jpc/vol7/iss3/1/ How Will Statistical Agencies Operate When All Data Are Private?], John M. Abowd, U.S. Census Bureau, Journal of Privacy and Confidentiality: Vol. 7 : Iss. 3 , Article 1.
* [http://repository.cmu.edu/jpc/vol7/iss3/1/ How Will Statistical Agencies Operate When All Data Are Private?], John M. Abowd, U.S. Census Bureau, Journal of Privacy and Confidentiality: Vol. 7 : Iss. 3 , Article 1.


==Existing Applications==
==Applications==


===On The Map, at the US Census Bureau===
===On The Map, at the US Census Bureau===
Line 100: Line 99:
===Apple===
===Apple===
* 2016-06: [https://www.wired.com/2016/06/apples-differential-privacy-collecting-data/ Andy Greenberg's article in Wired about Apple's Differential Privacy]
* 2016-06: [https://www.wired.com/2016/06/apples-differential-privacy-collecting-data/ Andy Greenberg's article in Wired about Apple's Differential Privacy]
==Advanced Topics==





Latest revision as of 08:54, 17 August 2023

I am writing an open-access, general-interest book about differential privacy. The book will be submitted to the publisher in January 2024 for publishing in late 2024 or early 2025.

You can help!

Other DP Resources

Here are a few references on Differential Privacy, for people who don't want to get bogged down with the math.

Relevant 2020 Census References

Text Materials

Podcasts

Videos

Database Reconstruction

The idea that releasing multiple queries on a confidential database could result in the reconstruction of the confidential database goes back to the 1970s.

We explain how to perform database reconstruction in our 2018 ACM Queue article:

This article summarizes the risks of database reconstruction, as understood in 1989:

I learned of the connection from Dorothy Denning's work on The Tracker:

Dinur and Nissim's "Database Reconstruction Theory" is actually a proof that random queries on a database, which can be generated in polynomial time, will reveal the full contents of the database:

But query auditing was shown to be NP-hard in 2000:

  • J. M. Kleinberg, C. H. Papadimitriou and P. Raghavan, Auditing Boolean Attributes, PODS 2000

So the only way to protect against a large number of unaudited queries is to add noise to the database. The proof in Dinur and Nissim is that adding noise protects against *all* queries, random and otherwise. The more noise, the more protection.
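
The point about unaudited queries and noise can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy records, the two counting queries, and the choice of Laplace noise with scale 1/epsilon, which is one common way to add noise and is not named in the text above. The sampler is deliberately naive and is meant to show the idea only.

```python
import random

def count_query(records, predicate):
    """Exact count of records matching the predicate."""
    return sum(1 for r in records if predicate(r))

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two
    independent exponential samples that each have mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(records, predicate, epsilon, rng):
    """Count plus Laplace noise of scale 1/epsilon. A count changes by at
    most 1 when one record is added or removed, hence the 1/epsilon scale."""
    return count_query(records, predicate) + laplace_noise(1.0 / epsilon, rng)

# Hypothetical toy database of (name, has_condition) pairs.
records = [("ana", True), ("ben", False), ("cho", True), ("dana", True)]

def everyone(r):
    return r[1]

def all_but_dana(r):
    return r[1] and r[0] != "dana"

# Two exact answers differ by exactly dana's confidential value.
exact_leak = count_query(records, everyone) - count_query(records, all_but_dana)

# The same pair of queries answered with noise no longer pins the value down.
rng = random.Random(0)
noisy_leak = (noisy_count(records, everyone, 0.5, rng)
              - noisy_count(records, all_but_dana, 0.5, rng))

print("exact difference:", exact_leak)
print("noisy difference:", noisy_leak)
```

A smaller epsilon means a larger noise scale, which matches the statement that more noise gives more protection, at the cost of less accurate answers.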

Textbook

Critical Papers

Foundational Papers

Mechanisms

Public Perception

Philosophy

Applications

On The Map, at the US Census Bureau

RAPPOR, in Google Chrome

Uber

Apple


Differential Privacy and Floating Point Accuracy

Floating-point arithmetic is not continuous, and differential privacy implementations that assume it is may suffer a variety of errors that result in privacy loss. The problems inherent in floating-point arithmetic are discussed in Oracle's What Every Computer Scientist Should Know About Floating-Point Arithmetic, an edited reprint of David Goldberg's paper of the same title, published in the March 1991 issue of Computing Surveys.
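
A short sketch of what "not continuous" means in practice, assuming IEEE 754 double precision and Python 3.9 or later (for `math.ulp`). It shows only that representable values are discrete and unevenly spaced. It is not an attack on any particular implementation.

```python
import math

def spacing(x):
    """Gap between x and the next larger representable double."""
    return math.ulp(x)

# Representable doubles are discrete, and the gaps grow with magnitude.
gap_near_one = spacing(1.0)
gap_near_million = spacing(1.0e6)

# Ordinary arithmetic therefore rounds, sometimes visibly.
sum_is_exact = (0.1 + 0.2 == 0.3)          # rounding error in the sum
small_term_absorbed = (1e16 + 1.0 == 1e16)  # 1.0 is below the gap at 1e16

print("gap near 1.0:", gap_near_one)
print("gap near 1e6:", gap_near_million)
print("0.1 + 0.2 == 0.3:", sum_is_exact)
print("1e16 + 1.0 == 1e16:", small_term_absorbed)
```

Because a noisy output can only land on one of these discrete values, a sampler written as if the real line were available may produce a different set of reachable outputs for different true inputs.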

"How Will Statistical Agencies Operate When All Data Are Private?" (MS #1142) has been published to Journal of Privacy and Confidentiality. http://repository.cmu.edu/jpc/vol7/iss3/1

The Fool's Gold Controversy

Other attacks

Math

Probability p that a randomized response reports the true answer:

$p = \frac{e^\epsilon}{1+e^\epsilon}$

The response is flipped with the complementary probability, $1 - p = \frac{1}{1+e^\epsilon}$.
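
A minimal sketch of randomized response built on this formula. It reads p as the probability that the true answer is reported, so the answer is flipped with probability 1 - p. This is the usual convention, and the ratio p / (1 - p) = e^epsilon holds either way. The function names and the simulated 30% population are hypothetical, chosen only for illustration.

```python
import math
import random

def truth_probability(epsilon):
    """p = e^epsilon / (1 + e^epsilon)."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))

def randomized_response(true_answer, epsilon, rng):
    """Report the true boolean with probability p, otherwise its negation."""
    p = truth_probability(epsilon)
    return true_answer if rng.random() < p else (not true_answer)

def estimate_true_fraction(responses, epsilon):
    """Unbiased estimate of the true 'yes' fraction from noisy responses.

    If pi is the true fraction, the expected observed fraction is
    p * pi + (1 - p) * (1 - pi). Solving for pi gives the line below.
    Requires epsilon > 0, since p = 1/2 carries no information.
    """
    p = truth_probability(epsilon)
    observed = sum(responses) / len(responses)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

# Simulated survey: 30% of 20,000 respondents truly answer "yes".
rng = random.Random(1)
epsilon = math.log(3)  # gives p of about 0.75
truths = [i % 10 < 3 for i in range(20000)]
responses = [randomized_response(t, epsilon, rng) for t in truths]

print("p:", truth_probability(epsilon))
print("estimated fraction:", estimate_true_fraction(responses, epsilon))
```

Each individual response is deniable, yet the aggregate can still be estimated. The estimate gets noisier as epsilon approaches 0, because 2p - 1 approaches 0.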

See Also