Difference between revisions of "Statistical Disclosure Control"

From Simson Garfinkel
 
There are two main approaches to SDC: ''principles-based'' and ''rules-based.''<ref name=":0">{{Cite journal|last=Ritchie|first=Felix, and Elliot, Mark|date=2015|title=Principles- Versus Rules-Based Output Statistical Disclosure Control In Remote Access Environments|url=http://www.iassistdata.org/sites/default/files/iqvol_39_2_ritchie.pdf|journal=IASSIST Quarterly v39 pp5-13|doi=|pmid=|access-date=March 2016}}</ref> In principles-based systems, disclosure control attempts to uphold a specific set of fundamental principles, for example, "no person should be identifiable in released microdata." Rules-based systems, in contrast, are defined by a specific set of rules that a person performing disclosure control follows, after which the data are presumed to be safe to release. Using this taxonomy, proposed by Ritchie and Elliot in 2013, disclosure control based on [[differential privacy]] can be seen as a principles-based approach, whereas controls based on de-identification, such as the Safe Harbor method in the US [https://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act Health Insurance Portability and Accountability Act] Privacy Rule for de-identifying [https://en.wikipedia.org/wiki/Protected_health_information protected health information], can be seen as a rules-based system.
==Presentations==
* [http://washingtonstatisticalsociety.org/presentations/20150421/wisniewski_21apr15.pdf Disclosure Avoidance Methods and Research at the U. S. Census Bureau], Billy Wisniewski, Amy Lauger, and Laura McKenna, Center for Disclosure Avoidance, Research (CDAR), U.S. Census Bureau, April 21, 2015


* [http://washingtonstatisticalsociety.org/presentations/20150421/wang_21apr15.pdf eConfidentiality - a Disclosure Avoidance Application System (Proposed)], Bei Wang, U.S. Census Bureau, April 21, 2015
 
==How-to Guides==
* [https://www.hhs.gov/sites/default/files/spwp22.pdf Statistical Policy Working Paper 22], Federal Committee on Statistical Methodology, Originally Prepared by Subcommittee on Disclosure Limitation Methodology 1994 Revised by Confidentiality and Data Access Committee 2005
 
* [https://s3.amazonaws.com/sitesusa/wp-content/uploads/sites/242/2017/08/CDAC_2017_Nissim_and_Woodv2.pdf Differential Privacy: A Primer for a non-technical audience], Kobbi Nissim and Alexandra Wood, Presented at the CDAC 2017 Workshop on New Advances in Disclosure Limitation, Sept. 27, 2017


==Papers==
===US Census Bureau===
Below are papers that the US Census Bureau has written on statistical disclosure control.


*  [http://www.census.gov/srd/papers/pdf/rrs2007-21.pdf “Examples of Easy-to-implement, Widely Used Masking Methods for which Analytic Properties are not Justified,”]  Winkler, W. E. (2007b),  
* [https://www.census.gov/srd/CDAR/cdar2014-02_Discl_Avoid_Techniques.pdf Disclosure Avoidance Techniques at the U.S. Census Bureau: Current Practices and Research], Amy Lauger, Billy Wisniewski, and Laura McKenna, Center for Disclosure Avoidance Research, RESEARCH REPORT SERIES #2014-02, U.S. Census Bureau, September 26, 2014.
* [https://www.census.gov/srd/CDAR/rrs2005-06_DisclAvoid_Practices.pdf Disclosure Avoidance Practices and Research at the U.S. Census Bureau: An Update], Laura Zayatz, Statistical Research Division, RESEARCH REPORT SERIES, (Statistics #2005-06), Revised August 31, 2005
* [http://www.census.gov/econ/census/help/methodology_disclosure/disclosure.html Disclosure], U.S. Census Bureau Economic Census website, Last Revised: March 05, 2015
* [https://www.census.gov/srd/papers/pdf/rrs2009-10.pdf Disclosure Avoidance for Census 2010 and American Community Survey Five-year Tabular Data Products], Laura Zayatz, Jason Lucero, Paul Massell, Asoka Ramanayake, RESEARCH REPORT SERIES (Statistics #2009-10), November 23, 2009
* [https://www.census.gov/history/pdf/ConfidentialityMonograph.pdf A Monograph on Confidentiality and Privacy in the U.S. Census],  George Gatewood, US Census Policy Office, July 2001.
* [https://fcsm.sites.usa.gov/files/2014/05/E3_Massell_2013FCSM.pdf A Disclosure Avoidance Research Agenda], Paul B. Massell, Center for Disclosure Avoidance Research, U.S. Census Bureau, May 2014.
* Winkler, W. E. (2008), “General Discrete-data Modeling Methods for Producing Synthetic Data with Reduced Re-identification Risk that Preserve Analytic Properties,” IAB Workshop on Confidentiality and Disclosure, http://fdz.iab.de/en/FDZ_Events/SDC-Workshop.aspx, Nuremberg, Germany, November 20-21, 2008 (also http://www.census.gov/srd/papers/pdf/rrs2010-02).
* Winkler, W. E. (2010), “General Discrete-data Modeling Methods for Creating Synthetic Data with Reduced Re-identification Risk that Preserve Analytic Properties,” http://www.census.gov/srd/papers/pdf/rrs2010-02.pdf.
* Winkler, W. E. (2013c), Cleanup and Analysis of Sets of National Files, Federal Committee on Statistical Methodology, Proceedings of the Bi-Annual Research Conference, http://www.copafs.org/UserFiles/file/fcsm/J1_Winkler_2013FCSM.pdf, https://fcsm.sites.usa.gov/files/2014/05/J1_Winkler_2013FCSM.pdf


===Review Articles===
* Fienberg, Stephen, "Confidentiality and Disclosure Limitation," Encyclopedia of Social Measurement, Volume 1, 2005. '''A good overview article about statistical disclosure limitation, not too much math. No mention of differential privacy, of course.'''
===Critiques===
Many contemporary statistical disclosure control techniques, such as generalization and cell suppression, have been shown to be vulnerable to attack by a hypothetical data intruder. For example, Cox showed in 2009 that complementary cell suppression typically leads to "over-protected" solutions because of the need to suppress both primary and complementary cells, and even then can lead to the compromise of sensitive data when exact intervals are reported.<ref>Lawrence H. Cox, Vulnerability of Complementary Cell Suppression to Intruder Attack, Journal of Privacy and Confidentiality (2009) 1, Number 2, pp. 235–251 http://repository.cmu.edu/cgi/viewcontent.cgi?article=1017</ref>




==References==
<references/>

Latest revision as of 06:11, 27 September 2017

There are two main approaches to SDC: principles-based and rules-based.[1] In principles-based systems, disclosure control attempts to uphold a specific set of fundamental principles, for example, "no person should be identifiable in released microdata." Rules-based systems, in contrast, are defined by a specific set of rules that a person performing disclosure control follows, after which the data are presumed to be safe to release. Using this taxonomy, proposed by Ritchie and Elliot in 2013, disclosure control based on differential privacy can be seen as a principles-based approach, whereas controls based on de-identification, such as the Safe Harbor method in the US Health Insurance Portability and Accountability Act Privacy Rule for de-identifying protected health information, can be seen as a rules-based system.
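
The contrast between the two approaches can be made concrete with a minimal Python sketch. It is an illustration only and does not come from Ritchie and Elliot or from any agency's procedure. The rules-based side uses a hypothetical minimum cell count (the threshold of 10 is an assumption). The differential privacy side uses the Laplace mechanism on counts and assumes each person contributes to exactly one cell. All function names are made up for this example.

```python
import random

MIN_CELL_COUNT = 10  # hypothetical threshold; real rule sets choose their own value


def apply_threshold_rule(table, threshold=MIN_CELL_COUNT):
    """Rules-based check: replace every count below the threshold with None."""
    return {cell: (count if count >= threshold else None)
            for cell, count in table.items()}


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_counts(table, epsilon, rng):
    """Laplace mechanism for counts, assuming one person affects one cell by 1."""
    scale = 1.0 / epsilon  # sensitivity 1 divided by the privacy parameter
    return {cell: count + laplace_noise(scale, rng)
            for cell, count in table.items()}


counts = {"A": 120, "B": 4, "C": 37}

# The rule is mechanical: cell B falls below the threshold and is suppressed.
released_by_rule = apply_threshold_rule(counts)

# The mechanism releases every cell, each perturbed with random noise.
released_by_dp = dp_counts(counts, epsilon=0.5, rng=random.Random(0))

print(released_by_rule)
print(released_by_dp)
```

The rule gives the same answer every time and says nothing about what an intruder could still infer. The noisy release is justified by a stated privacy guarantee rather than by a checklist.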


Presentations

How-to Guides

  • Statistical Policy Working Paper 22, Federal Committee on Statistical Methodology, Originally Prepared by Subcommittee on Disclosure Limitation Methodology 1994 Revised by Confidentiality and Data Access Committee 2005

Papers

US Census Bureau

Below are papers that the US Census Bureau has written on statistical disclosure control.

Review Articles

  • Fienberg, Stephen, "Confidentiality and Disclosure Limitation," Encyclopedia of Social Measurement, Volume 1, 2005. A good overview article about statistical disclosure limitation, not too much math. No mention of differential privacy, of course.

Critiques

Many contemporary statistical disclosure control techniques, such as generalization and cell suppression, have been shown to be vulnerable to attack by a hypothetical data intruder. For example, Cox showed in 2009 that complementary cell suppression typically leads to "over-protected" solutions because of the need to suppress both primary and complementary cells, and even then can lead to the compromise of sensitive data when exact intervals are reported.[2]
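
To see why suppressed cells can still leak, consider a small Python sketch of the interval an intruder can derive from published margins. This is not the attack from Cox's paper; it is a simplified illustration under stated assumptions: four non-negative cells are suppressed in a 2x2 pattern, and the intruder knows the residual row sums (r1, r2) and column sums (c1, c2) left over after subtracting the published cells. The function name and the example numbers are hypothetical.

```python
def suppressed_cell_bounds(r1, r2, c1, c2):
    """Return (low, high) bounds for four suppressed cells in a 2x2 pattern.

    Cells are x11, x12 (first row) and x21, x22 (second row). With
    x11 = t the constraints force x12 = r1 - t, x21 = c1 - t and
    x22 = r2 - c1 + t, so non-negativity bounds t from both sides.
    """
    if r1 + r2 != c1 + c2:
        raise ValueError("row and column residuals must have the same total")
    low = max(0, c1 - r2)   # x22 >= 0 requires t >= c1 - r2
    high = min(r1, c1)      # x12 >= 0 and x21 >= 0 require t <= r1 and t <= c1
    return {
        "x11": (low, high),
        "x12": (r1 - high, r1 - low),
        "x21": (c1 - high, c1 - low),
        "x22": (r2 - c1 + low, r2 - c1 + high),
    }


# Residual sums the intruder computes from the published table (made-up numbers).
bounds = suppressed_cell_bounds(r1=10, r2=3, c1=9, c2=4)
for cell, interval in bounds.items():
    print(cell, interval)
# x11 is pinned to the interval (6, 9) even though it was never published.
```

Complementary suppression is meant to keep such intervals wide enough to protect the primary cell. The sketch shows that the width depends entirely on the margins, which is one reason the choice of complementary cells is hard to get right.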


References

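  1. Ritchie, Felix, and Elliot, Mark (2015). Principles- Versus Rules-Based Output Statistical Disclosure Control In Remote Access Environments. IASSIST Quarterly v39 pp5-13. http://www.iassistdata.org/sites/default/files/iqvol_39_2_ritchie.pdf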
  2. Lawrence H. Cox, Vulnerability of Complementary Cell Suppression to Intruder Attack, Journal of Privacy and Confidentiality (2009) 1, Number 2, pp. 235–251 http://repository.cmu.edu/cgi/viewcontent.cgi?article=1017